'We Sing Our Lies Through Empty Sounds:' Hidden Voices in Gothic Music

Vivien Leanne Saunders



This essay has three objectives. First, I will identify and construct a working definition of non-lyrical, yet narrative songs which for the sake of clarity within this essay I shall term ‘inarticulate’ works.1 Second, I will discuss how both inarticulate works and their analyses can be linked to features of the contemporary Gothic genre. Finally, I will explore this connection through a case study of Five Years by Sugar Hiccup. It is my argument that our investment in the hidden voice of the singing narrator in these inarticulate works can affect our reading of character, plot, emotion and truth. Our precarious relationship with these untrustworthy voices gives rise to the experiences of displacement and the uncanny which are symptomatic of the Gothic.

Inarticulate Vocal Music

If music descends from language, why are we so mute? (Jourdain 277)

The song Five Years by the contemporary Gothic rock band Sugar Hiccup (Oracle 1995) narrates a story of a man and his abandoned lover. The woman becomes convinced that he will never return to her, and ultimately surrenders to her grief. As the song develops, it describes the hope, desperate passion and, finally, the crushing loss that the woman feels. In less than three minutes of music, Sugar Hiccup effectively portray every one of the eponymous five years drifting past. This narrative is hardly exceptional; the story of the abandoned lover is so common it is almost a cliché. But Five Years sets itself apart from the norm by the unconventionality of its use of narrative devices. In those three minutes, exactly five words are uttered by the singer. The rest of the vocalisation consists of humming, fragmented sounds from the few coherent words, and, finally, screaming. The musical devices are just as restrained as the lyric, with little harmonic or melodic development throughout the performance. How, then, do Sugar Hiccup manage to tell us any story at all? How might we understand the nuances of this work, and – perhaps most importantly – how might this give us insight into how to approach any narrative constructed by deliberately subverting all of our expectations?

Human beings are capable of producing semantically coherent words. Removing these words but retaining the human voice, such as in Five Years, means that our attention is redirected. We have to consciously interpret the expressive features of the voice (whether these are performative choices or part of the notated/dictated work) in order to understand the narrative. This interpretation is not an unusual or even difficult process: humming, screaming, laughing and sobbing are all wordless sounds whose meaning we can interpret easily, if not entirely accurately, and all are commonly employed in contemporary Gothic music, even as it transcends traditional genre definition.2

In musical analysis we might order the coherence of these kinds of wordless signifier as they move away from the literal use of speech sounds towards the metaphorical: from laughter, to scored uses of the morpheme /ha/, to lilting semitones forming a representative topic or expressive motif. Through connection, association or representation we can interpret sounds as communicative symbols. In this scenario, the need for interpretation does not necessarily interfere with our understanding of the narrative.

Of course, the more ambiguous these symbols become, the more we rely on some form of explicit explanation within the work. Mozart’s Queen of the Night laughs maniacally through her virtuosic cadenzas, but we know that at any moment she will return to normal speech and tell us why (1791). Lucia sings nonsensical vowels because she is mad, but she kindly explains her delusions to us before the closing refrain (Donizetti). The Animals may begin their description of the House of the Rising Sun (1964) with a wordless scream of anguish, but it only takes four lines of verse for them to justify the outburst. In all three cases wordless extremes of emotion are justified by their juxtaposition to the words that surround them. 

This association system, while functional, is not without its flaws. As Robert Jourdain points out:

What constitutes a word in ordinary music? Is it an individual note? A grouping of notes? Speech sounds like… ‘ah’ have no meaning until combined into words, and then their meaning is very stable… A single D-flat can stand as an entire musical assertion in one context, yet in another it makes sense only as part of a musical figure. (276)

The musicologist Leonard B. Meyer makes the same point in structural terms:

Since musical structures are architectonic, a particular sound stimulus which was considered to be a sound term or musical gesture on one architectonic level will… no longer function or be understood as a sound term in its own right. In other words, the sound stimulus which was formerly a sound term can also be viewed as a part of a larger structure in which it does not form independent probability relations with other sound terms. (1956 47)

Identical musical devices – even speech sounds – are thus capable of representing entirely different things within the same musical work in the space of a few bars, let alone when compared to larger intramusical systems or other musical works. We must not seek to find any empirical definition of a musical device, but rather to contextualise the meaning of the musical devices by their relation to the words and devices that surround them.

As Jourdain tells us, lyrical works (those ‘combined into words’) often provide us with a stable interpretation. Inarticulate music distinguishes itself from this practice. We are presented with a work that denies us this stability – whilst still teasing us with its potential to occur – and we encounter issues of meaning. These issues would not occur in a wholly instrumental work, nor one possessing an empirical lyric or libretto. As much as we might describe the violin or the clarinet as possessing a ‘voice’, we do not delude ourselves that such instruments ever actually form concrete words. We listen to the clarinet with no expectation that it will ever form words, and as such we are able to make sense of the music without reference to vocal language. However, works which can be identified as ‘inarticulate’ (as listed in the footnote) are taken from traditions (that is, lyric music) that heavily rely on the word-music binary to form communicative sense.

Sugar Hiccup do not rely on the communicative traits of standardised language in Five Years. They provide us with no stability. They never explain to us what happened to their protagonists, nor do they outline any narrative beyond any emotional depiction the performer enacts. We cannot group the sonic properties of the piece into a stable meaning, as Jourdain would have us do, yet we equally cannot ignore the impact of the lyric, however inarticulate it may be. So how might we interpret what it communicates to us? Immanuel Kant describes music as something which:

…speaks by means of pure sensations without concepts, and so does not, like poetry, leave something over for reflection, yet it moves the mind more variously and, though fleetingly, with more fervour; but it is certainly more enjoyment than culture (the neighbouring thought-play excited by its means is merely the effect of a sort of mechanical association). (339)

We can see in Kant’s statement the skeleton of an argument which we might begin to use to address our inarticulate works. Five Years is undoubtedly emotional, utilising and expanding on the very trait that Kant claims music is devoted to. We could even argue that music is the most logical and effective medium through which to project emotional narratives such as Five Years to an audience. Since (in Kant’s definition) music is formally predisposed to appeal to the emotions, and text is not, removing the textual component of a song will bring its musical, emotionally evocative traits to the foreground, communicating them more efficiently than poetry or literary text. However, this does not explain the effect of the deliberate removal of coherent language from the work. It only asserts the effect of its absence, and is an argument we might as well apply to a purely instrumental work such as a piano concerto.

As the musicologist Lawrence Kramer points out, however, music does possess some properties which indicate that it can communicate something other than emotion: ‘Kant’s phrase ‘leave something over for reflection,’ … quietly points up the weakness in the formalist attitude. Where does this incitement to reflection come from when language is in question?’ (1990 3) Like Kramer, I would like to argue that music does communicate. True, it cannot tell stories in the same manner as other narrative arts, but that hardly means it does not narrate. Rather, this suggests that it does so through the demands of its own language. Where a painting or a film can explicitly show, music must suggest. What we can describe in a novel, we must represent in a score. 

In Onega and Landa’s collection Narratology, narrative arts are described as any forms that convey and represent information to us in a temporal and causal way. I have no wish to argue that this is a finite definition of the much-contested term, but rather that Onega and Landa provide us with the most appropriate perspective to take in this study. Crucially, this definition makes no demands for many of Kant’s ‘conceptual’ elements, which might be considered vital in the analysis of a written or visual work: i.e. the nature of any characters or narrators, or descriptions of scene or landscape. This means that music can be classified as a narrative art, and that we can assume it is communicating a narrative through the system of its temporal and causal features, if not through any objective truth claims or concepts, as Kant suggests. Please note that this essay makes no claims towards the extensive discourses of musical meaning, metaphor, or semiotic systems, but is rather taking this perspective of musical communicativeness to allow us to focus on the specific dialectic in the text/music binary of inarticulate works.

So, by this definition, music possesses some communicative system. This might be emotional or more explicit. The text of a lyric has another distinct system, allowing for the construction of empirical claims and conceits. My argument is that, since it is deliberately placed between these two narrative systems, inarticulate music effectively subverts our expectations of both. Michael Jenne argued in Music, Communication, Ideology that ‘the occurrence of a communication system requires of the partners involved the mastery of the appropriate code system’ (59). The emotional construction of music can be considered such a code system. As Zentner tells us:

Since emotions require an intentional object and music does not provide such objects, specific emotions cannot be felt in response to music. Psychologists and neuroscientists… have relied on… chiefly basic emotion theory, or the circumplex model of affect. Basic emotion theory posits that all emotions can be derived from a limited set of universal and innate basic emotions. (102)

To this end, once again we are denied any empirical comprehension of the musical meaning; we cannot portray ‘specific’ emotions, yet we can determine where in the subset of basic emotions our comprehension is supposed to lie. In some ways this determination might enhance the emotional impact of the work: the conventions of structural features (underlying patterns) and suprasegmental codes (surface level manipulations) are often identified as formative features of salient musical emotion (Scherer & Zentner; Juslin et al.).

This emotional construction is the communicative convention in which inarticulate works become effective. In these works, once we are comfortable with one of the established systems – or, at the very least, complacent in our expectations of what the system ought to be – inarticulate works tear it away from us. We are not gradually presented with a secondary system, but instead left with the lingering anticipation that our familiar form might return, and that we can return to a more comfortable, passive engagement with the artwork.

Our analysis of inarticulate works should not defer to the unexplainable features of wordless refrains, but instead draw out the conflict between our existing expectations (suggested by either our predisposition or the work itself) and the deliberate (and conscious) withholding of meaningful words within the musical narrative. In this struggle between the expected and its subversions, we are forced to find meaning in the idiosyncratic, in the strange, and in the uncanny. Simply put, in inarticulate works we are forced to engage consciously with Gothic devices, in order to comprehend the narrative at all.

The lack of empirical narrative information within a musical work becomes especially relevant when trying to define something which is as notoriously ambiguous as the Gothic: a field Pinker describes as a victim of the ‘inexplicable oddities of the arts’ defined by ‘words that are both period labels and terms of abuse’ (Pinker 126). As Chris Baldick describes it, the Gothic plays with ‘inherited confusions’ derived from a ‘common source’. It is no coincidence, either, that his criteria draw from words such as ‘fearful’ and ‘sickening’ – the Gothic is an evocative and impressive genre, much suited to the emotional traits of music (Baldick xi). Gothic theorist Fred Botting clarifies this, telling us that the genre typically ‘evoke[s] excessive emotion’ (Botting 4). This suggests that, in our Kantian understanding of it, music is the ideal medium in which to discover the Gothic. The emotional traits of music may trounce the conceptual in a general narrative reading, but since emotion is a core component of the Gothic, the genre should, if anything, be more explicit in music than in text.

To this end, if we remove the empirical narrative component from our categorisation, the general discussion of the Gothic as described by theorists such as Botting, Baldick and van Elferen draws from a list of (often conflicting) features. These suggest rather than define Gothic criteria. They can range from brash heavy metal timbres to the ethereal singing of disembodied infant voices. Due to the literature-based tradition of the genre, it remains that many of the features that Baldick describes are more fitting to the written text than to music. This is especially true of descriptive narrative sections. We cannot, for example, listen to the haunting sound of Gustav Holst’s Egdon Heath and identify enough information in it to describe it as a ghost story. We cannot even claim that it is using any conventionally evocative tropes that we might find in an eerie story. There can be no characters, nor revelations of dark secrets or skeletons emerging from closets in a work describing nothing other than its own soundscape. However, the dark timbre of that work as the lower strings play against the upper wind, and the unsettled harmony throughout, give us a lack of closure and a sense of endless, haunted space that truly befits the ghostly countryside of Hardy’s landscapes, for which the piece is named.

It is my suggestion that we combine our identification of Kantian affective emotional features with an examination of our responses to the anticipated norms and their subversions. Botting describes the Gothic as ‘producing emotional effects on its readers rather than developing a rational or properly cultivated response’ (Botting 4) by using ‘transgression… as an interrogation of received rules or values’ (Botting 8). If the inarticulate works do indeed create a conflict of interpretation then they will fit neatly into this description. The fact that inarticulate works deliberately subvert the standard rules of communication means that we can make the argument that bands such as Sugar Hiccup are intentionally evoking Gothic effects in their audience, and hence leading them to irrational conclusions. This perspective effectively provides us with a framework for discussing the Gothic genre as enacted in these works as an empirical conceit, rather than as an abstract idea.

From this discussion, therefore, we should take these main points: That music is a narrative art form, using an emotional (Kant), temporal and causal communicative system; that we have prerequisite knowledge of this system, and anticipate how it will develop; that inarticulate music subverts this anticipation, creating a distancing, uncanny effect; that inarticulate music’s subversion can be directly linked to the Gothic.

Five Years, Cio-Cio-San and Other Liars

Five Years (Sugar Hiccup Oracle 1995) presents us with an interesting challenge. The song, including its title, consists of exactly one sentence: ‘Five years, but he will never be back.’ This simple statement is then deliberately fragmented:

‘(Five years) he(ee) will never be ba....aa....aa..’

The composition of the piece is deceptively simple. A ponderous bass guitar drifts between Fm and Cm. It is accompanied by a steel string guitar accenting the ends of phrases with Cm9 chords. Singer Melody del Mundo enters after four bars by humming a melody in Cm. This is very simple, moving in intervals of thirds or by steps, and although it is ornamented in some phrases it never loses this simplicity. It becomes a sequence, repeated throughout the entire work and never becoming more harmonically complex (Figure 1).

Every time the melodic sequence increases in pitch, the singer’s phoneme changes. The single line of lyric begins, ‘but he will never be back’. However, the singer never articulates the final word; the final /a/ turns into a scream, falling back into the same melody as she rises through another octave. The song ends on a protracted shriek. This is the same melody we opened with, undergoing an alteration entirely to do with the physicality of the singer, mutating the segmental properties of the predictable work through suprasegmental development. Scherer and Zentner define this as the ‘systematic configurational changes in sound sequences over time, such as intonation and amplitude contours in speech’ (364). However, Five Years subverts the norm: there is nothing systematic about this development, as it is presented without any structural context besides its own growing mutation from closed-mouthed singing to open-mouthed screaming: a gentle, hopeful hum into a piercing shriek.

Figure 1: Basic Melodic Shape in Five Years (bb. 5 - 12) Transcription by VLS

The repetitive nature of the accompanying instruments brings the subtle nuances of Mundo’s voice into sharp focus. We become keenly attuned to every hint of emotion and every flaw of articulation and tuning. A similar effect can be found in the soundtrack from Pan’s Labyrinth: the piece Mercedes’ Lullaby (Navarrete 2006) requires actress Maribel Verdú to hum a repetitive tune as part of a lush orchestral score. Unlike Five Years, the humming outlines a complete melody (with conventional structural divisions) and is not a catalyst for the development of the work. Instead, growing intensity is created by the expressive affectations of the orchestra, while the humming stays constant throughout.

The nuances of Verdú’s singing are not dictated by the demands of the music, but mirror the film from which the track is taken: towards the end of the song, the actress begins to cry. Where Mundo’s emotion is representative, Verdú’s is very literal. In these last few bars the music draws from the filmic narrative, explicitly drawing us towards the diegesis and away from our intramusical comprehension. Verdú becomes Mercedes, grieving for a dying child. The effect is jarring. Having established the singer as a simple instrument within the ensemble, and not drawing her out as a vital determinant of the narrative, the sudden switch forces us to refocus and establish a new sense of order in our listening practice. As a piece of narrative music it is flawed, simply because of this moment. There is no justification, no closure or completion. Meaning is intentionally extramusical, and our recourse is clear: if we wish to understand the narrative, then we must watch the film.

We do not have this issue with Mundo; from the outset it is clear that her voice is the driving force of the work, and its organic growth is wholly predictable, if a little stylistically unconventional. In a very Kantian manner, Five Years revels in its extremes of emotion, but the effect is achieved intra-musically. This is not a unique trait for the piece, coming as it does from a long tradition of lyric music. However, it is my argument that Sugar Hiccup’s use of this technique subverts the conventional form of emotional expression found in lyric music, as I will demonstrate by comparing it to the operatic aria ‘Un bel dì’. This comparison should illuminate some of the differences between an articulated narrative (in an operatic form) and an inarticulate equivalent.

Lawrence Kramer describes the point in this aria from Giacomo Puccini’s Madama Butterfly (1904) where the singer, performing the role of Cio-Cio-San, becomes increasingly inarticulate, despite the lyrical line being ‘initially unbroken… autonomous in its imaginary gratification’ (1997 126). As he goes on to tell us, ‘by the end of the aria, her lyricism has become hysterical… Her closing words… mean just the opposite of what they say; they project the anguish of a loss already suffered but not yet avowed, and this not only in the strident agitation of the orchestra, but also in the vocal stridency needed to make the words heard above that agitation’ (Ibid.). Like Cio-Cio-San, Mundo becomes increasingly hysterical and strident, employing the full range of her voice, and as with the Puccini, this is not entirely done for symbolic effect, but also fits the technical demands of the score. It would be physically impossible for any singer to hum at the high pitch on which the piece ends, or to scream at the deep, rich level which the opening stanzas demand.

Fundamentally, both works are presenting us with the same basic story. Each singer refers to a man who has not returned for several years. However, while Cio-Cio-San insists that he will return and that her ‘faith is unshakable,’ Mundo declares the opposite. In many other respects the musical narratives, while not stylistically similar, are analogous. This is true in terms of structure, initial lyricism, building hysteria, the directed focus on the performance of the soloist, the breakdown of textual coherence, and so on. How might this similarity fit in with our understanding of the overall narrative, given the polarised texts?

If Kramer is correct in claiming that Cio-Cio-San is contradicting herself, revealing to the listener her unspoken despair of her husband’s return, then it follows that Mundo is presenting us with the opposite. If we believe her claim that ‘he will never be back,’ then her story is hardly compelling. It is the absent undercurrent of hope which fuels her story, giving richness to the narrative and demanding our empathy. Using an inversion of Puccini’s technique, Five Years paints a story of such hopelessness that our only recourse is to project the missing voice into it: hope. Kramer glibly comments that his reading of ‘Un bel dì’ is an automatic response to so much emotion: ‘Sadism certainly demands this story. However compassionate it may feel, the audience occupies a sadistic position here’ (1997 126). Similarly, we are encouraged to contradict the excessive emotion in Five Years. Compassion must overcome the agonies of Mundo’s declamation, or the piece becomes genuinely uncomfortable to listen to. We move from sharing in a woman’s story to celebrating her emotional torture with no recourse to any hope of closure or redemption. If we do not contradict the blatant narrative and instead take the song at face value, then we are left with nothing to engage us.

I previously mentioned that we have to deliberately refer to the Gothic devices in these works, in order to understand them. In this instance, we are left with a decidedly unreliable narrator, and our response to this unreliability is to refer directly to the nuances of the voice, and not the form or lyric. We do not trust Mundo, but through her untrustworthiness we are convinced that her story is compelling.

By arguing with our expectations and detaching ourselves from our implicit connection with the woman’s voice, we can reconstruct the story with the kind of conclusions Baldick suggested were wholly Gothic: this is a story which leads us through time, giving us a sense of some hidden origin story at the outset of the work and leading us to an undisclosed, horrific conclusion by the finale. This is a work with no context or setting; removed from any sense of place or space, we are left drifting in an empty landscape. ‘Gothic music always represents haunting,’ as van Elferen tells us; Five Years’ landscape is haunted by an invisible woman whose untrustworthy story drags us irresistibly into her own journey (van Elferen 6). We must engage with this ghostly presence to find our path, and we must try to understand her story to find our bearings, but as the work closes we can understand nothing beside the fact that we are drawn irresistibly towards the woman’s screams.


1 The term ‘inarticulate’ is not an empirical term, but is rather used throughout this essay for the sake of clarifying the difference between the works under scrutiny and those whose vocal lines remain fully articulate within the music-lyric tradition of pop and rock music. It refers to works that suggest that a comprehensive lyric is being deliberately withheld from the audience. It is not intended to reflect the performers’ ability to communicate, but the deliberate construction and presentation of narrative within the work itself.

2 As a very sweeping overview of examples: in the pop/rock/metal genre we might find works such as Bells and My Shadow (Sugar Hiccup Womb 1998), And She Sang (Puppini Sisters The Rise and Fall of Ruby Woo 2007) or Even in Death (Evanescence Origin 2000); in film soundtracks we might consider Suspiria (Goblin) or Mercedes’ Lullaby (Navarrete); in video game soundtracks we find Room of Angel (Yamaoka and McGlynn) or the Dear Esther soundtrack (Curry), and so on. This essay situates Five Years within the pop tradition of lyric-song, and within the (less rigorously defined) contemporary Gothic, as described by writers such as van Elferen (2012).

