University of Illinois Press

Nor the Eye Filled with Seeing: The Sound of Vision in Film
Author(s): Stan Link
Source: American Music, Vol. 22, No. 1 (Spring, 2004), pp. 76-90
Published by: University of Illinois Press
Stable URL: http://www.jstor.org/stable/3592968
Accessed: 21-10-2015 01:44 UTC

University of Illinois Press is collaborating with JSTOR to digitize, preserve and extend access to American Music.

http://www.jstor.org

This content downloaded from 169.229.11.216 on Wed, 21 Oct 2015 01:44:56 UTC
All use subject to JSTOR Terms and Conditions

STAN LINK

Nor the Eye Filled with Seeing: The Sound of Vision in Film

The increasingly sophisticated use of sound and music in film has not
yet dispelled the notion that cinema is an essentially visual medium.
But whether or not film's primary address is to the eye, its visuality
has two faces. On the one hand, there is the sense in which film presents images to be seen: it captures objects for display. But on the other, we may also encounter cinema's visuality in its presentation of seeing for display: there are clearly ways in which cinema's techniques,
signs, and images become place-holders for vision itself. In other
words, along with objects, film presents modes of visual attention for
display. The point-of-view shot is perhaps the most obvious. Regardless of the object observed, the character's vision itself becomes visible. The visuality of film resides in its own looking, as well as in its
being looked at.
This active visuality typically functions as a narrative vehicle. Seeing becomes a nearly transparent part of how filmic characters take
in their world and, ironically, vision usually disappears into characters and plots. But what of a film in which a character stares through
a peephole, or stands and looks at paintings? In such instances vision becomes a narrative event: seeing has made the leap from style
to story. If only for a moment, vision may trump image. Promoted
from medium to content, seeing becomes action. As such, seeing becomes a candidate for the most evocative elaboration undergone by
any other type of film action: namely, it can become the topic of the

Stan Link is the assistant professor of the Composition, Philosophy, and Analysis of Music at Vanderbilt University's Blair School of Music. He is currently completing the "Horror and Science Fiction" chapter for the Cambridge Companion to Film Music. His acoustic and electro-acoustic musical works are performed in the United States, Europe, and Australia.

American Music, Spring 2004
© 2004 by the Board of Trustees of the University of Illinois

soundtrack's characterization of screen activity. Cinema asks a question worthy of Alice to the Caterpillar: What does seeing sound like?
Theories of how soundtracks comment on particular images will
not provide an answer. Vision is not a concrete thing, nor is it an emotional act in and of itself. Typical metaphoric referents such as tempo, rhythm and affect can be made to refer to objects and motion on
the one hand, and subjective reactions on the other. But to what can
sound and music meaningfully refer when vision itself becomes the
salient action? Numerous instances may suggest that vision is, literally, not unheard of. And yet, what is there to encode? What are the
referents and how are they invoked? By way of relevant examples and
conceptualization in terms of the body and culture, this essay addresses cinema's aural encounters with vision as action.

Of Sound Mind and Body


Nagging realities of artery-clogging popcorn, sticky floors, and five-dollar sodas aside, taking a seat in a movie theater is to take an alternate subjectivity. There may be film-theoretical debate as to the mechanisms, but the notion that cinema places subjectivity into play is
beyond serious dispute. Suturing mechanisms like the point-of-view
shot, for example, transform the "eye of the camera" into the constructed (capital) "I" of another identity. Similarly, a musical score's
affective strategy is potentially an erosion of the personal definition,
location, and source of emotion. Cinematic spectatorship becomes a
modulation of identity. By being its audience we can paradoxically
cease to be its audience and become its subject.
With subjectivity as a playing field, cognition becomes a necessary
model for interpreting many filmic elements. Apart from abstract,
technical, or historical viewpoints, we can approach cinema as simulations and representations of consciousness. For example, Bernard
Herrmann's breakfast sequence in Citizen Kane is not simply a retelling by way of a theme and variations. It becomes a musical incarnation of reflection and commentary implying a sensitive and sentient
presence. Similarly, while it may be musically instructive to consider
John Williams's Indiana Jones theme as a Wagnerian leitmotif, centering an interpretation on cognition suggests that the thematic strategies of a Williams score develop into the presence of a third-person
consciousness possessed of memory and prescience, the ability to explain, anticipate, and to comment. Our invisible companion whispers:
"See, he got away with it again, the lucky bastard."
Along with "film-as-mind," however, we cannot ignore the cinematic body. Of course, the geometry of many visual perspectives as
well as the presence of narrative characters offer fairly specific locations for our identifications. But the filmic body leaves impressions
deeper than the symbolic or representational. The filmic subject is
experienced as though it is embodied. The film body's contact with its
world leaves traces that translate as contact with our own world. As
Vivian Sobchack so eloquently describes it,
more than any other medium of human communication, the moving picture makes itself sensuously and sensibly manifest as the
expression of experience by experience. A film is an act of seeing that makes itself seen, an act of hearing that makes itself
heard, an act of physical and reflective movement that makes itself reflexively felt and understood. Objectively projected, visibly and audibly expressed before us, the film's activity of seeing, hearing, and moving signifies in a pervasive, primary, and
embodied language that precedes and provides the grounds for
the secondary significations of a more discrete, systematic, less
"wild" communication.1
Thus we can expect filmic subjectivity to evoke not only cognition,
but perception. The intensity of the landing scene in Steven Spielberg's Saving Private Ryan (1998), for example, derives in no small part
from the fact that its acts of seeing and hearing don't belong simply
to an omniscient and disembodied camera and microphone. Instead,
this scene references physical embodiment and all its limitations of
perspective and scope: eyes and ears rather than lenses and tape recorders. The sequence overflows with corporeal moments: the splash
of a body submerging in the ocean muffles its ear until it resurfaces;
the explosion of a mortar shell causes temporary deafness; the metallic rain of shell casings ejecting onto the concrete floor of the
German bunker places the machine gun within reach. The soundtrack
is on a human scale. The distance is personal. Sound becomes haptic, touching at arm's length.
We "experience" these sequences via their circumscription to the
most immediate physiological dimension. "Wild" communication is
sensed paradoxically through its utter austerity of scope. The sound
and image in Saving Private Ryan crystallize Sobchack's idea that "cinema thus transposes, without completely transforming, those modes
of being alive and consciously embodied in the world that count for
each of us as direct experience: as experience 'centered' in that particular, situated, and solely occupied existence sensed first as 'Here,
where the world touches' and then as 'Here, where the world is sensible; here, where I am.'"2

The Ear World and the Eye World


Such embodied cinematic senses would naturally seem bound to
their real-world counterparts. From the soundtrack, of course, we
would expect to process information aurally, and from the picture
we expect to process information visually, each in accordance with
the perceptual implications of that sense. And such implications are
described consistently by a diversity of authors. The ear, writes Marshall McLuhan,
favors no particular "point of view." We are enveloped by sound.
It forms a seamless web around us. We say, "Music shall fill the
air." We never say, "Music shall fill a particular segment of the
air." We hear sounds from everywhere, without ever having to
focus. Sounds come from "above," from "below," from in "front"
of us, from "behind" us, from our "right," from our "left." We
can't shut out sound automatically. We simply are not equipped
with earlids. Where a visual space is an organized continuum of
a uniformed connected kind, the ear world is a world of simultaneous relationships.3
A characteristic feature of vision, on the other hand, is its selectivity.
As Rudolf Arnheim describes it,
in looking at an object we reach out for it. With an invisible finger
we move through the space around us, go out to the distant
places where things are found, touch them, catch them, scan their
surfaces, trace their borders, explore their texture. It is an eminently active occupation. Impressed by this experience, early
thinkers described the physical process of vision correspondingly. For example, Plato, in his Timaeus, asserts that the gentle fire
that warms the human body flows out through the eyes in a
smooth and dense stream of light.4
McLuhan and Arnheim are affirmed even in the more political and
sociological thrust of Adorno and Eisler's thoughts on the matter. In
Composing for the Films they observe that
the human ear has not adapted itself to the bourgeois rational
and, ultimately, highly industrialized order as readily as the eye,
which has become accustomed to conceiving reality as made up
of separate things, commodities, objects that can be modified by
practical activity. Ordinary listening, as compared to seeing, is
"archaic"; it has not kept pace with technological progress. One
might say that to react with the ear, which is fundamentally a
passive organ in contrast to the swift, actively selective eye, is in
a sense not in keeping with the present advanced industrial age
and its cultural anthropology.5

And more recently, Michel Chion describes the distinctions between
the ear and eye worlds in a way that resonates the earlier observations. "In the cinema," he writes, "to look is to explore, at once spatially and temporally, in a 'given to see' (field of vision) that has limits contained by the screen. But listening, for its part, explores in a
field of audition that is given or even imposed on the ear; this aural
field is much less limited or confined, its contours uncertain and
changing."6

The Eye World as the Ear World


But even Chion's observations do not necessarily account for filmic
perception, a sense that has been artificially, expressively, technically, and artistically embodied. To be sure, as a self-contained and "self"-containing auditory world, the soundtrack typically sustains the "all-at-once" characterization of hearing. But as products of technological
enhancement and abstraction, the cinematic eye and ear may not necessarily be constrained by the syntax of their more natural modes of
cognition. The cinematic ear, for example, can behave as the eye when
exchanging its all-at-once in favor of the active selection typical of
vision. In Sam Raimi's The Quick and the Dead (1995), the gunfight between Kid and the Swedish Champion is preceded by a montage of
preparatory moments and objects: a pocket watch snapping open,
bullets removed from holsters, the barrel mechanism of a revolver,
cocking mechanisms tested, and so on. Each image is accompanied
by diegetic sounds enhanced well beyond any natural acoustic projection or the acuity of embodied hearing. Their hyperamplification
suggests essentially zero distance from their visual sources, as though
hearing has not been subjected to a mediating remove or acoustic.
Even while the technical nature of the sequence is highly objectified,
the framing of each shot along with its auditory presence becomes
intensely personal, tactile even.
The result is an inversion of auditory syntax. The precisely attenuated sound parallels the hyperselective editing and camera work. The
ear mimics vision's "focus" in the anatomical and psychological
senses of the word. In Chion's terms, the aural field here is not "imposed" as an entirety in which we must direct our own attention. It
becomes confined by the frame: sound with certain, fixed contours
and boundaries. Hearing in The Quick and the Dead takes on characteristics of vision that distinguish the eye world from the ear world.
Hearing has reached out, selected an object, excluded others, objectified them, and sequentialized them. In abandoning its syntax of the
all-at-once, the ear has adopted visual syntax. No longer the passive
organ, the cinematic ear resonates vision by exchanging "passivity"
for active selection. Hearing and seeing have become intertwined not
simply with an image as the nexus, but with the very qualities of visual space as a shared style of perception.
Music presents even more interesting issues in terms of filmic perception and corporeality. Whether diegetic, nondiegetic, intradiegetic, metadiegetic, and so on, its very presence as music suggests that it
is processed like and along with any other aural information. The common designation of "background" music for a nondiegetic score as
well as the often mise-en-scène quality of diegetic music are both affirmations of their place in a larger auditory space: an all-at-once. On
the other hand, music's potential independence from narrative and
visual causality suggests a disembodiment virtually unique in the construction of filmic subjectivity. Unlike its sights and sounds, a film's
nondiegetic music is not part of its stock of enworlded stimuli. Though
clearly a component of the cinematic experience, nondiegetic music
and sound remain thought of but "unheard," to borrow from Claudia
Gorbman. In cartoons or cartoon-like comedies, music can stand in for
real sound and thereby overtly address the ear. But beyond animation,
music's construction of such things as "mood," "foreshadowing," and
"emotion" rarely implies actual perceptual acquisition.
Certainly for nondiegetic music, the implied syntax of cognition
may not be limited to that of hearing itself. Though we hear music,
music is not necessarily constrained by the space of the "ear world."
Or, put in a more familiar way perhaps, since music has been understood to evoke everything from the rather concrete sound of birdsong
(e.g., Vivaldi's Four Seasons or Messiaen's Oiseaux Exotiques) to the
more abstract and nebulous concept of a "premonition" (Schoenberg's
Opus 16 Orchestra Pieces), the notion that music might draw our attention to vision seems far from unprecedented.

Music as Visual Selectivity
One of the most familiar examples of this connection can be found
in the so-called shock chord, a term Roy Prendergast attributes to Scott
Bradley, who scored many of the best Tom and Jerry cartoons.7 Typically a short dissonant burst, the shock chord became a staple effect
during such moments as in Kitty Foiled (1947), when Jerry's new ally,
a canary, suddenly shoves a pistol in Tom's face. Tom's reaction is one
of instant mortification and, as its name suggests, the "shock" chord
would seem unproblematically emotional in its reference and effect.
Beyond emotion, however, closer consideration suggests that this

This content downloaded from 169.229.11.216 on Wed, 21 Oct 2015 01:44:56 UTC
All use subject to JSTOR Terms and Conditions

82

Link

music surrogates vision as well. A concise example of what Gorbman


refers to as the "mutual implication" of the soundtrack and image in
film, the shock chord cannot be abstracted from the sensory stimulus
that produces it, most often a visual source of outrage. Falling at
moments in a genre where the punch line to a sequence is typically a
sight gag or image, the shock chord is linked as much to sudden visual revelation as to emotional response. The narrative implication
of the shock chord is not just reaction, but stimulus. Along with its
affective function, the shock chord is the hearing of seeing.
A more telling example can be found in Bernard Herrmann's score
to Hitchcock's Vertigo (1958), in which vision plays a significant narrative role. In spite of the title ailment, Scotty forces himself and Judy
to ascend a steep staircase in the mission bell tower. Scotty peers
down the tower to the accompaniment of a shock chord musically and
emotionally similar to cartoon eye-popping moments. Vertigo intensifies the seeing-hearing relationship, however. Scotty's downward
perspective on the staircase provides one instance among many of the
geometric spiral winding its way throughout Vertigo, a visual motive that is now linked to a musical idea. At each shock Hitchcock's
expressionistic special effect of a downward zoom and upward tracking shot simulates the dizziness accompanying Scotty's pathology.
Along with the object seen, the camera thus emphasizes the subjective act of seeing. Musically, the dissonant chord engenders Scotty's
point of view both objectively as an actively selected "sight" and subjectively in his pathological response to it. The emotive reaction cannot be isolated from the visual perception that prefigures it: the "wild
communication" of the eye. Having sought out its view, the "thing it
has found," Herrmann's shock chord embodies the selectivity of Scotty's seeing.

Music as Visual Space
Herrmann's score for the Mount Rushmore chase in Hitchcock's North
by Northwest (1959) offers another sounding of vision. As in Vertigo,
the musical effect goes beyond the simulation of a purely emotional
response. From the point when Eve grabs the figurine to when she and
Thornhill reach the back of the monument, their escape has been musically tracked by an essentially rhythmic accompaniment that encodes
such elements as might be expected in cinematic pursuits: the flight,
pacing, pounding heartbeats, and so on. Just before Thornhill's line,
"This is no good. We're on top of the monument," however, Herrmann's score abandons its insistent rhythmic ostinato for a harmonic
interjection that breaks rhythmic continuity and inevitability. Both the
image and the soundtrack at this point imply visual astonishment in
the edit from the characters to an image of the back of Rushmore revealing its intimidating grandeur. As in Vertigo, we confront a moment
in which vision is the narrative action. Thornhill and Eve have seemingly reached a dead end. And yet it seems even more difficult to hear
the soundtrack as a purely emotional response in this instance than
in Vertigo's bell tower. Herrmann's musical commentary resonates seeing the monument more deeply than subjective anxiety.
This is confirmed a few moments later in the film. As Eve says,
"What do we do?," there is another such harmonic interjection, this
time accompanying an objective shot of the iconic faces of Rushmore.
The musical emphasis here is on geometric rather than emotional perspective. Elements like size, scope, and distance are briefly foregrounded, concepts that composer and music theorist Robert Morgan
succinctly identifies as among the metaphoric possibilities of music:
Anyone familiar with the philosophical and theoretical literature
dealing with music must be struck by the persistence with which
spatial terminology and categories appear. Indeed, it would seem
to be impossible to talk about music at all without invoking spatial notions of one kind or another. Thus in discussing even the
most elementary aspects of pitch organization (and among the
musical elements, only pitch, we should remember, is uniquely
musical) one finds it necessary to rely upon such spatially oriented oppositions as "up and down," "high and low," "small and
large" (in regard to intervallic "distances"), and so on. Space,
then, pace Schopenhauer, apparently forms an inseparable part
of the musical experience.8
Morgan's point bears not only on how we "talk about music," but
on how we hear it. Musical listening refers to space in ways that transcend purely acoustic perceptions of distance, location, and orientation. We listen with reference to many of the same categories in which
we see. Spatial constructs are, however, more intimately identifiable
with visual data than with aural. While an object can really be, say,
small or distant in a visual sense, a "small" or "distant" object in a
musical sense may be at a lesser amplitude, "higher" frequency, etc.
In other words, while we can speak quite sincerely of space in terms
of vision, such categories in music are often metaphors.
The selectivity of visual syntax during these moments on Mount
Rushmore arises and becomes emphasized in the tectonic rift between
the musically "horizontal," that is, rhythmic, and the musically "vertical," that is, harmonic. This "spatially oriented opposition" evokes
geometric visual space while, as with the shock chord in Vertigo, sudden interjection encodes and reinforces the eye's penchant for dividing and actively selecting within that space. The musical motivic shift
in the soundtrack of North by Northwest engenders a modulation of
reference from feeling to perception, from physical action and anxiety to vision. Herrmann's music accomplishes with some effort and
finesse what the eye, camera, and editing appear to do so naturally.
In effect, the score notices, focuses on, and captures a stimulus in a
way evocative of the visual acquisition so prominent in the film's narrative, and so clearly demanded by the spectacular object of Mount
Rushmore itself.

Music as Visual Time


Musically encoded visual syntax can be heard with reference to time
as well as space. This becomes quite clear in the gymnasium scene in
Sam Mendes' American Beauty (1999), where seeing is connected explicitly to temporality. A middle-aged male character, Lester, becomes
visually and erotically fixated on a young woman, Angela, while she
performs in a group dance routine during the halftime of a high
school basketball game. The scene is a vivid consubstantiation of sexual desire, sight, and music. But distinguishing this scene from so
many similar gazes in other films is its explicit engagement with the
temporal implications of visual fascination and the translation of these
implications into musical terms. As Lester fantasizes that Angela is
performing for him alone, his seeing is musically underlined by the
interruption by, and interjection of, contrasting material into a diegetic "host" tune. Just as Lester's gaze separates Angela from the
rest of the girls, Thomas Newman's nondiegetic score suddenly disrupts and displaces the diegetic "On Broadway" along with the rest
of the soundscape. The pep-band rendition of "On Broadway" constitutes a recognizable, forward-moving whole, a real-time continuum into which the nondiegetic score insinuates itself as an expansion.
As in North by Northwest, visual attentiveness in American Beauty is
musically constructed by sudden and profound shifts in material and
type of material. More protracted in its effect, however, normal narrative time stops in American Beauty. As in the eighteenth-century opera seria in which real-time action in recitative gives way to the emotional reflection and narrative stasis of an aria, the amount of time
taken to depict the moment is far greater than the amount of time
actually depicted. Lester's fantasy view of Angela becomes an expanded moment of voyeuristic hypnosis. His extreme visual concentration is reconstructed by the image that isolates both Angela and
him and by the soundtrack that offsets his moment in time from the
rest of the sequence. In short, visual fixation forms the nexus of psychological and real time and becomes enhanced by the scene's musical structure.

But even apart from other elements of the scene such as the closeups of Lester's rapt eyes, the focused lighting, and the sequence's
narrative structure, the musical style of the nondiegetic interjection
is crucial to creating and sustaining Lester's mesmerized view of
Angela. During his phantasm, the ensuing narrative stasis and visual focus are closely encoded by the static quality of Newman's percussive score. Although thoroughly energetic and engaging, the composer's music for this scene goes nowhere. Such a description for this
point in the film should be far from denigrating. On the contrary, the
effect of Lester's hypnosis would be ruined by the imposition of anything else, namely music that imparted an awareness of time's inevitable
forward movement. Newman's score is effective not by virtue of its
melodic or harmonic writing, but rather in its very lack of traditionally conceived notions of melody, harmonic progression, tension and
release, anticipation, climax, and anticlimax. All are elements that
have traditionally imbued music with a sense of motion, of development, and all are elements eschewed by the composer's treatment
here. Newman's music for the scene is appropriately nonteleological.
In his work on musical temporality, Jonathan D. Kramer describes
music that presents "a single present stretched out into an enormous
duration, a potentially infinite 'now' that nonetheless feels like an instant." Kramer refers to this musical temporality as "vertical time."
In his view, vertical time appears as part of the expansion of music-temporal possibilities during the twentieth century. These alternative
temporalities arose in competition with the dominant teleology of
western music. Kramer describes the characteristics of vertical music
as follows:
A vertical piece does not exhibit cumulative closure: it does not
begin but merely starts, does not build to a climax, does not purposefully set up internal expectations, does not seek to fulfill any
expectations that might arise accidentally, does not build or release tension, and does not end but simply ceases. It defines its
bounded sound world early in its performance, and it stays within the limits it chooses. Respecting the self-imposed boundaries
is essential because any move outside these limits would be perceived as a temporal articulation of considerable structural import and would therefore destroy the verticality of time.9
Such qualities correspond closely to Newman's score for Lester's
fantasy: sudden onset, lack of formal/dramatic articulation, hasty
exit, and immediate circumscription of timbral and rhythmic resources. Newman's music does not drive the scene onward, but instead keeps it running in place. It does not track a forward motion
that is not there. Instead, it parallels the expansion and suspension
of time in Lester's gaze. Appropriately, Kramer's metaphor for vertical time derives from vision:
Listening to vertical musical time, then, can be like looking at a
piece of sculpture. When we view sculpture, we determine for
ourselves the pacing of our experience: we are free to walk
around the piece, view it from many angles, concentrate on some
details, see other details in relationship to each other, step back
and view the whole, see the relationship between the piece and
the space in which we see it, leave the room when we wish, close
our eyes and remember, and return for further viewings. No one
could claim that we have seen less than all of the sculpture
(though we may have missed some of its subtleties), despite individual selectivity in the viewing process.10
In the vertical time of Newman's score, hearing becomes attentive and
hypnotically fixated in a manner characteristic of visual perception.
The musical stasis of the fantasy gaze is the stasis of visual selection
and concentration, especially when contrasted with the traditional
teleology and closure of the framing diegetic tune. "On Broadway"
returns as suddenly as it was banished, and the sense of having been
in a focused state of suspense is augmented by a return to teleological time and to typical visual and auditory space. But where the
soundtracks to Vertigo and North by Northwest may offer glances, Alan
Ball's screenplay and Newman's score stare. The static musical effect
is of extreme and prolonged concentration, and the film ear is attentive and fixated in a manner more characteristic of visual cognition
than of aural.

Sound and Visual Linearity


Beyond its characteristically active "selection," other aspects of visual cognition may also provide foundations for the auditory encoding
of seeing. McLuhan's idea of the visual field as an "organized continuum of a uniformed connected kind" versus the "simultaneous
relationships" of the auditory also posits ordering as an element of
visual thinking. Emphatic or uniformed ordering in the aural world
would tend to parse hearing into the grammar of seeing. An explicitly "invented" or "uniformed" organization applied to the aural world
may effectively present sonic events by way of visual syntax. "Linear logic," as McLuhan calls it, is a trait of the eye world that might
engender aural strategies.
A wonderful example of overt aural linearity can be seen and heard
in the spectacular opening to Robert Zemeckis's Contact (1997).
As with objects in our own night sky, the farther away something is,
the further into the past we see. In Contact, recent history is represented by a sequence of radio broadcasts that began emanating from
Earth in the twentieth century. As we journey farther and farther outward from present-day Earth into the deepest reaches of space, the
backwards passage of time is signified by popular songs and news
sound bites that stand as emblems of their times. Beginning in the
late 1990s, we travel back on a carpet of sonic snippets such as The
Trammps' "Disco Inferno," Armstrong's "small step" during the moon
landing, big band music, FDR's "all we have to fear . . ." and Hitler's speeches. All the while we are treated to a visually stunning panoramic tour of the universe beginning with our own planet and solar system, moving out to distant nebulae and galaxies and, tellingly,
concluding with the full-screen image of a child's eye. The temporally receding sequence of sounds becomes, appropriately, a "synoptic"
condensation of time and space: an "overview" of history and cosmography moving from present to past, from here to out there.
Contact's sound design borrows from visual space an ability to observe from outside or above and to conceive in terms of discrete objects. In Contact's synopsis of history, sound becomes ordered and
"uniformed" in a way imitating the ordered visual journey. Rather
than an "all at once," the soundscape is a "one after the other." Contact's prelude spatializes sounds, making separate entities that become
points on its "organized continuum." Here again, the formal aspects
of the sound design are part of its effectiveness. As in Vertigo, North
by Northwest, and American Beauty, sudden juxtaposition plays a pivotal role in establishing the soundtrack's visual reference. Hard edges are articulated, and the extreme fragmentation and disjunction of
the sounds involved are crucial to their objectification and ordering.
Literally and figuratively "spaced out," the auditory panorama of
Contact is hearing by way of a visual purification and rearrangement
of its subject matter.

Music and Visual Culture


Further examining the salient characteristics of visual syntax might
yield other devices and metaphors through which hearing suggests
seeing. But the soundtrack for vision needn't exclusively derive from
perceptual syntax. There is evidence that the ear can masquerade as
the eye by way of references less directly related to the body and perception. Returning to Sobchack's terms, we may hear sight not only
in "wild" communication, but in systematic communication as well.
David Lynch's The Elephant Man (1980) provides an example in
which the musical construction of seeing derives from a premise better described as cultural than perceptual. The film is an account of
John Merrick, a man tremendously disfigured by an ailment that so
thickens his skin and enlarges his limbs that he is referred to by way
of the animal he is reputed to resemble. Beyond presenting Merrick's
plight, however, the subtext of the film concerns the social and psychological aspects of seeing and being seen. As Chion notes, "the film
contains many faces reacting to the sight of Merrick, faces at a loss,
excited, illuminated or even ecstatic with fascination."11 As in American Beauty, The Elephant Man consistently foregrounds seeing as narratively significant. Vision, and not just image, is a current on which
both films are carried forward. But while American Beauty's
soundtrack produces intense, highly localized visuality, The Elephant
Man's score engenders something far more subtle and long range. This
soundtrack begs to be read in terms of the numerous elements of the
film that derive from looking at the elephant man. The systematic
sounding of vision unfolds by stages over the course of several scenes.
We first encounter the title character in a nineteenth-century amusement fair. With a close-up on a sign reading "FREAK," Treves, a physician, enters the sideshow exhibit in which the elephant man is on
display. The sound of a common (the class associations of that word
are important here) circus-type waltz is prominent as part of the
fair's mise-en-scène. The carousel-like mechanical instrumentation is the
music of exhibition. It is not the tune as much as its orchestration that
culturally encodes an atmosphere of display and curiosity. Merrick
is treated as a "monster," and the circus waltz is the music of low
spectacle that reinforces his abject situation as a repulsive object to
be stared at. The eye's probing curiosity is sewn into the film with
the thread of cultural association.
As the film progresses, the waltz genre emerges most prominently
at two further points. The elephant man comes to live in a secluded
hospital room where a sadistic porter capitalizes on his position to
run a freak show of his own. During a scene in which the elephant
man is put on display in his own room to an invading "audience" of
abusive drunks and whores, the soundtrack again strikes up a waltz.
The metallic element of the orchestration here is reminiscent of carnival metallophones. However, the now overbearing brass, mock elegance of the strings, and hefty dissonances identify this waltz not
as the circus variety, but as a kind of "danse macabre." Although sustaining no direct thematic connection to Camille Saint-Saëns's composition with that name, the Morris waltz forthrightly invokes the
same category of the grotesque, one of the topoi of romanticism in the
nineteenth century common to literature, painting, and music.
Chion calls this waltz "an implacable scherzo reminiscent of Mahler." And indeed, Mahler drew frequently from the well of the grotesque as thirstily as did other romantics such as Liszt in his Totentanz
or Berlioz in his Symphonie Fantastique. We can also find the decadent, grotesque waltz associated with the pathological in Richard
Strauss's Salome and Elektra. Shared in these are not only dance type
and style, but an underlying aesthetic of distortion. The nineteenth-century fascination for the misshapen image complemented its appreciation of the grotesque: the image of the misshapen. Geared toward the production of imagery, as so much of romantic music was,
these concepts are, like Morgan's spatial metaphors, natively visual
categories. The very nature of the grotesque compels us to look even
against our will and seemingly punishes us for doing so. Underlining the cruelty of the situation along with the crude sensibilities of
his abusers, the misshapen waltz aggressively reinforces the elephant
man's position in the scene as a distortion. Precisely because it is repulsive, the grotesque incites desire for closer inspection. Significantly,
the score backs off long enough for the elephant man to be forced to
look at his image in a mirror. After his horror at seeing his own face,
the mocking waltz resumes with conviction. Seeing the elephant man
see himself is the pinnacle of his humiliation, a debasement augmented in the score's repulsive fascination.
Further on, however, the cruel voyeurism of the porter's sideshow
and the significance of its waltz macabre pave the way for a wholly
different sort of amusement and waltz. The penultimate scene in
which Dr. Treves invites the elephant man to watch a theater production marks the climactic shift in his status: the film emphasizes the
elephant man's own gaze. As the stage production begins, the shot
of him being handed an opera glass and invited to look forms the crucial moment. Lynch unleashes a collage of reaction and point-of-view
shots now developing the elephant man's act of seeing. For him to
watch so publicly instead of being watched so publicly is quite a new
experience. Indeed, the theater scene also culminates the musical
transformation enacted over the entire film. The waltz genre now reappears significantly as one of grandness and brilliance. This new
topic shines luminously against the dimly lit diversion of carnival tent
music and the misshapen imagery of the waltz macabre. The cultural coding of the score modulates from a low style of mechanical instruments to a high style in which strings and harp predominate.
Musically and socially we have gone from amusement to occasion.
The theater scene displaces the sound of the opening "freak show"
and its one-way gaze with music of more social entertainments, such
as a ball, reception, or other large public gathering. There the gaze is
mutual and invited rather than unilateral. The musical change is in
the status of spectatorship itself and accompanies the elephant man's
transformation from passive, curious object to active and attentively
gazing subject who is himself now "curious." The grandness of the
waltz sanctions and magnifies his seeing just as the circus waltz sanctioned his being stared at. The visual significance of the waltzes on
the soundtrack is an integral part of the character's social transformation from the elephant man into John Merrick, from spectacle to
spectator.
Thus, even without invoking the notion of synesthesia, the idea that
the soundtrack can transduce seeing suggests that to use the ear is
not necessarily only to hear. Synesthesia may be another means of
exploring that possibility but does not acknowledge the ways in
which body and culture play on each other. Some aspects of perception are learned and provide cinematic style and signs, while cultural artifacts like cinema often refer to perception and may ultimately
reconfigure it. As such, it would be fairer to suggest that the most effective underscoring of seeing as action takes place when sound and
music form an aural theory or description of vision, rather than an
alternate means of sensing it. Or perhaps it is a way of experiencing it,
as we do with thought itself, without sensing it per se. In any case, if
film can thus extend, encode, and simulate vision by way of the ear,
then there may be far more seeing going on in cinema than actually
meets the eye, so to speak. Even the clear dominance of visuality in
film may not be enough for it. Or, in terms of what we might consider an early theory of film's sensory complex, found in Ecclesiastes:
"The eye is not satisfied with seeing, nor the ear filled with hearing."
This truth is there to be seen, while that seeing itself may ask to be
heard.

NOTES
1. Vivian Sobchack, "Phenomenology and the Film Experience" in Viewing Positions,
ed. Linda Williams (New Brunswick: Rutgers University Press, 1995), 37.
2. Ibid., 37.
3. Marshall McLuhan and Quentin Fiore, The Medium Is the Massage (1967; San Francisco: Jerome Agel and HardWired, 1996), 111.
4. Rudolf Arnheim, Visual Thinking (Berkeley: University of California Press, 1969),
19.
5. Theodor Adorno and Hanns Eisler, Composing for the Films (1947; London: Athlone Press, 1994), 20.
6. Michel Chion, Audio-Vision, ed. and trans. Claudia Gorbman (New York: Columbia University Press, 1994), 33.
7. Roy Prendergast, Film Music: A Neglected Art (New York: Norton, 1977, 1992), 187.
8. Robert Morgan, "Musical Time/Musical Space," Critical Inquiry (Spring 1980): 527.
9. Jonathan D. Kramer, "New Temporalities in Music," Critical Inquiry (Spring 1981):
549.
10. Ibid., 551.
11. Michel Chion, David Lynch (London: British Film Institute, 1995), 60.
