
D. PROJECT DESCRIPTION

1. Introduction

This proposal concerns how the locations of sounds are represented in the primate brain.
Knowing where a sound is coming from provides important survival benefits for predator and
prey alike, and is particularly critical for species that communicate vocally, such as humans and
monkeys.

The locations of sounds can be inferred by comparing sound level and arrival time across the
two ears: a sound located to the right will be louder in the right ear, and arrive at that ear
sooner, than the left ear. These interaural timing and level differences, together with direction-
dependent differences in the spectral filtering by the external ear (spectral cues), are the basis
of the perception of sound location.
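
For intuition about the magnitude of the timing cue, interaural time differences can be approximated with Woodworth's spherical-head formula. The sketch below uses nominal values (head radius ~8.75 cm, speed of sound 343 m/s); it is an illustration, not a model fit to data.

```python
import numpy as np

# Woodworth's spherical-head approximation for the interaural time
# difference (ITD); theta is the sound's azimuth in radians from
# straight ahead, a is head radius (m), c is the speed of sound (m/s).
def itd_seconds(theta, a=0.0875, c=343.0):
    return (a / c) * (theta + np.sin(theta))

# ITD grows monotonically toward the interaural axis, peaking near
# 0.66 ms at 90 degrees -- the "edges" of the auditory scene.
for deg in (0, 30, 60, 90):
    print(deg, itd_seconds(np.radians(deg)))
```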

What kind of neural representation emerges from these binaural and spectral cues has been the
subject of considerable experimental effort. In the visual system, retinotopic maps consisting of
neurons with closely circumscribed receptive fields are the rule. The search for such maps in
the auditory system has come up dry, at least in terrestrial mammals.

If the brain does not contain auditory maps of space, what does it do instead? The alternative appears to be a monotonic code for sound location (Figure 1), for which we and others have found evidence in several auditory-responsive areas (McAlpine et al., 2001; Groh et al., 2003; Zwiers et al., 2004; Werner-Reiss and Groh, 2008) (for related work, see also Middlebrooks et al., 1994; Middlebrooks et al., 1998; Furukawa et al., 2000). In this type of code, individual neurons do not have circumscribed receptive fields, responding only if a sound arises from a particular location in the auditory scene, but rather they respond broadly to sounds across a wide range of locations. The vigor of the response varies with sound location, reaching a maximum for sound locations at the “edges” of the auditory environment, defined as the axis connecting the two ears. In short, the response functions match the shape of at least the binaural difference cues, which reach their maximum values along that axis.

Figure 1. Schematic of monotonic responses as a function of horizontal sound location, as found in several areas of the primate brain. There is a one-to-one correspondence between location and firing rate for single sounds, suggesting that firing rate could form a code for sound location. This poses a potential problem for coding multiple simultaneous sounds, because neurons cannot discharge at more than one rate at a time.

There may be variability in the exact shape of the response pattern – different slopes or recruitment thresholds – but such variability is not necessary. The point is that an ideal observer “reading” this neural activity pattern would be able to infer the location of the sound by assessing the level of activity of the active neurons, rather than by assessing which neurons were active, as would be the case for a map of neurons with circumscribed receptive fields.

The key problem with this type of
representation is that it is unclear how it could
represent the locations of multiple simultaneous
sounds. In a map, multiple stimuli at different
locations activate different populations of
neurons, so information regarding the
existence and locations of multiple stimuli can
be preserved. For a monotonic firing rate code
for sound location, neurons cannot have more
than one firing rate at a time, so it is not
obvious how such a code can represent more
than one sound location.
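
To make the coding problem concrete, the following sketch implements a toy monotonic rate code (the sigmoidal rate function and all parameter values are invented for illustration). A single sound's location can be recovered by inverting the rate function; two simultaneous sounds leave the neuron with only one rate to emit, and any simple combination of the two single-sound rates decodes to a misleading location.

```python
import numpy as np

# Hypothetical monotonic (sigmoidal) rate-vs-azimuth function; azimuth
# in degrees, positive toward the contralateral ear.
def rate(az, r_max=50.0, slope=0.1):
    return r_max / (1.0 + np.exp(-slope * az))

def decode(r, r_max=50.0, slope=0.1):
    # Invert the sigmoid: the unique azimuth implied by one firing rate.
    return np.log(r / (r_max - r)) / slope

print(decode(rate(12.0)))            # single sound at +12 deg -> 12.0

r_a, r_b = rate(-5.0), rate(15.0)    # two simultaneous sounds
print(decode((r_a + r_b) / 2.0))     # averaged rate decodes to ~+4 deg,
                                     # a location where neither sound is
# A summed rate (~59.8 spikes/s) exceeds the response to ANY single
# location, so it cannot be inverted to a sensible location at all.
```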

In the proposed experiments, we will test how monkeys perceive multiple sounds, and we will simultaneously conduct neurophysiological recordings in the inferior and superior colliculi to determine how neurons in these brain areas mediate this perception.

2. How do monkeys perceive multiple sounds?

Figure 2. A. Frequency response function of a multiunit cluster in the IC, showing vigorous responses to sounds from 400-650 Hz. B. Point image of responses across the population of recording sites in the IC. From (Bulkin et al.).

Humans are sometimes able to resolve the locations of multiple sounds, but this ability is limited in scope compared to other senses.
Consider the visual scene around you: you are
able to distinguish an arbitrarily large number of different visual stimulus locations – the many
individual letters on the page of this proposal, for example. In contrast, how many different
auditory stimuli can you distinguish? You can probably perceive more than one, but not more
than a handful. This is due in part to the tremendous challenge of constructing a percept of multiple sound locations from only two samples of the physical stimulus: the measurements of the air pressure waveform taken at each of the two ears.

Indeed, when two sounds are identical in frequency content and temporal envelope, they are
generally perceived as a single sound source (Ebata et al., 1968; Gardner, 1968; Perrott et al.,
1970; Gaskell, 1983; Lindemann, 1986; Blauert, 1997). That single sound may be at the
midpoint of the two actual sources, if the two sounds are delivered simultaneously (summing
localization). If one sound is delivered slightly before the other, the first sound is localized
accurately and the second is ignored (the precedence effect, Wallach et al., 1949). This is
thought to be adaptive given that delayed but identical copies of a sound naturally occur as
echoes rather than independent sound sources.

The more interesting situation for our purposes is when two sounds differ in frequency and can
be localized distinctly (Perrott, 1984). That it is possible, for humans at least, to distinguish two
sound sources means that there is indeed a mystery to be solved. How can we reconcile the
neural evidence for monotonic coding of sound location and its apparent inability to represent
multiple simultaneous sound locations with the perceptual evidence that humans can do this?

One potential answer is that because the two sounds must differ in frequency in order to be
distinguished (Perrott, 1984), different populations of neurons represent each sound location. In
other words, neurons tuned for the frequency of sound A represent the location of sound A and
neurons tuned for the frequency of sound B represent sound B. In principle, this scheme could
work and we will investigate this possibility in the proposed experiments.

It is not obvious, however, that this is necessarily what is happening. In particular, the
frequency difference needed for dual localization differs substantially from the frequency
bandwidth of IC neurons. The frequency difference needed to distinguish two sounds can be
quite small – on the order of 20-25 Hz when the two sounds are in the 500 Hz range (Perrott,
1984). This is far smaller than the frequency bandwidth of primate IC neurons. In our existing
data set of frequency tuning curves among IC neurons, it is very rare for neurons to respond to
a 500 Hz stimulus and not a 525 Hz stimulus. This is illustrated with an example multiunit and
population response profile from a frequency mapping study in the IC that we have recently
conducted (Figure 2, Bulkin et al.). More than 80% of sites in the IC are responsive to both 500
and 650 Hz. So, whereas there would certainly be some neurons that respond to one sound
and not the other, there is a much larger population of neurons capable of responding to both.
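
The comparison between neural bandwidth and the behavioral frequency-separation threshold can be quantified along the following lines; this sketch runs on simulated tuning curves (the matrix of driven rates, the Gaussian tuning shape, and the response criterion are all placeholders for our recorded frequency response functions).

```python
import numpy as np

def fraction_coactivated(rates, freqs, f1, f2, criterion):
    """Fraction of sites driven above criterion at BOTH test frequencies.

    rates: sites x frequencies matrix of driven firing rates.
    freqs: the tested frequencies (Hz)."""
    i1 = np.argmin(np.abs(freqs - f1))   # nearest tested frequency to f1
    i2 = np.argmin(np.abs(freqs - f2))
    both = (rates[:, i1] > criterion) & (rates[:, i2] > criterion)
    return both.mean()

# Simulated broadly tuned sites (illustration only, not our data):
rng = np.random.default_rng(0)
freqs = np.linspace(400, 800, 17)            # Hz
best = rng.uniform(450, 750, size=200)       # hypothetical best frequencies
rates = 30 * np.exp(-0.5 * ((freqs - best[:, None]) / 150.0) ** 2)
print(fraction_coactivated(rates, freqs, 500, 525, criterion=10.0))
```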

The possibility that perhaps humans and monkeys are different in some critical respect requires
consideration. The likely suspects are:
1. Humans might have an auditory map, not a monotonic code. Such a map would allow them to localize multiple sounds; monkeys might not have this capacity. Because the evidence for
monotonic coding of sound location comes only from monkeys and other animal species,
whereas the perceptual studies were done in humans, this is within the realm of possibility.
2. Humans might have monotonic coding of sound location, but human auditory neurons
might be much more narrowly tuned in frequency so that very different populations of neurons
represent each distinguishable sound. Monkeys might then require a much larger separation in
the frequency domain to distinguish two sounds.

The alternative to these possible cases of species differences is that humans and monkeys might be essentially the same. They might have similar spatial representations, frequency tuning, and perceptual ability to localize multiple sounds. This would suggest that a currently undiscovered neural mechanism supports multiple sound localization in both species. Given the similarities between the auditory niches of humans and monkeys, we favor this interpretation (although the proposed experiments will tell us if we are wrong). We have several theories about what this mechanism might be, described in the next section.

Figure 3. Logic flow chart indicating possible outcomes.

Because it is difficult to evaluate the format of a representation using fMRI in humans (for
discussion of this issue, see Werner-Reiss and Groh, 2008), it makes the most sense to bridge
the human behavior – monkey neurophysiology gap by testing perception in monkeys. This will
allow us to study sound localization behavior and its neural basis simultaneously in the same
preparation. A flow chart illustrating the logical flow of the experiments and their potential
outcomes is shown in Figure 3.

Monkeys will be trained to report the locations of sounds by making eye movements to the location of each sound (Figure 4). Monkeys naturally look at sounds, and we have ample
experience training monkeys on this task when one sound is presented (Metzger et al., 2004;
Mullette-Gillman et al., 2005; Metzger et al., 2006; Mullette-Gillman et al., 2008). To assess
whether monkeys perceive more than one sound, we will present two sounds simultaneously
and the monkey must make a saccade to each, in any order. Monkeys will receive a reward
after having looked at both targets.

This task will demonstrate whether monkeys are able to accurately localize each sound, how
this depends on the frequency separation between the two sounds, and what pattern of errors
they make when they fail to localize each sound. The possible outcomes are tied to particular
predictions for how neurons in the brain might respond when the two sounds are presented.
Thus, I will describe the neurophysiological experiments, which will be conducted
simultaneously, i.e. while the monkeys perform the behavioral task, and present the possible
behavioral and neurophysiological outcomes in tandem.

3. Neural coding of multiple sounds and relationship to behavior

In our previous and ongoing work, we have investigated how neurons in the inferior colliculus
(IC), auditory cortex (AC), parietal cortex (the lateral and medial banks of the intraparietal
sulcus, L/MIP), and superior colliculus (SC) respond as a function of sound location. In the IC
and AC, responses vary monotonically as a function of sound azimuth, reaching a peak at the axis of the (usually) contralateral ear (Figure 5A and B) (Groh et al., 2003; Porter and Groh, 2006; Werner-Reiss and Groh, 2008). Preliminary results from L/MIP and the SC show a similar pattern (Figure 5C and D). In this study, we will concentrate on the role of the IC and the SC in perception of multiple sounds.

Figure 4. Behavioral task.

The IC is of particular interest because it is the earliest of the auditory areas in which spatial
sensitivity has been investigated in the primate.
The IC is an ascending way-station through which
nearly all auditory signals must pass, and thus if
perception of multiple sounds is to be achieved,
signals in the IC would seem to have to
successfully encode them. In short, the IC is an
information processing bottleneck: information not
encoded in the IC should not be present in later
stages of processing. Previous studies of the
multi-sound representation in the IC and other
areas have focused on the precedence effect
rather than conditions that support dual
localization, and have not been paired with
behavioral testing (Litovsky and Yin, 1998a, b;
Mickey and Middlebrooks, 2001; Litovsky and
Delgutte, 2002; Mickey and Middlebrooks, 2005).
In addition, there are several studies in barn owls
(Keller and Takahashi, 1996, 2005; Takahashi et
al., 2008), but barn owls have auditory maps so
the monotonic coding conundrum does not apply.

How do IC neurons respond when two sounds are presented simultaneously? In the next few
sections, we consider several possibilities, some of
which could support accurate dual localization, and
others which would not. First, we discuss the
possibility that place coding by frequency could
permit multiple-sound localization, then we cover
several possible outcomes involving failures to
localize multiple sounds, and finally we turn to
some possible outcomes that could permit
successful localization of multiple sounds despite
minimal differences in their frequencies.

Figure 5. Example neurons showing predominantly monotonic sound location sensitivity.

3.1. Frequency-based sorting: As described in the previous section, one possibility is that the problem of the same neurons encoding two locations in a monotonic code can be avoided if two different neural populations represent each of the two sounds. Given that the two sounds must differ in frequency in order to be distinguished, there should be some relationship between the frequency bandwidth of neural response patterns and the frequency separation needed to distinguish the locations of two sounds. Over the years, we have collected hundreds of frequency response
functions in IC neurons (see for example Figure 2); it only remains to conduct the behavioral
testing to determine what the relationship is between the neural representation of sound
frequency and the perceptual ability to distinguish the locations of sounds of different
frequencies. The behavioral testing for this question will employ the task described in Figure 4.
The two sounds will consist of bandpass noise with different center frequencies, and we will
determine what frequency difference is necessary for dual localization, and under what
conditions only a single sound location is perceived.

If monkeys can localize two sounds having a frequency separation smaller than the bandwidth
of a substantial population of IC neurons, or if they cannot localize two sounds under any
conditions, the question of how IC neurons support/fail to support such perception becomes
interesting. In the next few sections we consider several possible neurophysiological outcomes
and their potential relationship to behavior. We specifically focus on how neurons respond to
combinations of two sounds when each of them can drive the neuron when presented
separately. Given the possibility that monkeys might not succeed at localizing each sound, we
consider some possible behavioral patterns.

3.2. Summation: One possibility is that neurons might respond to two sounds by firing at a rate that corresponds to the sum of their responses to each sound individually (Figure 6). This possibility might seem sensible if one views neurons as linear devices that simply add up their total inputs and respond more strongly when they receive more input. However, if the firing rate of a neuron encodes sound location, then a summation response to two sounds is not sensible at all, because individual neurons will respond to two sounds at a firing rate that would normally correspond to a location more contralateral than either of the two actual sounds.

Figure 6. If neurons sum their inputs, then the response to two sounds should be greater than the response to either sound alone (left panels). The expected behavioral correlate depends on how neural activity is "read out" to produce perception. If perception of sound location in one hemifield is governed only by neurons with monotonically increasing response patterns within that hemifield, e.g. neurons in the left IC for the right hemifield of space, then the combination of two sounds within one hemifield should be perceived as a single sound positioned more eccentrically than either actual sound (middle panels). If perception is governed by the ratio of activity of neurons with both rising and falling response patterns, such as the combination of left and right IC neurons, then the monkey should behave as if there is a single sound located between the actual sounds (right panels).

What behavioral pattern would correspond to this neural response pattern? It is unlikely that this neural response pattern will occur in conjunction with successful multiple sound localization by the monkeys. If monkeys report a single sound location, there are two possibilities about where that single sound will be perceived to be, depending on how the neural activity patterns in the IC are “read out” to produce the perceptual judgment and behavioral response.
perceptual judgment and behavioral response.

If sound location judgments in one hemisphere of space are governed by a population of neurons with rising monotonic functions for that hemisphere, and no comparison is made to neurons with falling monotonic functions, then monkeys should report two sounds within that hemisphere as being located at a single, more peripheral position than either actual location alone (Figure 6 middle panels). Lesion studies have indicated that the representation of space is controlled by the contralateral IC (Jenkins and Masterton, 1982), so this is possible, but it would certainly be startling given that humans do not show this behavior pattern, not to mention that it would be highly maladaptive to mislocalize the combined sounds so severely.

A more likely situation is that the ratio of the activity levels of rightward- and leftward-preferring neurons governs perception. In this scenario, a summation pattern in individual neurons should lead to averaging judgments, in which monkeys judge the two sounds to be a single sound at an intermediate location, because the activity levels of both the leftward and rightward pools of neurons would be too high and would tend to cancel out (Figure 6 right panels). This behavior pattern would correspond to the summing localization observed in humans.

Figure 8. Predicted results for "Winner-Take-All". On different trials, neurons might respond with a level corresponding to either individual sound. The monkey's behavior might match: if the neuron responds at a firing rate corresponding to sound A, the monkey should look only at sound A on that trial. Alternatively, if two separate populations encode sound A and sound B, the monkey should localize each sound.

3.3. Averaging: A second possibility is that neurons might respond to two sounds with a firing rate that is intermediate between the responses evoked by either sound in isolation (Figure 7). Again, this response pattern is unlikely to be observed if monkeys successfully localize each sound. If they report only one sound location, the predicted perceptual pattern associated with neural averaging is a single sound perceived at an average location. Note that this predicted behavioral result is the same as for summation with a readout involving the ratio of leftward- and rightward-preferring neurons, but the neural response pattern is very different. Thus, one of the key advantages of conducting both the behavioral and neurophysiological studies together is that it is only through this combined approach that the summation and averaging possibilities can be fully distinguished.

Figure 7. If neurons respond to dual sounds with a firing rate that is the average of the responses evoked by either sound alone (left), then we expect the behavioral responses to "split the difference" as well (right panel).
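
A toy calculation makes the point, assuming a ratio readout of leftward- and rightward-preferring pools (all firing rates are hypothetical):

```python
# Hypothetical pooled responses (spikes/s) to each sound presented alone:
left_A, right_A = 40.0, 16.0    # sound A alone
left_B, right_B = 20.0, 20.0    # sound B alone

# Dual-sound responses predicted by the two hypotheses:
sum_left, sum_right = left_A + left_B, right_A + right_B   # 60, 36
avg_left, avg_right = sum_left / 2.0, sum_right / 2.0      # 30, 18

# The ratio readout is identical in the two cases, so behavior alone
# predicts the same intermediate percept under both hypotheses:
print(sum_left / sum_right)   # ~1.67 under summation
print(avg_left / avg_right)   # ~1.67 under averaging
# Only the recorded rates differ (60 vs. 30 spikes/s in the left pool),
# which is why the neural recordings are needed to separate the models.
```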

3.4 Winner-Take-All. In this scenario, neurons would respond with a firing rate corresponding
to one sound location or the other on any given trial. This neural pattern could accompany either
successful localization of multiple sounds or localization of only a single sound. If the entire
neural population encodes the same sound location, then there should be an excellent
correlation with behavioral choice on a trial-by-trial basis: if the neurons respond at a firing rate
corresponding to sound A, the monkey should saccade only to sound A on that trial (Fig. 8).

Winner-take-all could also support perception of multiple sound locations, if there are different
populations of neurons that encode each of the two sounds. In this case, the monkeys should
be able to saccade to each sound, and there might be little correlation on a trial-by-trial basis
between the responses of individual neurons and the saccade sequence.

This idea leads to the fourth general possibility - an alternative way that multiple sounds could be represented - described in the next section.

3.5. Multiplexing. In telecommunications, multiple phone calls can be carried on a single wire by taking turns, rapidly, so that at any instant in time the signal corresponds to one phone call. By alternating between calls, all the calls can be represented in a single signal. This is known as multiplexing. It is possible that auditory neurons do the same thing: when confronted with multiple sound locations to be encoded, neurons might alternate rapidly between responding to one and responding to the other. Across time, both sounds would be represented (Figure 9). This would allow monkeys to successfully localize both sounds.

Figure 9. Individual neurons may alternate between firing at the rates corresponding to sound A and sound B. This could allow both locations to be encoded in the same spike train, a process known as multiplexing.

A theoretical example of what such a response pattern could look like is illustrated in the top panel of Figure 9. The dashed lines indicate the average firing rate corresponding to sounds A and B presented singly. The solid line suggests how an individual neuron might bounce back and forth between the A- and B-firing rates when both are presented.

A challenge in determining whether this pattern occurs is that the precise timing of the switching could vary from trial to trial. Conventional measures of spiking activity such as a peristimulus time histogram would obscure an unsynchronized alternation pattern: the firing rate would likely balance out to be intermediate between A and B, so the pattern might on first glance appear to correspond to averaging.

To get around this challenge, we plan to record from multiple neurons simultaneously. The idea is that neurons might switch between encoding A and B in synchrony. Particular pairs of neurons might show either positive correlations (Figure 10, left panels), in which both neurons in the pair respond to the same sound at the same time, or anti-correlations (Figure 10, right panels), in which their response patterns are the mirror image of each other.

Figure 10. Pairs of neurons may be either positively or negatively correlated.
If most or all neurons show positive
correlations, this suggests that the entire population of neurons is oscillating back and forth in
synchrony, such that at any given instant in time there really is only one sound location
represented, although across time both may be represented. If this occurs, then we would
expect to see a relationship between which sound location is encoded at a particular instant in
time, and which sound location the monkey
saccades to first. For example, if on a given trial
the neural population is responding at a level
corresponding to sound A at the time the cue to
look to the sounds is given, the monkey should
saccade to that sound first (e.g. Figure 11).

3.6 Other considerations: The vertical dimension. The representation of the vertical component of space in the primate auditory pathway has not been extensively studied (Zwiers et al., 2004). In humans and monkeys, vertical localization is based on spectral cues (for reviews, see Blauert, 1997; Moore, 1997) and localization accuracy is considerably lower than for the horizontal component (e.g. Jay and Sparks, 1990). Unlike binaural difference cues, spectral cues probably do not vary monotonically with sound elevation (for review, see Hofman and Van Opstal, 1997), so there is not necessarily any reason to suppose that the vertical component of space is encoded monotonically (but see Zwiers et al., 2004).

Figure 11. Behavioral choice may be correlated with the firing rate of neurons: if the neuron is firing at a level corresponding to sound B at the time the monkey prepares to look to the sound, the monkey may look at sound B first (e.g. Trial 1), whereas the opposite neural and behavioral pattern may occur on other trials (e.g. Trial 2).

Because more is known about the encoding of horizontal sound location, we will largely focus on the horizontal dimension. However, it is possible that tuning in the vertical dimension could contribute to separation of horizontal targets: perhaps monkeys cannot distinguish two sounds at different horizontal locations at the same elevation but they can distinguish two sounds at different elevations, because different populations of neurons would be involved. We will test this possibility by varying sound location in elevation as well as azimuth.

3.7. Potential Outcomes in the Superior Colliculus The SC is an oculomotor structure strongly
implicated in the control of saccadic eye movements (for reviews, see Sparks and Mays, 1990;
Knudsen, 1991; Moschovakis, 1996; Wallace and Stein, 1996; Munoz, 2002; King, 2004).
Although there have been numerous studies of auditory signals in the SC in other species (King
and Palmer, 1983; Middlebrooks and Knudsen, 1984; King and Palmer, 1985; Meredith and
Stein, 1986a, b; Populin and Yin, 1996; Wallace et al., 1998), there have been very few in
primates (Jay and Sparks, 1984, 1987a, b; Wallace et al., 1996). As the representational format
was unknown, we have begun a study to determine whether the SC employs a place or
monotonic code for sound location. As shown in Figure 5D, our preliminary results suggest that
the SC employs a monotonic code for sound location. Even if the SC were to employ a map (as
it does for visual stimuli), however, this map would be based on prior monotonically-coded
signals such as those in the IC and AC. Accordingly, the same general question of how multiple
sounds are encoded applies to the SC as well.

Because the SC is so much closer to the motor output, it serves as a valuable point of
comparison to the IC. The SC is situated after processes such as attention and target selection
may have had a chance to begin to operate (Moore and Fallah, 2001; Krauzlis and Dill, 2002;
Carello and Krauzlis, 2004; Krauzlis et al., 2004; Kim and Basso, 2008). We may find that the
IC represents both sounds, but the SC does not, reflecting its role in controlling motor behavior
– since there can be only one saccade at a time, the SC may only represent the next saccade in
a winner-take-all fashion. There may also be a significantly better correlation between neural
activity in the SC and the first target chosen by the monkey in the sound sequence task than in
the IC.

3.8 Summary. The vast majority of previous research on the representation of sound location
in the brain has been conducted using single sounds presented one at a time. Yet, the results
of studies in monkeys and other terrestrial mammals have suggested that extending the findings
from the simple single-sound case to the complex auditory scenes that occur in natural settings
is highly non-trivial because of the use of monotonic coding of sound location. The current
experiments should determine whether the animal species in which such monotonic coding has
been extensively demonstrated is in fact capable of detecting and localizing multiple sounds,
and how the brain accomplishes this task despite the apparent limitations of this
representational format.

In the next section, I will detail the connections between the proposed work and previous work
in the lab supported by NSF. Section 5 will provide additional methodological details, and
section 6 will cover the broader impact of the proposed work.

4. Relationship to Prior NSF Funding

Our previous NSF project involved the frame of reference of auditory information. Visual spatial
information arises in an eye-centered frame of reference based on the site of activation on
the retina, but binaural and spectral cues provide information about the locations of sounds with
respect to the head. Integration of visual and auditory information, then, would seem to rely on
successfully incorporating information about the position of the eyes with respect to the head
with either visual or auditory signals or both.

Several aspects of this work are relevant to the current proposal. First, two of the previous
studies involved performance of an auditory saccade task, and recordings in the IC during
performance of this task (4.1 and 4.2) (Metzger et al., 2004; Metzger et al., 2006). Thus, we
have extensive experience with training animals on this task. Additionally, one of the goals of
the project was to investigate the representational format of eye position in IC neurons (4.3).
We found the format to be monotonic, in parallel with the IC’s monotonic representation for
sound location itself (Porter et al., 2006).

4.1 Do auditory signals get translated into an eye-centered frame of reference? We investigated this question by determining whether monkeys can successfully compensate for
eye position when making saccades to sounds. We found virtually complete compensation for
initial eye position, implying a successful translation of auditory signals from head-centered to
oculomotor coordinates (Metzger et al., 2004).

4.2 Does the neural representation of sound depend on behavioral context? Little is known
about the extent to which IC neural activity may be influenced by context such as impending
reward associated with successful performance of an auditory saccade task. We found that
neural activity increased late in the trial in the saccade task, and that the level of activity
throughout the trials could be modulated by reward magnitude for many neurons. This finding
suggests that the IC does not just passively report the sounds present in the environment, but
also contains signals related to the import of those sounds (Metzger et al., 2006).

4.3 What is the shape of the eye position signal in the primate inferior colliculus (IC)?
Approximately 40% of our sample of 153 neurons showed statistically significant sensitivity to
eye position during either the presentation of an auditory stimulus or in the absence of sound
(Bonferroni corrected p<0.05). The representation for eye position was predominantly
monotonic (Porter et al., 2006).

4.4 Other studies. Several other studies were conducted (Mullette-Gillman et al., 2005; Bulkin
and Groh, 2006; Porter and Groh, 2006; Werner-Reiss et al., 2006; Porter et al., 2007; Mullette-
Gillman et al., 2008; Groh and Pai, In press; Maier and Groh, In press; Bulkin et al., Submitted) but space limitations preclude a detailed description. Attainment of educational objectives for the previous funding period is described further in section 6.

5. Detailed Experimental Methods.

We employ the following conventional techniques and procedures for conducting electrophysiological and behavioral experiments in alert rhesus monkeys (Macaca mulatta):

5.1 Surgical Preparation: Animals are prepared for experiments by surgically attaching a head-
holding device, a recording cylinder and a device for measurement of eye movements
(Robinson, 1963; Judge et al., 1980). All surgeries are carried out under aseptic conditions
using general anesthesia. Post-surgical care involves suitable pain medication and is overseen
by the attending veterinarian.

5.2 Stimuli: The experiments are conducted in one of three IAC single-walled sound attenuation
chambers rendered anechoic using foam insulation (Sonex). Nine or more loudspeakers
(Audax) are arrayed horizontally spanning -24 to +24 degrees. The auditory stimuli consist of
tones or bandpass noise (frequency range 400 Hz to 18 kHz, 55 dB SPL, 10 ms on/off ramp).
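
For illustration, a stimulus of this kind could be synthesized as follows; the sampling rate, spectral filtering method, and normalization here are our assumptions, not the lab's actual stimulus-generation code.

```python
import numpy as np

def bandpass_noise(f_lo, f_hi, dur_s=0.5, fs=44100, ramp_s=0.010, seed=None):
    """Bandpass noise (Hz) with cosine on/off ramps, peak-normalized."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0   # brick-wall band edges
    sound = np.fft.irfft(spectrum, n)
    sound /= np.max(np.abs(sound))
    n_ramp = int(ramp_s * fs)                         # 10 ms on/off ramps
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    sound[:n_ramp] *= ramp
    sound[-n_ramp:] *= ramp[::-1]
    return sound

stim = bandpass_noise(400.0, 18000.0)   # the range described above
```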

D-11
5.3 Eye movements: Eye position is monitored at all times during an experiment (500 Hz
sampling rate). Head position is fixed.

5.4 Recording: Single neurons are recorded using conventional tungsten microelectrodes in
conjunction with appropriate amplifiers, filters, and a dual-window discriminator (Bak
Electronics) or a Plexon template-based spike-sorting system. Multiple neurons will be recorded
from the same electrode (spike-sorting using the Plexon) or via two or more electrodes lowered
independently into the IC and SC.

An experimental session will begin with the search for an auditory cell by advancing the
recording electrode(s) while broadband sounds and tones are presented to the monkey. Once
one or more auditory neurons have been identified and isolated, the experiment is conducted.
The time of occurrence of each action potential is stored with a temporal resolution of 0.1 ms.

5.5 Localizing brain regions: Approaches to the IC and SC are made at an angle of 33 degrees
in a coronal plane. We will use a combination of anatomical MRI, histology, and physiological
markers to localize our recording sites. The pattern of frequency sensitivity provides information
about IC recording location; saccade-related activity facilitates identification of the SC. MRI
provides information concerning the gross location of morphologically distinct brain areas such
as the IC or SC (Groh et al., 2001; Werner-Reiss et al., 2003). We collect T1- or T2-weighted
images using a 5" receive-only surface coil. In these images, the structure of interest and the
recording cylinder can be seen quite clearly. At the conclusion of each animal's participation in
these experiments, small electrolytic marking lesions will be made in selected electrode tracks
to allow reconstruction of recording site locations. Animals will then be euthanized and
perfused. The brains will be removed and fixed in a formalin solution. Brains will then be frozen
and sliced in 50 µm sections before being mounted and stained. Alternating sections will be
processed for acetylcholinesterase histochemistry, parvalbumin immunoreactivity, and/or cresyl
violet staining (Hackett et al., 1998).

5.6 Behavior: Training, reinforcement, and potential pitfalls: Animals are trained by operant
conditioning using water or juice as a positive reward. For experiments requiring saccades to
auditory targets, animals are trained first on visually guided saccade tasks, then on the auditory
saccade tasks (Metzger et al., 2004; Mullette-Gillman et al., 2005; Metzger et al., 2006;
Mullette-Gillman et al., 2008). The events of the task in time are shown in Figure 12. Once the
animals have mastered the single sound auditory task, we will introduce trials with two sounds.
During training, the two sounds will both be presented during the delay period, one after the
other. Monkeys will be rewarded for looking at both sounds in either order. Then, we will
introduce a small proportion of trials - ~10% - with two sounds at the same time (Figures 3 and 12),
with the same elevation but different horizontal locations. Initially, we will reward the animal for
making a saccade to any horizontal location provided the elevation of the saccade is
appropriate. This is necessary because we do not know if monkeys have the capacity to
distinguish two sounds, and we do not know how they might mislocalize the two sounds if they
cannot distinguish them. If the monkeys do saccade to both sounds, we will switch to using
reinforcement windows around the two sounds, but if they do not, we will continue to reward
them for making saccades to the proper elevation but any horizontal position, and we will keep
the proportion of double sound trials low. We will monitor performance on the interleaved
single-sound trials to ensure that the monkeys continue to show a diligent effort to make
saccades to the locations of sounds to the best of their ability.
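
In sketch form, the two reward rules might look like the following (the window sizes, function names, and the (azimuth, elevation) conventions are placeholders, not our actual behavioral-control code):

```python
import numpy as np

def reward_any_azimuth(saccades, target_elev, elev_window=8.0):
    """Early stage: accept any azimuth, require roughly correct elevation.

    saccades: list of (azimuth, elevation) endpoints in degrees."""
    return all(abs(elev - target_elev) < elev_window for _, elev in saccades)

def reward_both_targets(saccades, targets, window=8.0):
    """Later stage: require one saccade into each target window, any order."""
    return all(any(np.hypot(az - t_az, el - t_el) < window
                   for az, el in saccades)
               for t_az, t_el in targets)
```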

The major potential pitfall of this study is that it may be difficult to train monkeys to perform the
two sound version of this task, even if they have the perceptual capacity to do so. We believe

we can accomplish all of the major goals of this study without this task. If we run into difficulty with training, we will have the monkeys perform mostly single sound trials, and we will simply continue to include dual sound trials as probe trials on a fraction of the trials. If monkeys sometimes localize sound A and sometimes sound B on those probe trials, that will still permit us to consider each of the potential neurophysiological mechanisms. Of course, we would prefer to have the monkeys report the location of each sound on each trial, as this would be more solid proof, but we will be able to proceed with the behavioral task that we’ve already successfully trained 8 monkeys to perform if necessary.

Figure 12. Events of the behavioral task in time.

5.7 Recording: Data analysis and interpretation: While monkeys perform interleaved single and
double sound trials, we will record the activity of one to several neurons in the IC or the SC, and
seek evidence of behavioral and neural response patterns resembling the possible outcomes
outlined in Figures 6-11 (sections 3.1-3.5). For IC neurons, neural discharge will be analyzed
chiefly during the delay period (Figure 11), when eye position and the visual scene are
constant, and the reward is distant in time (Metzger et al., 2006). For SC neurons, we will
chiefly analyze activity during the delay period, but we will also investigate activity time-locked to
the saccades to the two sounds (van Opstal and van Gisbergen, 1990; Glimcher and Sparks,
1992).

The neural response pattern associated with summation is relatively straightforward, but
averaging, winner-take-all, and multiplexing could all resemble each other and will require
several tests to distinguish. Under all three, the overall spike count during a conventional spike-
counting window such as the 500-900 ms delay period, averaged across trials, should be
approximately equal to the average of the responses to either sound alone. Winner-take-all
should be distinguishable from actual averaging and multiplexing if there is a correlation on a
trial-by-trial basis between the average spike rate on a given trial and the monkey’s choice of
which sound he looks at, or which sound he looks at first, or if there is simply a bimodal
distribution of response levels across trials.
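
One simple form the winner-take-all test could take is sketched below: label each dual-sound trial by the single-sound response its rate most resembles, then measure agreement with the monkey's first saccade (all inputs are hypothetical).

```python
import numpy as np

def label_trials(dual_rates, mean_A, mean_B):
    """Label each dual-sound trial by the closer single-sound mean rate."""
    return np.where(np.abs(dual_rates - mean_A) < np.abs(dual_rates - mean_B),
                    'A', 'B')

def choice_agreement(dual_rates, first_saccade, mean_A, mean_B):
    """Fraction of trials where the rate label matches the first saccade;
    chance is ~0.5 if firing rate and choice are unrelated."""
    labels = label_trials(dual_rates, mean_A, mean_B)
    return np.mean(labels == np.asarray(first_saccade))
```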

For multiplexing, evaluating the correlation between the activity of multiple neurons on a trial-by-
trial basis, and relating that to behavioral choice/sequence will be essential. We will compute
cross-correlograms as well as spike-field coherence (the relationship between spikes and local field potentials measured either at the same electrode or at different electrodes; e.g. Fries et al., 2001). These measures on double sound trials will be compared to baseline measures
obtained on single sound trials: do two neurons fire more (or less) in tandem than can be
accounted for by the external stimulus/stimuli? Either positive or negative correlations may
occur; either could indicate multiplexing.
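
A minimal version of the cross-correlogram computation is sketched below (our own illustration; the full analysis will additionally use the single-sound trials as the stimulus-driven baseline and the spike-field coherence measures cited above):

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, max_lag=0.05, bin_s=0.001):
    """Histogram of pairwise spike-time differences (seconds) within max_lag."""
    diffs = spikes_b[None, :] - spikes_a[:, None]   # all pairwise lags
    diffs = diffs[np.abs(diffs) <= max_lag]
    edges = np.arange(-max_lag, max_lag + bin_s, bin_s)
    counts, _ = np.histogram(diffs, edges)
    return counts, edges

# An excess central peak on dual-sound trials (relative to the single-sound
# baseline) would indicate positive correlation; a central trough,
# anti-correlation.
```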

Another related data analysis technique we will employ involves hidden Markov models. These models describe a process as a series of discrete states that switch stochastically. Previous studies have successfully employed this technique on neural data (Abeles et al., 1995; Seidemann et al., 1996; Jones et al., 2007), determining the time of occurrence of state transitions between different firing patterns (the discrete or hidden states). This technique will be particularly useful for our data set because several of the parameters that are normally free are constrained: we can infer from the single-sound trials what the firing patterns should be in each of the hidden states, and there should be only two different kinds of hidden states, corresponding to each of the two sounds.
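
To illustrate how these constraints simplify the fit, here is a sketch of a two-state Poisson hidden Markov analysis in which the state firing rates are fixed from single-sound trials, leaving essentially only the switching probability free (bin size, rates, and the switching probability below are placeholders):

```python
import numpy as np
from scipy.stats import poisson

def viterbi_two_state(counts, rate_A, rate_B, p_switch=0.1):
    """Most likely A/B state sequence for binned spike counts on one trial.

    rate_A, rate_B: expected spikes per bin, fixed from single-sound trials."""
    log_e = np.stack([poisson.logpmf(counts, rate_A),
                      poisson.logpmf(counts, rate_B)])   # (2, T) emissions
    log_T = np.log(np.array([[1 - p_switch, p_switch],
                             [p_switch, 1 - p_switch]]))
    T = len(counts)
    delta = np.zeros((2, T))            # best log-probability per state
    psi = np.zeros((2, T), dtype=int)   # best predecessor per state
    delta[:, 0] = np.log(0.5) + log_e[:, 0]
    for t in range(1, T):
        trans = delta[:, t - 1, None] + log_T   # (from-state, to-state)
        psi[:, t] = np.argmax(trans, axis=0)
        delta[:, t] = np.max(trans, axis=0) + log_e[:, t]
    states = np.zeros(T, dtype=int)
    states[-1] = np.argmax(delta[:, -1])
    for t in range(T - 2, -1, -1):      # backtrack
        states[t] = psi[states[t + 1], t + 1]
    return states   # 0 = firing at the sound-A rate, 1 = the sound-B rate
```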

It is also possible that individual neurons multiplex two sounds, but when the flipping between
firing rates occurs is not correlated across neurons. We will test for this possibility using one
sound that drives the neuron paired with another sound that does not – such as a far right and a
far left target. We will then look for evidence of the neuron’s response pattern flipping between
on and off. An automated burst detection algorithm such as those that have been used for oculomotor neurons (Kaneko, 2006) will be employed. The data from the single sound trials
involving the sound that drives the neuron will be used to set the parameters of the burst
detection (e.g. N spikes within T ms). We will verify that the algorithm’s parameters do not pick
up activity on the non-driving single sound trials, and that they also fail on simulated data sets in
which a random fraction of the spikes from the single driving sound trials are dropped. We will
then test to see if the algorithm does detect short, interspersed bursts on the dual sound trials.
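
The core of such an "N spikes within T ms" criterion takes only a few lines; in this sketch the values of N and T are placeholders to be fit, as described above, from the driven single-sound trials.

```python
import numpy as np

def detect_bursts(spike_times_ms, n_spikes=4, window_ms=20.0):
    """Return onset times of every run of n_spikes spanning <= window_ms."""
    t = np.sort(np.asarray(spike_times_ms))
    if len(t) < n_spikes:
        return np.array([])
    # Time spanned by each run of n_spikes consecutive spikes:
    span = t[n_spikes - 1:] - t[:len(t) - n_spikes + 1]
    return t[:len(t) - n_spikes + 1][span <= window_ms]
```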

A potential pitfall is that evidence of multiplexing will be substantially easier to detect in neurons
that have sustained firing patterns, because of the longer time window for evaluating neural
activity. An ample percentage of IC neurons (32%, Groh et al., 2003) do show a sustained
component to their responses so we do not anticipate that this will be a major obstacle in the IC.
In the SC, most neurons that exhibit sensory responses to auditory stimuli exhibit sustained
firing patterns, so this should not be a problem in this structure either.

5.8 Timeline: We will begin with the animal training and behavior that is so essential to this project. Recordings will begin in the IC and then turn to the SC. Due to the time-intensive
demands of recording from awake animals performing a task, and the challenging data analysis
involved in this project, we anticipate that five years of support will be needed.

6. Educational Objectives and Synergy with Research

I have three main educational objectives: (1) to conduct educational outreach, in the form of new tools and demonstrations concerning the perception of sound location for classroom and internet use; (2) to draw young people into research; and (3) to disseminate scientific findings to the public by
writing for the popular press. In this section I will describe our current aims as well as touch
briefly on the aims of the previous funding period.

6.1. Educational outreach: Sound perception demonstrations. Demonstrations of perceptual phenomena can be an effective way to engage students. While many visual phenomena lend
themselves to classroom demonstrations in part because they can often be printed on paper,
auditory phenomena have historically lagged behind. However, computers, phones, iPods, and
video games (such as the Wii) have now become very widespread, affording unique
opportunities for disseminating multimodal educational material in an engaging fashion.

I plan to develop educational material that explains how sound localization works. The target
audiences range from primary school through college undergraduate level. Classroom
materials for teachers will be made available on the internet, and on-line demonstrations will be
developed as well.

Specifically, the material will cover the computational process by which the perception of sound
location arises – through binaural difference and spectral cues present at the eardrum. The
material will go beyond currently available material by presenting the conundrum of how multiple
sounds are detected, relating this issue to neurophysiological evidence, and including
demonstrations in which one either can or cannot detect multiple sounds (e.g. by varying the
correlation of sounds presented from two speakers). I will build on my experience using similar,
but more casual, demonstrations in the perception class that I have been teaching for 10 years.

We will test and refine this material by working with public schools in the Durham area. The
Duke University Office of Community Affairs oversees an outreach program, the Duke-Durham
Neighborhood Partnership, which helps develop joint endeavors between Duke and the public
schools.

6.2. Students in research. Synergy between my research and educational objectives comes not
just from bringing research into the classroom but also from bringing students into the
laboratory. Students at a variety of levels participate actively in research in my laboratory. Over
the past year, four undergraduates have conducted research with me, as have five graduate
students and four postdoctoral fellows. Diversity is a priority: ten female students and nine
male students (one of whom is a minority) have conducted undergraduate honors work,
graduate research, or postdoctoral research in my laboratory. I intend to continue to play an
active role in encouraging women and minorities to pursue scientific careers in this branch of
neuroscience, where they are currently very much underrepresented.

6.3 Disseminating findings to a broad audience. A third goal is to convey our scientific
discoveries to a broad, lay audience. The popular press often concentrates on studies that are
not necessarily very novel to scientists. At the same time, studies that truly change the field are
often ignored, because they may be more complicated to understand. I think this problem can
be solved by having more practicing scientists present information for the popular press.

I recently wrote a neuroscience article for a lay audience: “How the brain keeps time” (with M. S.
Gazzaniga, Daedalus, Spring 2003, p. 56-61) (Groh and Gazzaniga, 2003). In this article we
put forth a novel theory for how the brain ensures that its operations proceed in a logical
sequence. In the last year, I participated in two radio programs (CBC’s “Quirks and Quarks”,
and Radio New Zealand’s “Nights”), describing some of our recent findings concerning the
representation of visual information in the inferior colliculus. These rewarding experiences
helped me appreciate the challenges facing science writers in conveying the importance and
implications of scientific discoveries to the public.

I plan to expand my efforts in this area, and write about both my own work and the work of
others in articles that will be broadly accessible. This goal is particularly important when animal
research is involved, so that the public is aware of the vital role that animals play in each and
every biomedical advance.

6.4. Past educational accomplishments under NSF funding. We made progress on all of our
educational goals during the previous funding period, which included mentoring and
dissemination of findings to the public, both of which are ongoing as described above. In
addition, I developed a course on Matlab programming for psychology graduate students. This course was well received, and students reported that it greatly expanded the range of analyses they could perform on their data.

E. REFERENCES
Abeles M, Bergman H, Gat I, Meilijson I, Seidemann E, Tishby N, Vaadia E (1995) Cortical
activity flips among quasi-stationary states. Proc Natl Acad Sci U S A 92:8616-8620.
Blauert J (1997) Spatial hearing. Cambridge, MA: MIT Press.
Bulkin DA, Groh JM (2006) Seeing sounds: visual and auditory interactions in the brain. Curr
Opin Neurobiol 16:415-419.
Bulkin DA, Werner-Reiss U, Larue DT, Winer J, Groh JM (Submitted) Enhanced low frequency
sound representation in the macaque inferior colliculus: a possible acoustic fovea.
Carello CD, Krauzlis RJ (2004) Manipulating intent: evidence for a causal role of the superior
colliculus in target selection. Neuron 43:575-583.
Ebata M, Sone T, Nimura T (1968) On the perception of direction of echo. Journal of the
Acoustical Society of America 44:542-547.
Fries P, Reynolds JH, Rorie AE, Desimone R (2001) Modulation of oscillatory neuronal
synchronization by selective visual attention. Science 291:1560-1563.
Furukawa S, Xu L, Middlebrooks JC (2000) Coding of sound-source location by ensembles of
cortical neurons. J Neurosci 20:1216-1228.
Gardner MB (1968) Historical background of the Haas and-or precedence effect. Journal of the
Acoustical Society of America 43:1243-1248.
Gaskell H (1983) The precedence effect. Hearing Research 12:277-303.
Glimcher PW, Sparks DL (1992) Movement selection in advance of action in the superior
colliculus. Nature 355:542-545.
Groh JM, Gazzaniga MS (2003) How the brain keeps time. Daedalus Spring 2003:56-61.
Groh JM, Pai D, eds (In press) Looking at sounds: neural mechanisms in the primate brain.
Groh JM, Kelly KA, Underhill AM (2003) A monotonic code for sound azimuth in primate inferior
colliculus. Journal of Cognitive Neuroscience 15:1217-1231.
Groh JM, Trause AS, Underhill AM, Clark KR, Inati S (2001) Eye position influences auditory
responses in primate inferior colliculus. Neuron 29:509-518.
Hackett TA, Stepniewska I, Kaas JH (1998) Subdivisions of auditory cortex and ipsilateral
cortical connections of the parabelt auditory cortex in macaque monkeys. J Comp Neurol
394:475-495.
Hofman PM, Van Opstal AJ (1997) Identification of spectral features as sound localization cues
in the external ear acoustics. In: Biological and artificial computation: from neuroscience to technology, pp 1126-1135. Berlin/Heidelberg: Springer.
Jay MF, Sparks DL (1984) Auditory receptive fields in primate superior colliculus shift with
changes in eye position. Nature 309:345-347.
Jay MF, Sparks DL (1987a) Sensorimotor integration in the primate superior colliculus. II.
Coordinates of auditory signals. J Neurophysiol 57:35-55.
Jay MF, Sparks DL (1987b) Sensorimotor integration in the primate superior colliculus. I. Motor
convergence. J Neurophysiol 57:22-34.
Jay MF, Sparks D (1990) Localization of auditory and visual targets for the initiation of saccadic
eye movements. In: Comparative Perception Vol I Basic Mechanisms (Berkley MA,
Stebbins WC, eds), p 527. New York: John Wiley & Sons.
Jenkins WM, Masterton RB (1982) Sound localization: effects of unilateral lesions in central
auditory system. Journal of Neurophysiology 47:987-1016.
Jones LM, Fontanini A, Sadacca BF, Miller P, Katz DB (2007) Natural stimuli evoke dynamic
sequences of states in sensory cortical ensembles. Proc Natl Acad Sci U S A
104:18772-18777.

Judge SJ, Richmond BJ, Chu FC (1980) Implantation of magnetic search coils for measurement
of eye position: An improved method. Vision Res 20:535-538.
Kaneko CR (2006) Saccade-related, long-lead burst neurons in the monkey rostral pons. J
Neurophysiol 95:979-994.
Keller CH, Takahashi TT (1996) Responses to simulated echoes by neurons in the barn owl's
auditory space map. Journal of Comparative Physiology A Sensory Neural and
Behavioral Physiology 178:499-512.
Keller CH, Takahashi TT (2005) Localization and identification of concurrent sounds in the owl's
auditory space map. J Neurosci 25:10446-10461.
Kim B, Basso MA (2008) Saccade target selection in the superior colliculus: a signal detection
theory approach. J Neurosci 28:2991-3007.
King AJ (2004) The superior colliculus. Curr Biol 14:R335-338.
King AJ, Palmer AR (1983) Cells responsive to free-field auditory stimuli in guinea-pig superior
colliculus: distribution and response properties. J Physiol 342:361-381.
King AJ, Palmer AR (1985) Integration of visual and auditory information in bimodal neurones in
the guinea-pig superior colliculus. Exp Brain Res 60:492-500.
Knudsen EI (1991) Dynamic space codes in the superior colliculus. Current Opinion in
Neurobiology 1:628-632.
Krauzlis R, Dill N (2002) Neural correlates of target choice for pursuit and saccades in the
primate superior colliculus. Neuron 35:355-363.
Krauzlis RJ, Liston D, Carello CD (2004) Target selection and the superior colliculus: goals,
choices and hypotheses. Vision Res 44:1445-1451.
Lindemann W (1986) Extension of a binaural cross-correlation model by contralateral inhibition.
II. The law of the first wave front. Journal of the Acoustical Society of America 80:1623-
1630.
Litovsky RY, Yin TC (1998a) Physiological studies of the precedence effect in the inferior
colliculus of the cat. II. Neural mechanisms. Journal of Neurophysiology 80:1302-1316.
Litovsky RY, Yin TC (1998b) Physiological studies of the precedence effect in the inferior
colliculus of the cat. I. Correlates of psychophysics. Journal of Neurophysiology 80:1285-
1301.
Litovsky RY, Delgutte B (2002) Neural correlates of the precedence effect in the inferior
colliculus: effect of localization cues. Journal of Neurophysiology 87:976-994.
Maier JX, Groh JM (In press) Multisensory guidance of orienting behavior. Hearing Research.
McAlpine D, Jiang D, Palmer AR (2001) A neural code for low-frequency sound localization in
mammals. Nat Neurosci 4:396-401.
Meredith MA, Stein BE (1986a) Visual, auditory, and somatosensory convergence on cells in
superior colliculus results in multisensory integration. Journal of Neurophysiology
56:640-662.
Meredith MA, Stein BE (1986b) Spatial factors determine the activity of multisensory neurons in
cat superior colliculus. Brain Research 365:350-354.
Metzger RR, Greene NT, Porter KK, Groh JM (2006) Effects of reward and behavioral context
on neural activity in the primate inferior colliculus. J Neurosci 26:7468-7476.
Metzger RR, Mullette-Gillman OA, Underhill AM, Cohen YE, Groh JM (2004) Auditory saccades
from different eye positions in the monkey: implications for coordinate transformations. J
Neurophysiol 92:2622-2627.
Mickey BJ, Middlebrooks JC (2001) Responses of auditory cortical neurons to pairs of sounds:
correlates of fusion and localization. J Neurophysiol 86:1333-1350.
Mickey BJ, Middlebrooks JC (2005) Sensitivity of auditory cortical neurons to the locations of
leading and lagging sounds. J Neurophysiol 94:979-989.
Middlebrooks JC, Knudsen EI (1984) A neural code for auditory space in the cat's superior
colliculus. J Neurosci 4:2621-2634.

Middlebrooks JC, Clock AE, Xu L, Green DM (1994) A panoramic code for sound location by
cortical neurons. Science 264:842-844.
Middlebrooks JC, Xu L, Eddins AC, Green DM (1998) Codes for sound-source location in
nontonotopic auditory cortex. J Neurophysiol 80:863-881.
Moore BCJ (1997) An introduction to the psychology of hearing. New York.: Academic Press.
Moore T, Fallah M (2001) Control of eye movements and spatial attention. Proc Natl Acad Sci U
S A 98:1273-1276.
Moschovakis AK (1996) The superior colliculus and eye movement control. Current Opinion in
Neurobiology 6:811-816.
Mullette-Gillman OA, Cohen YE, Groh JM (2005) Eye-centered, head-centered, and complex
coding of visual and auditory targets in the intraparietal sulcus. J Neurophysiol 94:2331-
2352.
Mullette-Gillman OA, Cohen YE, Groh JM (2008) Motor-related signals in the intraparietal cortex
encode locations in a hybrid, rather than eye-centered, reference frame. Cereb Cortex
Epub Dec 9.
Munoz DP (2002) Commentary: saccadic eye movements: overview of neural circuitry. Prog
Brain Res 140:89-96.
Perrott DR (1984) Concurrent minimum audible angle: a re-examination of the concept of
auditory spatial acuity. J Acoust Soc Am 75:1201-1206.
Perrott DR, Briggs R, Perrott S (1970) Binaural fusion: its limits as defined by signal duration
and signal onset. Journal of the Acoustical Society of America 47:565-568.
Populin LC, Yin TCT (1996) Modulation of auditory cells by eye and ear position in the superior
colliculus of the behaving cat. 26th Annual Meeting of the Society for Neuroscience,
Washington, DC, USA, November 16-21, 1996 Society for Neuroscience Abstracts
22:649.
Porter KK, Groh JM (2006) The other transformation required for visual-auditory integration:
representational format. Prog Brain Res 155:313-323.
Porter KK, Metzger RR, Groh JM (2006) Representation of eye position in primate inferior
colliculus. J Neurophysiol 95:1826-1842.
Porter KK, Metzger RR, Groh JM (2007) Visual- and saccade-related signals in the primate
inferior colliculus. Proc Natl Acad Sci U S A 104:17855-17860.
Robinson D (1963) A method of measuring eye movement using a scleral search coil in a
magnetic field. IEEE Trans Biomed Eng 10:137-145.
Seidemann E, Meilijson I, Abeles M, Bergman H, Vaadia E (1996) Simultaneously recorded
single units in the frontal cortex go through sequences of discrete and stable states in
monkeys performing a delayed localization task. J Neurosci 16:752-768.
Sparks DL, Mays LE (1990) Signal transformations required for the generation of saccadic eye
movements. Annu Rev Neurosci 13:309-336.
Takahashi TT, Keller CH, Nelson BS, Spitzer MW, Bala AD, Whitchurch EA (2008) Object
localization in cluttered acoustical environments. Biol Cybern 98:579-586.
van Opstal AJ, van Gisbergen JAM (1990) Role of monkey superior colliculus in saccade
averaging. Exp Brain Res 79:143-149.
Wallace MT, Stein BE (1996) Sensory organization of the superior colliculus in cat and monkey.
Prog Brain Res 112:301-311.
Wallace MT, Wilkinson LK, Stein BE (1996) Representation and integration of multiple sensory
inputs in primate superior colliculus. J Neurophysiol 76:1246-1266.
Wallace MT, Meredith MA, Stein BE (1998) Multisensory integration in the superior colliculus of
the alert cat. J Neurophysiol 80:1006-1010.
Wallach H, Newman EB, Rosenzweig MR (1949) The precedence effect in sound localization.
Am J Psychol 52:315-336.

Werner-Reiss U, Groh JM (2008) A rate code for sound azimuth in monkey auditory cortex:
implications for human neuroimaging studies. J Neurosci 28:3747-3758.
Werner-Reiss U, Porter KK, Underhill AM, Groh JM (2006) Long lasting attenuation by prior
sounds in auditory cortex of awake primates. Exp Brain Res 168:272-276.
Werner-Reiss U, Kelly KA, Trause AS, Underhill AM, Groh JM (2003) Eye position affects
activity in primary auditory cortex of primates. Curr Biol 13:554-562.
Zwiers MP, Versnel H, Van Opstal AJ (2004) Involvement of monkey inferior colliculus in spatial
hearing. J Neurosci 24:4145-4156.
