J Neurophysiol 114: 1272–1285, 2015. First published July 1, 2015; doi:10.1152/jn.00214.2015.
Behavior and modeling of two-dimensional precedence effect in head-unrestrained cats
Yan Gai,1,2 Janet L. Ruhland,1 and Tom C. T. Yin1
1Department of Neuroscience, University of Wisconsin, Madison, Wisconsin; and 2Department of Biomedical Engineering, Saint Louis University, St. Louis, Missouri
Submitted 3 March 2015; accepted in final form 29 June 2015
Gai Y, Ruhland JL, Yin TC. Behavior and modeling of two-dimensional precedence effect in head-unrestrained cats. J Neurophysiol 114: 1272–1285, 2015. First published July 1, 2015; doi:10.1152/jn.00214.2015.
The precedence effect (PE) is an auditory illusion that occurs when listeners localize nearly coincident and similar sounds from different spatial locations, such as a direct sound and its echo. It has mostly been studied in humans and animals with immobile heads in the horizontal plane; speaker pairs were often symmetrically located in the frontal hemifield. The present study examined the PE in head-unrestrained cats for a variety of paired-sound conditions along the horizontal, vertical, and diagonal axes. Cats were trained with operant conditioning to direct their gaze to the perceived sound location. Stereotypical PE-like behaviors were observed for speaker pairs placed in azimuth or diagonally in the frontal hemifield as the interstimulus delay was varied. For speaker pairs in the median sagittal plane, no clear PE-like behavior occurred. Interestingly, when speakers were placed diagonally in front of the cat, certain PE-like behavior emerged along the vertical dimension. However, PE-like behavior was not observed when both speakers were located in the left hemifield. A Hodgkin-Huxley model was used to simulate responses of neurons in the medial superior olive (MSO) to sound pairs in azimuth. The novel simulation incorporated a low-threshold potassium current and frequency mismatches to generate internal delays. The model exhibited distinct PE-like behavior, such as summing localization and localization dominance. The simulation indicated that certain encoding of the PE could have occurred before information reaches the inferior colliculus, and MSO neurons with binaural inputs having mismatched characteristic frequencies may play an important role.
localization; echo threshold; medial superior olive; head-related transfer function; free field
LOCALIZATION OF SOUND SOURCES is often challenging in real life due to the presence of background noise or reverberation. The precedence effect (PE) demonstrates that the auditory system has evolved to ignore reflected sound copies and attend to the direct veridical sound sources. Experimentally, PE-like behaviors are usually examined in nonreverberant sound chambers using accurately timed sound sources to mimic sound reflections. In addition, the majority of behavioral studies were conducted in animal or human subjects with immobile heads during the presentation of paired sound sources. One could argue that restricting head movements may help achieve consistent behavior and enhance stimulus control. On the other hand, in real life our heads can move freely while we perceive sound locations. Thus any echo-suppressing mechanism would be less useful if it only works under head-restrained conditions.
Address for reprint requests and other correspondence: Y. Gai, Dept. of Biomedical Engineering, Saint Louis Univ., 3507 Lindell Blvd., Ste. 2007, St. Louis, MO 63103 (e-mail: [email protected]).
In psychophysical experiments involving the PE, the interstimulus delay (ISD) between two sound sources is an important parameter. A widely studied feature of PE or echo suppression is a phenomenon called localization dominance, i.e., the perception and localization of only the leading sound. Localization dominance can happen over a large range of ISDs. Typically, for shorter ISDs such as ISDs ≤0.4 ms in cats (Dent et al. 2009) or <0.8 ms in humans (Blauert 1997), a different type of behavior can be observed. Summing localization, sometimes referred to as fusion, describes the perception of a fused sound in between the two actual target locations. Experimentally, the demonstration of a “phantom” sound in between two largely separated sources with small ISDs can be interesting. However, this scenario presumably happens infrequently in real life with single sound sources, because if the echo comes quickly after the direct sound (i.e., a small ISD), the echo location, as well as the perceived location, is likely to be close to the direct sound location. When the ISD exceeds a certain value, such as 8–10 ms for humans (Stecker and Hafter 2002; Agaeva 2011) and cats (Tollin and Yin 2003b; Dent et al. 2009), both the direct sound and its echoes can be perceived. Because it is difficult for animal subjects to indicate that two separate sounds are perceived, one way to behaviorally measure the echo threshold is to examine whether the lagging sound can be localized on a significant number of occasions (Blauert 1997; Tollin and Yin 2003b). For horizontal localization, the principal cues are interaural time differences (ITDs) created by different arrival times of the sound at the two ears, and interaural level differences created when the head attenuates sound at the ear contralateral to the sound source (Yin 2002). The majority of psychophysical experiments studying the PE have been conducted with speakers along the azimuthal dimension.
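The three ISD regimes described above can be summarized in a short sketch. The function name and the hard cutoffs are illustrative assumptions: the defaults use the cat values quoted in the text (summing localization up to 0.4 ms, echo threshold taken as the upper end of the 8–10 ms range), whereas real behavior changes gradually and differs across subjects.

```python
def classify_isd(isd_ms, summing_max_ms=0.4, echo_threshold_ms=10.0):
    """Label an interstimulus delay (ms) with its nominal PE regime.

    The cutoffs are nominal values for cats taken from the text;
    treating them as sharp boundaries is a simplifying assumption.
    """
    magnitude = abs(isd_ms)  # the sign only indicates which source leads
    if magnitude <= summing_max_ms:
        return "summing localization"
    if magnitude <= echo_threshold_ms:
        return "localization dominance"
    return "echo perceived"


if __name__ == "__main__":
    for isd in (0.0, 0.4, -0.5, 5.0, 30.0):
        print(f"ISD = {isd:5.1f} ms -> {classify_isd(isd)}")
```

Passing `summing_max_ms=0.8` would give the nominal human boundary mentioned above.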
In the vertical dimension where spectral cues dominate sound localization (Gardner and Gardner 1973; Hebrank and Wright 1974), the PE has been measured in the frontal median sagittal plane in humans (Litovsky et al. 1997; Dizon and Litovsky 2004; Agaeva 2011) and cats (Tollin and Yin 2003a, 2003b), all with immobile heads. In these studies, behaviors that resemble localization dominance, but not summing localization, were reported. None of those studies tested diagonal conditions in which two paired sound sources differ in both horizontal and vertical locations. In addition, very few studies have tested sound pairs both located in the lateral hemifield (flies: Lee et al. 2009). Thus, the PE has been studied under conditions that were experimentally limited. The present behavioral study aimed to extend the study to more natural and diverse situations; that is, when the head can move freely, when sound localization
0022-3077/15 Copyright © 2015 the American Physiological Society
www.jn.org
PRECEDENCE EFFECT IN CATS
involves both horizontal (i.e., timing and level) and vertical (i.e., spectral) cues, and when the head is not oriented toward a location in between the direct sound and its echo. Physiological recordings using paired sound sources have been conducted to examine the suppressive effect of the leading sound on the neural responses to the lagging sound (Yin 1994; Fitzpatrick et al. 1995; Litovsky and Yin 1998a, 1998b; Mickey and Middlebrooks 2001; Spitzer et al. 2004; Tollin et al. 2004; Dent et al. 2005, 2009). Those recordings were typically obtained from low-frequency neurons that are sensitive to ITDs in the inferior colliculus (IC) or the auditory cortex. Most of these studies can demonstrate localization dominance, with a few illustrating neural responses consistent with summing localization (Yin 1994; Mickey and Middlebrooks 2001). In contrast, theoretical (Schwartz et al. 1999; Braasch 2013) and biophysical (Xia et al. 2010) models have focused on explaining the localization dominance and deriving echo thresholds. Recent physiological (Day and Semple 2011; Benichoux et al. 2015) and modeling (Joris et al. 2006) studies suggest that neurons in the medial superior olive (MSO) may receive binaural inputs with mismatched frequencies. Here we show that, with a low-threshold potassium current acting as an echo-suppressing mechanism, MSO cells with such mismatched inputs can demonstrate behaviors correlated to the PE. The biophysical model for MSO neurons used here was modified from a cochlear-nucleus bushy-cell model (Rothman and Manis 2003), with input from an auditory nerve (AN) model (Zilany et al. 2009). MSO is the first binaural stage that exhibits sharp tuning to ITDs. Comparisons between the performance of MSO and IC models made by a previous study (Xia et al. 
2010) indicated that the former is insufficient in explaining echo thresholds, whereas the latter, with delayed inhibition to suppress responses to the lagging sound, can better predict the echo thresholds. Our novel simulation approach using low-threshold potassium current and mismatched input frequencies can demonstrate summing localization, localization dominance, and the perception of echoes (two sounds). The goal was not to explain the behavioral echo thresholds, but to illustrate how much of the PE can be explained at the level of the MSO, mostly due to the presence of the potassium current as a delayed rectifier, without invoking central inhibition.
METHODS
Subjects and experimental setup. All surgical and experimental procedures complied with the guidelines of the University of Wisconsin Animal Care and Use Committee and the National Institutes of Health. The protocol involving operations on cats was approved by the University of Wisconsin Animal Care and Use Committee. Four adult female cats were implanted with a stainless steel coil (S170012A7-FEP; 137 Alan Baird Industries, Ho-Ho-Kus, NJ) under the conjunctiva of the eye during aseptic surgeries to measure eye position. The coils were threaded subcutaneously and attached to connectors that were embedded in a dental acrylic cap on the skull. For all of the behavioral responses, we measured gaze position, or eye position in space, which was usually accomplished by a head movement and an eye-in-head movement. The behavioral experiment was conducted in a dimly lit, sound-attenuated chamber (2.2 × 2.5 × 2.5 m). The eye coils were connected to a magnetic search coil system that enabled the vertical and horizontal components of the eye movements to be monitored with a resolution of <1°. Speakers (model 40-1310B; Radio Shack, Fort Worth, TX) and light emitting diodes (LEDs) (2.0-mm-diameter red LEDs, λmax = 635 nm) were placed in the frontal hemifield between ±50° in azimuth and ±30° in elevation, except for the condition in which both speakers were located in the left hemifield. Speakers were hidden from view by a black translucent cloth. An LED was suspended over the center of each speaker, which, when illuminated, could be easily seen through the cloth. Tucker-Davis Technologies System III was used for stimulus generation and data collection. For non-PE trials, a single speaker was selected from a pool of 17 speakers, including a central speaker (azimuth: 0°, elevation: 0°), six speakers in the horizontal plane, six speakers in the sagittal plane, and four speakers along the diagonals (±20°,±20°). Note that some of the trials presented a visual stimulus by an LED mounted at the center of each speaker. The majority of non-PE trials were excluded from the analysis of this study. Those included will be referred to as control or single-source trials. For PE trials, except for the ones in the left hemifield, pairs of speakers were always symmetrical relative to the median sagittal plane or the horizontal plane. In each trial, speakers were selected according to Fig. 1A. Horizontal target pairs were located at (±50°,0°); vertical target pairs were located at (0°,±20°) or (0°,±30°); diagonal pairs were (−20°,+20°) paired with (+20°,−20°), or (−20°,−20°) paired with (+20°,+20°) (Fig. 1A). By convention, the pair (±20°,±20°) refers to the upper right speaker (+20°,+20°) and the lower left speaker (−20°,−20°); similarly, the pair (±20°,∓20°) refers to the lower right speaker (+20°,−20°) and the upper left speaker (−20°,+20°). To test with speakers in one hemifield, the animal was turned sideways, i.e., facing the rightmost speaker. In this condition, the two paired speakers, both located in the left hemifield of the cat, were at (−60°,0°) and (−120°,0°).
Behavioral paradigm, acoustic stimuli, and data analysis. Four cats were trained by operant conditioning with food reward and had all participated in previous auditory and visual tasks. Eye coils were
Fig. 1. Acoustic stimuli and speaker-pair configurations. A: locations of speaker pairs indicated by the same color. There are five stimulus conditions, each containing a pair of speakers along the horizontal, vertical, or diagonal dimension. B: 5-Hz noise-burst trains with various interstimulus delays (ISDs) across the two speakers. The noise was repeated five times with interburst intervals of 200 ms. The duration of each noise burst was 10 ms. The experiment was also repeated with 30-ms bursts (a different noise token) only for one vertical condition, V (0°,±30°). C: a single 50-ms noise burst with various ISDs.
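The timing of the burst train in Fig. 1B can be checked with a small sketch. It assumes that the 200-ms interburst interval is measured onset to onset (i.e., the 5-Hz repetition period); under that reading, five bursts of 10 or 30 ms span 810 or 830 ms, matching the total durations given in the Methods. The function names are hypothetical.

```python
def burst_onsets_ms(n_bursts=5, rate_hz=5.0):
    """Onset time (ms) of each burst in a train repeated at rate_hz."""
    period_ms = 1000.0 / rate_hz  # 200 ms at 5 Hz, assumed onset to onset
    return [i * period_ms for i in range(n_bursts)]


def train_duration_ms(burst_ms, n_bursts=5, rate_hz=5.0):
    """Time from the first burst onset to the last burst offset."""
    return burst_onsets_ms(n_bursts, rate_hz)[-1] + burst_ms


def paired_onsets_ms(isd_ms, n_bursts=5, rate_hz=5.0):
    """Burst onsets for (speaker 1, speaker 2).

    A positive ISD means speaker 1 leads, following the sign convention
    in the text; the lagging speaker plays the same train delayed by |ISD|.
    """
    lead = burst_onsets_ms(n_bursts, rate_hz)
    lag = [t + abs(isd_ms) for t in lead]
    return (lead, lag) if isd_ms >= 0 else (lag, lead)


if __name__ == "__main__":
    print(train_duration_ms(10.0))  # 10-ms bursts
    print(train_duration_ms(30.0))  # 30-ms bursts
    print(paired_onsets_ms(0.5))
```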
calibrated with a behavioral procedure (Populin and Yin 1998; Tollin et al. 2005) that relied on the natural instinct of the cat to look at a light source that suddenly appears in the visual field. Gaze positions were absolute positions in space. The vertical and horizontal components of gaze positions were fit separately with linear equations. The voltage output of the coil system was converted to visual angles using coefficients obtained from the fitting procedure. Cats conducted the experiment with heads unrestrained and an external feeding tube attached to the head post so the reward could be delivered expeditiously. At the beginning of a trial, the cat was trained to fixate an LED presented from straight ahead (0°,0°) and maintain gaze fixation within the acceptance window (a square window of ±6°) for a variable period of time (600–1,000 ms). On non-PE trials, including the control (single-source) trials, a single acoustic or visual signal was presented from 1 of the 17 speaker/LED locations described above when the LED was extinguished. The cat was required to make a gaze saccade to the apparent location of the signal. If during the 600–1,000 ms following the offset of the LED the gaze moved to and remained within a specified acceptance window (a square window of ±12°) around the target location, a food reward was given. See Populin and Yin (1998) for a discussion of the strategy for setting the size of reward windows. Data were analyzed regardless of whether a reward was received. On PE trials, acoustic signals were presented from pairs of speakers with 1 of 23 ISDs covering −30 to 30 ms. Positive ISD corresponded to right or upper targets leading. Importantly, precedence trials were randomly interleaved, at a probability of only 5–10%, with the rest of the trials, i.e., control trials and other types of acoustic or visual trials.
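The linear calibration described above (coil voltage to visual angle, fit separately for each component) can be illustrated with an ordinary least-squares line fit. This is a minimal sketch, not the authors' procedure: the function names and the calibration data are hypothetical, and only the idea of a per-axis linear fit comes from the text.

```python
def fit_line(volts, degrees):
    """Least-squares fit of degrees = gain * volts + offset for one axis."""
    n = len(volts)
    mean_v = sum(volts) / n
    mean_d = sum(degrees) / n
    sxx = sum((v - mean_v) ** 2 for v in volts)
    sxy = sum((v - mean_v) * (d - mean_d) for v, d in zip(volts, degrees))
    gain = sxy / sxx
    offset = mean_d - gain * mean_v
    return gain, offset


def volts_to_degrees(v, gain, offset):
    """Convert a coil voltage to a gaze angle with the fitted coefficients."""
    return gain * v + offset


# Hypothetical calibration data: horizontal coil voltage recorded while
# the cat looked at LEDs at known azimuths.
led_azimuth_deg = [-50.0, -20.0, 0.0, 20.0, 50.0]
coil_volts = [-2.5, -1.0, 0.0, 1.0, 2.5]

gain, offset = fit_line(coil_volts, led_azimuth_deg)

if __name__ == "__main__":
    print(gain, offset)
    print(volts_to_degrees(0.5, gain, offset))
```

The vertical component would be handled the same way with its own pair of coefficients.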
Because the precedence stimuli may produce an illusory perceptual location and there is no correct location of the target, all precedence trials were rewarded. Overall, 10 trials or more were obtained for each ISD under each condition except in a few rare cases. The acoustic target was either a broadband (BB, 0.1–30 kHz) noise-burst train repeated at 5 Hz with a total duration of 810 or 830 ms (Fig. 1B) or a 50-ms BB noise (Fig. 1C). The duration of the noise in the noise-burst train was typically 10 ms; the experiment was also repeated with 30-ms bursts for one vertical pair (0°,±30°). Stimuli used for precedence trials had a fixed level of 50 dB sound pressure level (SPL). The overall sound level of control and other auditory stimuli was roved from 50 to 80 dB SPL in steps of 2 dB. Horizontal and vertical gaze positions were determined separately by a velocity criterion (Populin and Yin 1998; Tollin et al. 2005). The beginning of the gaze movement was marked as the point in time when steady fixation ended, i.e., when the magnitude of the velocity exceeded 2 SDs from the mean velocity computed during the initial steady fixation. The final gaze position was determined at the time when the magnitude of the velocity returned to ≤2 SDs of the baseline velocity. The gaze latency was the time when the fixation ended (i.e., when the gaze movement started) relative to the beginning of the target. For trials with no saccades, either caused by perception at the front center or by failure of localization or detection, the final gaze position was measured at a fixed latency of 600 ms poststimulus onset (usually close to 0°); those trials were excluded in computing the response latency.
Model simulation. We used a biophysical MSO model adapted from the bushy-cell model by Rothman and Manis (2003) to simulate the PE. Details of the cellular model and parameter values can be found in Gai et al. (2009).
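Returning briefly to the gaze analysis described above, the velocity criterion for marking the start and end of a gaze movement can be sketched as follows. This is one plausible reading rather than the published implementation: it assumes evenly sampled velocity for a single component, takes the first `n_baseline` samples as steady fixation, and treats "2 SDs from the mean" as a deviation from the baseline mean. All names are hypothetical.

```python
import statistics


def saccade_bounds(velocity, n_baseline, n_sd=2.0):
    """Return (onset_index, end_index) of a gaze movement, or None.

    velocity   : gaze velocity samples for one component (deg/s)
    n_baseline : number of initial samples treated as steady fixation
    n_sd       : threshold in baseline SDs (2 in the text)
    """
    baseline = velocity[:n_baseline]
    mean_v = statistics.fmean(baseline)
    threshold = n_sd * statistics.stdev(baseline)

    def is_moving(v):
        return abs(v - mean_v) > threshold

    # Onset: first post-baseline sample that exceeds the criterion.
    onset = next(
        (i for i in range(n_baseline, len(velocity)) if is_moving(velocity[i])),
        None,
    )
    if onset is None:
        return None  # no saccade; the text then reads gaze at a fixed 600 ms

    # End: first later sample back within the criterion (or the last sample).
    end = next(
        (i for i in range(onset + 1, len(velocity)) if not is_moving(velocity[i])),
        len(velocity) - 1,
    )
    return onset, end


if __name__ == "__main__":
    trace = [0.1, -0.1, 0.1, -0.1, 0.0, 0.0, 50.0, 200.0, 80.0, 0.05, 0.0]
    print(saccade_bounds(trace, n_baseline=4))
```

Gaze latency would then be the onset index converted to time relative to stimulus onset, and the final gaze position would be read at the end index.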
The model contains a fast sodium current (INa), a high-threshold (IKHT) and a low-threshold (IKLT) potassium current, a hyperpolarization-activated cation current (Ih), and a leak current (Ilk).

$$C_m \frac{dV_m}{dt} = -I_{Na} - I_{KHT} - I_{KLT} - I_h - I_{lk} + I_s(t) \tag{1}$$
Vm is the membrane voltage. Is(t) is the current input. Membrane capacitance (Cm) = 12 pF. In Rothman and Manis (2003), the original conductances and channel time constants are values for the temperature of 22°C; here those values were multiplied by a factor of 3.03 and 0.17, respectively, to mimic the condition of 38°C. The most relevant feature of the model was the IKLT, which can act as a slow negative feedback to suppress delayed input. The MSO model received spiking input from a phenomenological AN model (Zilany et al. 2009), which transformed sound input into spike times. The sound input to both “sides” of the AN model was the 50-dB, 5-Hz, 10-ms noise burst train that was used in the behavioral experiment. To simulate the three observed PE-like behaviors of summing localization, localization dominance, and echo perception/separation, each MSO model cell received conductance input from 10 independent high-spontaneous-rate AN-model fibers for each side; the cochlear nucleus was modeled as a relay in the simulation, as has been done in a number of MSO modeling studies (Colburn et al. 2009; Jercog et al. 2010; Wang et al. 2014b). For each side of the AN model, the 10 model fibers had the same characteristic frequency (CF), whereas the ipsilateral CF can differ from the contralateral CF. To introduce a variety of ITD tunings, i.e., different best/peak ITDs, we introduced mismatched CFs from the contralateral and ipsilateral side to make use of the traveling-wave delays that naturally occur across frequencies (Joris et al. 2006). Differences in conduction delays as in the Jeffress (1948) model from the two sides could also be used to introduce variability in peak ITDs, but these were not implemented in this model. Figure 2B (solid) shows a typical ITD tuning curve in response to a click train (peak level = 70 dB SPL; interclick interval = 80 ms; a total of 50 clicks) with input of matched CFs (500 Hz). The tuning properties were similar to those shown for the MSO model by Xia et al. (2010). When the same model was simulated with mismatched CFs (the ipsilateral CF = 520 Hz; the contralateral CF = 480 Hz; a mismatch of 8%), a clear shift of best ITD toward the contralateral side (ITD = +200 μs) was observed (Fig. 2B, dashed). This value was in line with those recorded in pairs of auditory nerves by Joris et al. (2006) when scaled according to the mismatch of frequencies.
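Two numerical details of the model description can be checked with a short sketch. The text gives the temperature scaling factors (3.03 for conductances, 0.17 for time constants) but not how they were derived; a standard Q10 scaling with Q10 = 2 for conductances and Q10 = 3 for channel kinetics reproduces both numbers, so those Q10 values are an assumption here, not a statement from the source. The 8% CF mismatch follows from the two CFs relative to their 500-Hz mean. Function names are hypothetical.

```python
def q10_scale(q10, temp_from_c, temp_to_c):
    """Rate multiplier for a process with the given Q10 between two temperatures."""
    return q10 ** ((temp_to_c - temp_from_c) / 10.0)


# Assumed Q10 = 2 for conductances (22 C -> 38 C).
conductance_factor = q10_scale(2.0, 22.0, 38.0)

# Assumed Q10 = 3 for channel kinetics; faster kinetics means shorter
# time constants, hence the reciprocal.
time_constant_factor = 1.0 / q10_scale(3.0, 22.0, 38.0)


def cf_mismatch_percent(ipsi_hz, contra_hz):
    """CF difference expressed as a percentage of the mean CF."""
    mean_cf = (ipsi_hz + contra_hz) / 2.0
    return 100.0 * (ipsi_hz - contra_hz) / mean_cf


if __name__ == "__main__":
    print(round(conductance_factor, 2))
    print(round(time_constant_factor, 2))
    print(cf_mismatch_percent(520.0, 480.0))
```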
RESULTS
Stereotypical PE was observed for target pairs separated in azimuth. Figure 3 shows sample horizontal eye movements to single (A) and paired (B–D) horizontal sound sources located at (±50°,0°) obtained with cat 38. When single sound sources were presented, the final gaze positions (A, red and blue traces) were close to the target locations (marked by the arrows on the
Fig. 2. Computational model of the medial superior olive (MSO) and interaural time difference (ITD) tuning curve. A: model structure. Binaural sound with an ITD served as the input to an auditory nerve (AN) model, which generated spiking output. The cochlear nucleus (CN) was modeled as a simple relay. B: ITD tuning curve of the model in response to a single click train. The peak level of the click was 70 dB sound pressure level (SPL). The interclick interval was 80 ms. The solid curve was obtained with matched characteristic frequencies (CFs) of 500 Hz. The dashed curve was obtained with mismatched CFs (the ipsilateral CF = 520 Hz; the contralateral CF = 480 Hz). Positive ITDs correspond to the contralateral ear leading in time.
Fig. 3. Representative horizontal eye movements in response to single or paired sound sources varying in azimuth, H (±50°,0°), obtained with cat 38. The two speaker locations are indicated by the arrows. In A (control with a single source), a 5-Hz noise from a single speaker, either at (+50°,0°) or (−50°,0°), was presented in each trial. In B–D, the two speakers played the same 50-ms noise with various ISDs. A positive ISD was created when the speaker on the right side, i.e., speaker 1 (50°,0°), was leading (red). Time 0 is the beginning of the leading sound.
right axis). When the two sound sources were presented (B–D), the gaze responses varied with the ISDs. First, with zero ISD, the cat looked at a place in between the two sources (Fig. 3B), although biased to the right for this cat (mean = 19.6°; SD = 7.7°). This behavior agrees with the so-called summing localization generally found in PE studies in response to small ISDs (e.g., |ISD| < 0.4 ms, Tollin and Yin 2003b). Second, when ISD = ±0.5 ms, the animal always looked at the leading sound source (Fig. 3C), as accurately and consistently as the response to single sound sources (Fig. 3A), although there are some differences in the kinematics of the gaze movements. This response is consistent with localization dominance. Third, when ISD was further increased to ±30 ms, the cat no longer consistently looked at the leading sound sources (Fig. 3D). Whether the leading source was on the left (blue arrow) or the right (red arrow), the cat made gaze movements to both sources, and the variance in the final gaze positions was larger than what was observed during localization dominance (Fig. 3C). A plausible explanation of the behavior was that the ISD had exceeded the echo threshold, that is, the animal could detect and localize both the leading and the lagging sound sources separately, but was uncertain how to respond.
Figure 4 summarizes the final horizontal (top) and vertical (bottom) gaze positions for paired horizontal sound sources located at (±50°,0°) as a function of ISD for all three cats. The solid blue lines represent responses to pairs of 5-Hz noise burst trains (duration of 10 ms), and the red lines represent responses to pairs of nonrepeated 50-ms noise tokens. All three cats showed clear localization dominance for some intermediate ISD values (e.g., |ISD| between 0.5 and 10 ms), that is, the cat consistently looked at the leading sound sources, and the performance was comparable to the performance for single-source sounds (shaded areas). For smaller ISDs, cats 36 and 38 persistently looked in between the two sound sources as if perceiving a single phantom sound, although the error bars (SDs) were generally larger than the error bars during localization dominance (Fig. 4, A and B, top).
Although the mean gaze responses of cat 33 also lay in between the two target locations, the variance was large, indicating that either this cat failed to perceive a completely fused image or the perception
Fig. 4. Final gaze positions as a function of ISD for paired sound sources varying in azimuth, H (±50°,0°). Symbols represent mean values, and the error bars are SDs. At least 10 trials were obtained for each condition except in rare occasions. The two speaker locations are indicated by the arrows on the y-axis on the right. The sound was either a single 50-ms noise (red) or a 10-ms noise repeated 5 times during 810 ms (dark blue). A positive ISD corresponds to “right-source leading,” i.e., when speaker 1 (50°,0°) was leading. The shaded areas are control responses (mean ± SD) to single-sourced sound. The dashed line in light blue illustrates the average performance obtained in head-restrained cats (Tollin and Yin 2003b).
of a single sound source varied across trials. Because the gaze distribution was not bimodal, it was unlikely that the cat detected both sound sources at those small ISDs. At larger ISDs, e.g., >10 ms, where the echo threshold was presumably exceeded, response variance was even larger. All three cats showed biases toward one side (either left or right) for both types of sound stimuli, except for cat 38 with the 50-ms noise (Fig. 4B, top, red). It is also worth pointing out that, for the two cats that were tested with both types of stimuli, the positive slopes over the range of summing localization (around 0 ISD) were steeper for the 5-Hz burst train than the 50-ms nonrepeated noise, indicating different thresholds of ISDs for reaching localization dominance. For comparison, a typical head-restrained plot was illustrated by the dashed light blue line (Tollin and Yin 2003b). Note that they had a 36° angular separation between the speaker pairs, whereas we had a 100° separation. If the angles were normalized, the slope measured from the head-restrained study would be similar to the shallower slope obtained here with the 50-ms nonrepeated noise. The implication will be addressed in the DISCUSSION. Because the two targets were located in the horizontal plane, in elevation the three cats showed little gaze movement except at very small ISDs for cats 36 and 38 (Fig. 4, A and B, bottom). Interestingly, for these two cats, the ISDs that evoked nonzero vertical movements roughly corresponded to the ISDs with which summing localization occurred. In contrast, the cat that did not show a typical summing-localization behavior (cat 33) had minimal vertical responses (Fig. 4C, bottom). The vertical deviation of the perceived source during trials with ISDs in the summing localization range has been previously reported for head-restrained conditions and may reflect a spectral cue imposed by the slightly delayed click due to comb filtering (Tollin and Yin 2003a).
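The comparison across speaker separations mentioned above can be made concrete with a small sketch. The text does not specify how the angles were normalized; dividing by the speaker eccentricity (half the angular separation, for a pair symmetric about the midline) is an assumption made here, and the slope values in the example are hypothetical.

```python
def normalized_gaze(final_gaze_deg, separation_deg):
    """Gaze position as a fraction of speaker eccentricity.

    Assumes a speaker pair symmetric about the midline, so each speaker
    sits at separation_deg / 2; +1 is one speaker and -1 the other.
    """
    return final_gaze_deg / (separation_deg / 2.0)


def normalized_slope(slope_deg_per_ms, separation_deg):
    """Rescale a gaze-versus-ISD slope so different setups are comparable."""
    return slope_deg_per_ms / (separation_deg / 2.0)


if __name__ == "__main__":
    # Mean gaze of 19.6 deg at zero ISD with the 100-deg separation used here.
    print(normalized_gaze(19.6, 100.0))
    # Hypothetical slopes: 18 deg/ms at 36 deg and 50 deg/ms at 100 deg
    # describe the same normalized slope.
    print(normalized_slope(18.0, 36.0), normalized_slope(50.0, 100.0))
```

Under this normalization a raw slope that looks much shallower in a narrow speaker arrangement can correspond to the same fractional shift per unit ISD.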
An unpredicted observation with cat 38 was that, for the 5-Hz pulse train when ISD = +0.075 ms (i.e., the sound on the right was leading the sound on the left by 0.075 ms), for some reason the cat chose to look at the lagging sound (Fig. 4B, blue). This does not appear to be an artifact because 1) we used the same sound for all three cats, so it is not likely to be a stimulus artifact, and 2) this phenomenon was again observed in the horizontal movements of this cat with ISD = 0.075 ms with sound pairs located diagonally (see below). It was unclear, however, why this behavior occurred with this particular cat for this ISD value and not with the other subjects.
Stereotypical PE was absent for targets in the median sagittal plane. Figure 5 shows sample vertical eye movements to single (A) and paired (B–D) vertical sound sources located at (0°,±30°) obtained with cat 33. The results were similar with closer targets (0°,±20°) (data not shown). For single sources located at (0°,±30°), the gaze responses generally clustered near the true target locations (arrows) with a bit of undershoot, although with larger variance in the final gaze positions (Fig. 5A) compared with the horizontal control conditions (Fig. 3A). In contrast, responses to paired vertical sources were notably scattered (Fig. 5, B–D). There were no clear responses that resembled either summing localization or localization dominance (Fig. 5, B and C). For large ISDs, the vertical responses (Fig. 5D) were highly scattered, and it was difficult to define an echo threshold because the responses to smaller ISDs did not exhibit clear localization dominance (Fig. 5C).
Fig. 5. Representative vertical eye movements in response to single (A) or paired (B–D) sound sources varying in elevation, V (0°,±30°), plotted in the same format as Fig. 3. The data were obtained with cat 33 using the 5-Hz noise. A positive ISD was created when the speaker on the top, i.e., speaker 1 (0°,30°), was leading.
Tollin and Yin (2003b) observed that the cat sometimes made an initial saccade to the leading or the lagging sound, followed by a corrective saccade to the lagging or leading sound. They attributed this behavior to the perception of two separate sound sources, but the behavior happened only on some trials. In the present study we observed this behavior less frequently; for example, no such trials occurred for the azimuthal example (Fig. 3). This behavior was observed at both short and long ISDs for the vertical example (Fig. 5, C and D); more importantly, it occurred only when the initial saccade was toward the sound located at a higher position, whether it was the leading or lagging source. Therefore, we do not think that the double saccades were a reliable measurement of echo thresholds. Figure 6 summarizes the final gaze positions for paired vertical sound sources located at (0°,±30°) as a function of ISD for all three cats. Because sound localization in elevation, but not in azimuth, improves with stimulus duration (Hartmann and Rakerd 1993; Macpherson and Middlebrooks 2000; Gai et al. 2013), we wanted to examine whether the lack of PE-like behaviors observed above was due to insufficient duration of sound presentation. Thus, an extra condition was added to cats 36 and 38 by extending the duration of each noise burst in the 5-Hz train from 10 to 30 ms (green). Similar to the example shown in Fig. 5, vertical gaze responses to all three types of stimuli had large variance and no clear patterns that can be classified as PE.
PE can be observed in both azimuth and elevation for diagonally located targets. Figures 7 and 8 show the final horizontal (top) and vertical (bottom) gaze movements to paired sound sources located diagonally. The speaker pairs were located at (±20°,∓20°) (Fig. 7) or (±20°,±20°) (Fig. 8). For the horizontal component (top), the patterns were similar to what was observed with horizontal targets (±50°,0°) (Fig. 4,
Fig. 6. Final gaze positions for paired sound sources varying in elevation, V (0°,±30°). The format is the same as in Fig. 4 except that an extra condition was obtained with a 30-ms noise repeated 5 times during 830 ms (green) for cat 36 (A) and cat 33 (C). A positive ISD corresponds to “upper-source leading,” i.e., when speaker 1 (0°, 30°) was leading.
top), e.g., there was clear localization dominance at intermediate ISDs, some degree of summing localization at small ISDs, and apparent echo thresholds at the largest ISDs. Interestingly, although the vertical responses (Figs. 7 and 8, bottom) were still less stereotyped than the horizontal components (Figs. 7 and 8, top), they were clearly more reliable than responses to pairs in the sagittal plane (Fig. 6, bottom) and indeed exhibited certain features of PE. For example, cats 36 and 33 showed clear localization dominance at some intermediate ISDs, although no responses can be classified as summing localization (Figs. 7 and 8, A and C, bottom). At larger ISDs, the dominance of the leading sound broke down, and the cats showed bias to one of the two speakers. Cat 38 showed similar, although weaker, patterns with the 5-Hz pulse train but not always with the 50-ms nonrepeated noise (Figs. 7B and 8B, bottom).
Echo thresholds were better defined using pulse trains. The echo threshold is the shortest ISD at which two separate sounds, rather than one fused sound, can be localized. Because we were unable to instruct the cats to signal the perception of two sound sources with our experimental setup, we assume that the echo threshold had been reached when, on a certain number of trials, the cat made a saccade to the position of the lagging sound (Blauert 1997; Tollin and Yin 2003b). Note that sometimes the cat made double saccades, including sequential saccades in opposite directions (examples shown in Fig. 5, C and D). The analyses presented so far were all based on the final response, i.e., gaze movement at the end of the final
Fig. 7. Final gaze positions for paired sound sources varying in diagonal, D (±20°,∓20°). The format is the same as in Fig. 4. Note that, to associate positive ISDs with right-source leading and upper-source leading, every symbol with a positive ISD in the horizontal-gaze plot (top) has its corresponding vertical position as a negative ISD (bottom) for speaker 1 (+20°,−20°) leading and vice versa for speaker 2 (−20°,+20°).
Fig. 8. Final gaze positions for paired sound sources varying in diagonal, D (±20°,±20°). The format is the same as in Fig. 4. A positive ISD in both top and bottom plots corresponded to right-source leading and upper-source leading, i.e., when speaker 1 (+20°,+20°) was leading.
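The sign convention described in the captions of Figs. 7 and 8 can be sketched in a few lines of Python. This is only an illustration of the convention (positive ISD means the right source leads in the horizontal plot and the upper source leads in the vertical plot); the function name `signed_isds` and its arguments are hypothetical and are not taken from the authors' analysis code.

```python
def signed_isds(isd_ms, leading_speaker, speaker1, speaker2):
    """Return (horizontal_isd, vertical_isd) for one paired-source trial.

    isd_ms          : unsigned interstimulus delay in ms (>= 0)
    leading_speaker : 1 or 2
    speaker1/2      : (azimuth_deg, elevation_deg); right and up are positive

    An axis along which the two speakers do not differ gets None.
    Hypothetical helper that only illustrates the caption's convention.
    """
    lead, lag = (speaker1, speaker2) if leading_speaker == 1 else (speaker2, speaker1)

    def axis_isd(lead_coord, lag_coord):
        if lead_coord == lag_coord:
            return None                      # no separation along this axis
        sign = 1 if lead_coord > lag_coord else -1
        return sign * isd_ms                 # positive: right/upper source leads

    return axis_isd(lead[0], lag[0]), axis_isd(lead[1], lag[1])


# Fig. 7 layout: speaker 1 is right and down, speaker 2 is left and up.
print(signed_isds(5, 1, (20, -20), (-20, 20)))   # (5, -5)
# Fig. 8 layout: speaker 1 is right and up, speaker 2 is left and down.
print(signed_isds(5, 1, (20, 20), (-20, -20)))   # (5, 5)
```

For the Fig. 7 layout the same trial therefore appears at a positive ISD in the horizontal plot and at a negative ISD in the vertical plot, whereas for the Fig. 8 layout the two signs agree.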
saccade (we rarely observed more than two saccades with the 4 cats in this study). For the echo-threshold measurement, we used the approach developed by Tollin and Yin (2003b), which analyzes only the initial response. This is not to say that the second saccade is irrelevant for echo-threshold measurements, but to point out that even the first saccade can go in either direction once the echo threshold is reached. For each cat, we first computed the mean horizontal (or vertical, when analyzing vertical pairs or the vertical aspect of diagonal pairs) eye movements toward the location of the leading source, regardless of whether the source was on the right or on the left (or up or down for the paired sources in the vertical plane). As a result, responses toward the location of the leading sound had positive values, and responses away from it had negative values. Next we normalized the orienting responses to the single or paired sources by the mean response to the single sources at the two leading locations. An example of the horizontal normalized response as a function of ISD for speaker pairs located at (±50°,0°) is plotted in Fig. 9 (solid lines). Briefly, a normalized response of 1 indicated that the cat’s initial saccade was always directed to the leading sound source, i.e., localization dominance; a normalized response of 0 indicated that the cat either always looked at the middle or looked toward the leading and the lagging sound sources with equal probability. A value of −1 would mean that the cat always responded to the lagging sound, which was never seen. Because only intermediate and large ISDs (e.g., ≥2 ms) were plotted (Tollin and Yin 2003b), for the horizontal condition the lines started around 1 (in the regime of localization dominance) and gradually decreased at large ISDs (Fig. 9). The echo threshold was estimated as the interpolated ISD at which the normalized response was 0.5 (Dent et al. 2009). The echo thresholds derived this way with the 5-Hz pulse train were 14, 21, and 17 ms for cats 36, 38, and 33, respectively. With the 50-ms nonrepeated noise, the normalized response barely reached the threshold at the largest ISD tested for cat 36 and never reached the threshold for cat 38. Table 1 summarizes the echo thresholds for all the conditions. For horizontal gaze movements (top), an echo threshold could be obtained using the 5-Hz pulse train in all but one condition (cat 38, last row, marked by X). The values varied between 10 and 21 ms. In contrast, no echo threshold was available in most conditions for the 50-ms nonrepeated noise. When the normalized vertical
Fig. 9. Normalized gaze responses as a function of ISD. +1 represents looking at the leading sound source, and −1 represents looking at the lagging source, regardless of which speaker led. The blue and red solid lines represent the horizontal condition (±50°,0°), marked as H50. The ISD that corresponded to the normalized value of 0.5 was interpolated and used as an approximation of the echo threshold. The normalized gaze response for the vertical condition (0°,±30°), marked as V30, is also shown for cat 33 as one example.
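The normalization and the 0.5-criterion interpolation described for Fig. 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the interpolation is assumed to be linear in ISD (the text does not say whether a linear or logarithmic ISD axis was used), and the numbers in the usage example are invented.

```python
import numpy as np


def normalized_response(initial_gaze_deg, lead_sign, single_source_mean_deg):
    """Mean initial gaze shift toward the leading source, scaled to about [-1, 1].

    initial_gaze_deg       : endpoint of the first saccade per trial (right/up positive)
    lead_sign              : +1 if the right/upper source led on that trial, else -1
    single_source_mean_deg : mean response magnitude to the single sources
                             at the two leading locations (normalization constant)
    """
    gaze = np.asarray(initial_gaze_deg, dtype=float)
    sign = np.asarray(lead_sign, dtype=float)
    toward_lead = gaze * sign          # positive: toward the lead; negative: toward the lag
    return float(toward_lead.mean() / single_source_mean_deg)


def echo_threshold_ms(isd_ms, norm_resp, criterion=0.5):
    """First ISD at which the normalized response falls to the criterion.

    Uses linear interpolation between neighboring ISDs (an assumption).
    Returns None when the response never reaches the criterion,
    which corresponds to an "X" entry in Table 1.
    """
    isd = np.asarray(isd_ms, dtype=float)
    resp = np.asarray(norm_resp, dtype=float)
    for i in range(len(isd) - 1):
        above, below = resp[i], resp[i + 1]
        if above > criterion >= below:
            frac = (above - criterion) / (above - below)
            return float(isd[i] + frac * (isd[i + 1] - isd[i]))
    return None


# Invented example values, for illustration only.
isds = [2, 4, 8, 16, 32]
resp = [1.0, 0.9, 0.8, 0.6, 0.2]
print(echo_threshold_ms(isds, resp))                   # approximately 20 ms
print(echo_threshold_ms(isds, [1.0, 0.9, 0.9, 0.8, 0.7]))  # None (no threshold)
```

Returning `None` rather than extrapolating mirrors the way conditions without a criterion crossing are reported as unavailable.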
Table 1. Echo thresholds obtained using normalized gaze movements

Speaker Pair        Cat 36     Cat 38     Cat 33
Horizontal Echo Thresholds, ms
(±50°, 0°)          14 (30)    21 (X)     17 (X)
(±20°, ∓20°)        10 (10)    16 (19)    21 (X)
(±20°, ±20°)        14 (X)     X (X)      14 (X)
Vertical Echo Thresholds, ms
(0°, ±30°)          X (X)      X (X)      X (X)
(±20°, ∓20°)        9 (X)      X (X)      21 (X)
(±20°, ±20°)        4 (X)      X (X)      6 (X)

Echo thresholds obtained with horizontal (top) and vertical (bottom) gaze movements for the 5-Hz condition and the 50-ms condition (in parentheses). X indicates no value available.
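As a quick cross-check of the summary statements about the horizontal thresholds, the horizontal half of Table 1 can be transcribed and summarized. The values below are copied from the table as read here; the dictionary layout and the helper name `available` are only illustrative.

```python
X = None  # "X" in Table 1: no threshold available

# (5-Hz pulse train, 50-ms noise) thresholds in ms, keyed by speaker pair, then cat.
HORIZONTAL = {
    "(±50°, 0°)":   {"cat 36": (14, 30), "cat 38": (21, X),  "cat 33": (17, X)},
    "(±20°, ∓20°)": {"cat 36": (10, 10), "cat 38": (16, 19), "cat 33": (21, X)},
    "(±20°, ±20°)": {"cat 36": (14, X),  "cat 38": (X, X),   "cat 33": (14, X)},
}


def available(table, column):
    """Collect existing thresholds for one stimulus (0 = pulse train, 1 = 50-ms noise)."""
    return [cell[column]
            for row in table.values()
            for cell in row.values()
            if cell[column] is not X]


pulse = available(HORIZONTAL, 0)
noise = available(HORIZONTAL, 1)
print(min(pulse), max(pulse), len(pulse))   # 10 21 8
print(len(noise))                           # 3
```

Eight of the nine horizontal conditions yield a pulse-train threshold, spanning 10 to 21 ms, whereas only three of nine yield a threshold for the 50-ms noise.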
responses were examined (Table 1, bottom), an echo threshold could be measured only for cats 36 and 33 with the 5-Hz pulse train for diagonal speaker pairs. This agrees with the earlier finding that PEs can be observed to a certain extent in elevation when the speaker pairs are located diagonally, but not when they are strictly vertical (i.e., in the median sagittal plane), as shown by the example in Fig. 9 (right, purple dashed line).

Gaze latencies did not consistently reflect whether or not a PE was observed. When cats are experiencing auditory illusions, it is of interest to see whether they take longer to respond to paired sounds than to a single sound. Figure 10 shows examples of the gaze latencies to paired sounds (lines and symbols) and single sources (shaded areas). For cat 33, when it responded to the right hemifield (Fig. 10C), the gaze latencies to paired sounds were generally longer than the latencies to single sound sources. Note that this does not mean the right speaker had to be leading. Comparing Fig. 10 with Fig. 8 makes clear that, for certain left-leading conditions (small negative ISDs), the cat still responded to the right, and the corresponding latencies were long. However, this trend was not observed with either of the
other cats. In general, it was difficult to tell whether a PE occurred simply on the basis of gaze latencies.

PE was hardly observed for targets located in the left hemifield. In previous PE studies, speaker pairs were usually placed symmetrically in the frontal hemifield and rarely located off-center, i.e., both in the left or both in the right hemifield of the subject (Lee et al. 2009). Figure 11A shows the final gaze positions (open symbols) for single (left column) and paired (other columns) sound sources located to the left of the cat at (−120°,0°) and (−60°,0°) using the 5-Hz pulse train. First, large scatter of responses was observed even for a single sound when the target was moved to the back hemifield (−120°,0°) (Fig. 11A, left, blue) compared with the condition in which the targets were restricted to the frontal hemifield (Fig. 4, shaded areas). When paired sound sources were presented with ISD = 0, both cats showed gaze responses near the −60° target (Fig. 11A, second column), rather than in between the speaker pairs. When ISD was gradually increased so that the back target was leading, the response of cat 38 gradually shifted to the back (top row), whereas the response of cat 39 remained in the frontal hemifield (bottom row). Overall, distinct PE-like behaviors were not observed when both speakers were located in the left hemifield.

The MSO model predicted partial PE-like behaviors. The 5-Hz noise-burst train was used as the sound stimulus to the MSO model. Figure 12 plots the firing rate of the MSO model as a function of the CFs of the ipsilateral (y-axis) and contralateral (x-axis) input for single (A) and paired (B) sound sources in azimuth. Here we were simulating MSO neurons located on the left side of the brain. When a single sound was presented to the MSO model with no ITD, model cells that received input from AN model fibers with identical CFs across the two sides generally showed higher firing rates than model cells with mismatched input CFs (Fig. 12A, left, along the diagonal; dotted line). This behavior is expected because neurons with matched CFs received coincident input from the two sides when there was no interaural delay. When an ITD of −0.5 ms
Fig. 10. Gaze latencies as a function of ISD for paired sound sources varying in diagonal, D (±20°,±20°). The sound was either a single 50-ms noise (red) or a 10-ms noise repeated 5 times during 810 ms (blue). A positive ISD corresponds to right-source leading, i.e., when speaker 1 (+20°,+20°) was leading. The shaded areas are control latencies (mean ± SD) to single-source sounds.
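The stimulus durations quoted in the captions (810 ms for the 10-ms burst here, 830 ms for the 30-ms burst in Fig. 6) are consistent with five bursts whose onsets are spaced 200 ms apart, i.e., a 5-Hz repetition rate. The sketch below assumes exactly that timing, with the first burst at 0 ms; the function names are hypothetical. It also shows how many bursts have started by a given gaze latency.

```python
def burst_onsets_ms(rate_hz=5.0, n_bursts=5):
    """Onset times of the bursts, assuming even spacing starting at 0 ms."""
    period_ms = 1000.0 / rate_hz
    return [i * period_ms for i in range(n_bursts)]


def train_duration_ms(burst_ms, rate_hz=5.0, n_bursts=5):
    """Time from the first onset to the end of the last burst."""
    return burst_onsets_ms(rate_hz, n_bursts)[-1] + burst_ms


def bursts_started_by(latency_ms, rate_hz=5.0, n_bursts=5):
    """Number of bursts whose onset is at or before the given latency."""
    return sum(1 for t in burst_onsets_ms(rate_hz, n_bursts) if t <= latency_ms)


print(train_duration_ms(10))     # 810.0
print(train_duration_ms(30))     # 830.0
print(bursts_started_by(150))    # 1
print(bursts_started_by(200))    # 2
```

Under this timing, a gaze shift that begins about 200 ms or less after stimulus onset follows only one or two of the five bursts.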
Fig. 11. Final gaze positions in the left hemifield. A: scatter plot of the raw data for single (column on left) and paired (other columns) sound sources. The two speakers were located at (−120°,0°) and (−60°,0°). Negative ISDs indicated that the speaker in the back hemifield, (−120°,0°), was leading. B: average gaze responses for paired sources (black lines). The error bars are SDs. The shaded areas represent means ± SD obtained with the single sound sources.
(i.e., ipsilateral leading) was added, neurons with matched CFs no longer responded actively (Fig. 12A, middle, dotted line). Instead, neurons whose ipsilateral CF was lower than the contralateral CF fired the most (Fig. 12A, middle, below the dotted line). This is reasonable because the traveling delays for low frequencies are naturally longer than the delays for higher frequencies. Likewise, as already shown in Fig. 2B, for ITD = +0.5 ms, neurons with lower contralateral CFs fired the most (Fig. 12A, right, above the dotted line). For comparison, we plotted the pair of AN fiber CFs (522 and 617 Hz) corresponding to a 450-µs across-fiber delay obtained by Joris et al. (2006; their Fig. 3) on the panel showing our simulation result for an ITD of 0.5 ms in magnitude (Fig. 12A, middle, white asterisks). Tollin and Koka (2009) showed that ITDs of this magnitude can occur in cats for sources at 90° in azimuth. The population of simulated MSO neurons with the highest firing rates (i.e., CFs in the neighborhood of 500 Hz) had mismatched CFs close to the values presented by Joris et al. (2006).

Figure 12B shows model responses to paired sounds with different ISDs. Each paired sound contained one sound with ITD = −0.5 ms to mimic sound coming from the ipsilateral side of the simulated MSO neuron, and another sound with ITD = +0.5 ms to mimic the contralateral input. When the ipsilateral and the contralateral sounds had no ISD relative to each other, the population MSO model’s response (Fig. 12B, top left) was similar to the model response to a single sound with no interaural delay (Fig. 12A, left); that is, the model showed no preference for either the ipsilateral or the contralateral side, although individual neurons’ responses were not identical across the two conditions. In summary, the system responded as if perceiving a fused phantom sound with ITD = 0 (i.e., summing localization). When an ISD of −0.3 ms was added to the two sounds (i.e., the ipsilateral/negative-ITD sound was leading), responses of most of the neurons (Fig. 12B, top middle, CFs between 400 and 1,000 Hz) showed a directional preference for the ipsilateral sound (Fig. 12A, middle), whereas other neurons’ responses did not change (e.g., CFs below 400 Hz). Therefore, at this ISD the model’s behavior shifted toward the ipsilateral preference, but not as much as it did for the single sound with ITD = −0.5 ms. This behavior also agreed with summing localization, which typically showed larger variance than responses to single sources. When ISD was further increased to −0.5 ms, a clearer shift to the ipsilateral side was observed, which agreed with localization dominance (Fig. 12B, top right). For even longer ISDs (e.g., −1 or −3 ms), a second group of neurons corresponding to contralateral leading also began responding actively (Fig. 12B, bottom row, above the dotted line). In other words, the model was showing responses to two separable sounds in a manner consistent with behavioral echo detection, although this occurred at an ISD that was too short to predict the human or
Fig. 12. MSO model responses to single (A) and paired (B) noise-burst (5-Hz, 10-ms) trains. The x- and y-axes are the CFs of the contralateral and ipsilateral AN model fibers, respectively. The color represents the firing rate. For comparison, the pair of CFs (522 and 617 Hz) corresponding to a 450-µs ITD obtained from Joris et al. (2006; their Fig. 3) was marked by white asterisks in A, middle, and B, bottom. The model was simulating responses of MSO neurons located on the left side of the brain.
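The geometry of these maps (matched CFs respond best at zero ITD, and a lower ipsilateral CF responds best when the ipsilateral ear leads) can be illustrated with a toy coincidence calculation. This is emphatically not the Hodgkin-Huxley MSO model or the AN model used in the study. It assumes, purely for illustration, that each input carries a cochlear delay of a fixed number of CF cycles and that firing rate falls off as a Gaussian of the arrival-time mismatch; `K_CYCLES`, `SIGMA_MS`, and all function names are hypothetical, and the values were chosen only to give round numbers.

```python
import numpy as np

K_CYCLES = 1.5   # HYPOTHETICAL: cochlear delay expressed in cycles of CF
SIGMA_MS = 0.1   # HYPOTHETICAL: width of the coincidence window in ms


def cochlear_delay_ms(cf_hz):
    """Toy traveling-wave delay: longer for lower CFs (assumed ~ 1/CF)."""
    return 1000.0 * K_CYCLES / np.asarray(cf_hz, dtype=float)


def toy_rate(itd_ms, cf_ipsi_hz, cf_contra_hz):
    """Relative rate of a toy coincidence detector (1.0 = perfect coincidence).

    itd_ms < 0 means the ipsilateral ear leads, as in the text.
    """
    mismatch = (itd_ms
                + cochlear_delay_ms(cf_ipsi_hz)
                - cochlear_delay_ms(cf_contra_hz))
    return np.exp(-0.5 * (mismatch / SIGMA_MS) ** 2)


def best_ipsi_cf(itd_ms, cf_contra_hz, cf_grid_hz):
    """Ipsilateral CF on the grid that maximizes the toy rate."""
    rates = toy_rate(itd_ms, cf_grid_hz, cf_contra_hz)
    return float(np.asarray(cf_grid_hz)[np.argmax(rates)])


cf_grid = np.arange(300.0, 1001.0, 50.0)
print(best_ipsi_cf(0.0, 600.0, cf_grid))    # 600.0 (matched CFs, on the diagonal)
print(best_ipsi_cf(-0.5, 600.0, cf_grid))   # 500.0 (ipsi leads: lower ipsi CF)
print(best_ipsi_cf(0.5, 600.0, cf_grid))    # 750.0 (contra leads: higher ipsi CF)
```

In this toy picture the extra cochlear delay of the lower-CF input offsets the acoustic lead of that ear, which is the qualitative idea behind using mismatched CFs to generate internal delays.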
cat behavior. Note that the stimulus was the same broadband noise-burst train used in the behavioral experiment. Naturally, different frequency components of a noise may have different delays. With mismatched input CFs, the model neurons showed firing-rate patterns that were somewhat noisy (Fig. 12). Nevertheless, we observed systematic shifts of the firing rates with increasing ISDs.

DISCUSSION
PE can be observed with free-moving heads. The PE has long been observed in humans (e.g., Wallach et al. 1949; Blauert 1971; Zurek 1980; Litovsky et al. 1997; Dizon and Litovsky 2004; Agaeva 2011) and animals (cats: Cranford 1982; Tollin and Yin 2003a, 2003b; Tollin et al. 2010; Dent et al. 2009; flies: Lee et al. 2009; gerbils: Wolf et al. 2010; owls: Spitzer et al. 2003; birds: Dent and Dooling 2003, 2004), under both free-field and headphone conditions. In real-world situations, because the head can move while perceiving a sound from its direct and reflected locations, any useful echo-suppression mechanism should, theoretically, be resistant to head movements. The above-mentioned human, cat, and bird studies were typically conducted when the subjects could not move their heads. The study in cats by Cranford (1982) restrained the neck of the cat; it was unclear how much head movement was allowed with that apparatus. The cat study by Dent et al. (2009) compared head-free and head-restrained performance for a wide range of target separations. Their result was consistent with our previous study using the same search-coil technique (Tollin et al. 2005) in demonstrating that restraining the head of the cat can result in underestimation of the target locations, whether in a single-source or a paired-source test. However, the Dent et al. (2009) study did not provide illustrations from which the slopes of the curves within the summing-localization region could be assessed.

Our previous study showed that head movements are very likely to occur before the gaze reaches its final position (Tollin et al. 2009). One of the acoustic stimuli tested here was the same 5-Hz noise-burst train as that used in the previous head-restrained study (Tollin and Yin 2003b). During the presentation of the burst train, the cats almost always turned their heads and finalized their gaze responses long before the end of the stimulus (Fig. 10), except for those trials in which no saccadic movement occurred. The present study also tested a single 50-ms noise. Although the head can start moving <50 ms after stimulus onset (Ruhland et al. 2013), head movement presumably played a lesser role than in the burst-train condition. Indeed, when comparing the final gaze positions between these two conditions for horizontal targets, H (±50°,0°), a shallower slope in the region of summing localization was observed for both cat 36 and cat 38 when the stimulus duration was 50 ms (Fig. 4, A and B). This is reasonable because, when summing localization occurred, a phantom sound was presumably perceived in between the speaker pairs and biased toward the leading source. If the head moves toward the leading sound, the pinnae will further amplify the leading sound and attenuate the lagging sound, causing a larger preference for the leading source. In other words, when the head could move during the stimulus, localization dominance was achieved at smaller ISDs.
This argument is supported by the fact that measurements obtained from the head-restrained study had a slope similar to the slope for the 50-ms nonrepeated noise after being normalized by the separation angles (Fig. 4A, light blue dashed line). A similar trend was also observed with the diagonal condition, D (±20°,∓20°) (Fig. 7, A and B), but not with the other diagonal condition, D (±20°,±20°) (Fig. 8, A and B). The smaller effect of head movements for the diagonal conditions was possibly due to the fact that the speakers were not as far apart in the diagonal conditions as they were in the horizontal condition. In summary, our data show that all aspects of the PE are seen in cats listening with their heads unrestrained.

PE does not occur uniformly in space. Very few studies have examined the PE in elevation (humans: Litovsky et al. 1997; Dizon and Litovsky 2004; Agaeva 2011; cats: Tollin and Yin 2003a, 2003b). These studies, all with immobile heads, found clear localization dominance, but no summing localization, for targets in the median sagittal plane. In contrast, the present study failed to observe clear PE-like behaviors for targets in the median sagittal plane. When the targets were located diagonally, localization dominance, but not summing localization, was observed in the vertical dimension. It was unclear whether or not the difference among studies was caused by head restraint. In addition, it is worth pointing out that the two horizontal speakers were 100° apart, whereas the two vertical speakers were only 60° or 40° apart. In general, it may not be surprising that the PE differs for the horizontal and the vertical planes considering that sound localization in elevation uses a completely different set of cues, i.e., spectral cues caused by filtering properties of the head and pinnae (Gardner and Gardner 1973; Hebrank and Wright 1974).

The majority of human or animal studies have placed the speakers symmetrically in the frontal hemifield. Lee et al. (2009) measured the PE in the parasitoid fly Ormia ochracea with one speaker placed in front of the fly and the other 90° to one side. They found that, when the front speaker was leading by 5 ms, the fly always walked straight to the front source. When the lateral speaker was leading by 5 ms, the fly walked to a place in between the two speakers, and increasing the ISD to 10 ms did not make the fly respond more laterally. In other words, they were unable to demonstrate localization dominance to the lateral speaker within the parameter range relevant to the PE. Similarly, the present study found that the cats always preferred the frontal speaker (60° to the left) to the back speaker (120° to the left) even when the back speaker was leading. This is possibly due to the attenuation of the back-field sound by the left pinna. This may pose a problem for the echo-suppression mechanism, because if a sound comes from the back hemifield and its echo from the front, the animal will respond to the front under our experimental setup. However, in the real world the direct sound will be louder than the echoes (Dent et al. 2005). Therefore, future studies should repeat the test in a more natural setup. Another observation that might be relevant is the finding that cortical responses in the cat show sharp discrimination of left from right but lower sensitivity to locations within each lateral hemifield (Stecker et al. 2005). Stecker et al. proposed an opponent-process theory in which broadly tuned contralateral and ipsilateral populations function together to account for this observation. This inhomogeneity may explain the lack of PEs within each lateral hemifield.

Echo-threshold measurements.
Using the approach developed previously (Blauert 1997; Tollin and Yin 2003b), we were able to derive echo thresholds with the 5-Hz burst train, but rarely with the 50-ms nonrepeated noise. When examining the normalized response (Fig. 9 as an example), it is clear that the response did not drop significantly at large ISDs with the 50-ms noise. It was unclear whether the animal failed to detect the lagging sound or simply would not respond to it when the stimulus duration was long. This observed behavior agrees with the head-unrestrained study in budgerigars by Dent and Dooling (2003). In that study, the PE was inferred from the percentage of correct responses in detecting the leading source, and the echo threshold was defined as the “lowest point of responding (lowest percent correct discrimination values).” Although this definition allowed them to derive echo thresholds for noise bursts of various durations, i.e., 0.1, 1, and 50 ms, the lowest point was only a small dip in percent correct for the 50-ms noise compared with the decreases for the shorter stimuli (their Fig. 3). This agrees with our finding that the normalized gaze response for the 50-ms noise rarely dropped below 0.5, which we defined as threshold crossing (Fig. 9; Table 1). On the other hand, previous studies have shown a buildup effect with repeated sound pairs having a fixed ISD; that is, the echo threshold increases with time when the same sound pairs are repeated (e.g., Clifton 1987; Tolnai et al. 2014). Here the opposite was observed; the echo threshold for the 5-Hz repeated noise was never higher, and was usually lower, than the measurable threshold for the 50-ms nonrepeated noise (Table 1). However, because the average gaze latencies were around 200 ms or below (Fig. 10), on most of the trials the cats had made their responses after hearing only one or two repetitions of the sound bursts, which was possibly not enough for them to exhibit any buildup effect.

PE is likely to occur in both the MSO and the IC. It has been shown that ITD-sensitive neurons in the IC respond to paired sound sources in a manner consistent with the observed PE-like behaviors in humans and cats (Yin 1994; Fitzpatrick et al. 1995; Litovsky and Yin 1998a, 1998b; Spitzer et al. 2004; Tollin et al. 2004; Dent et al. 2005). These studies examined individual trials that showed interactions between neural responses to the leading sound coming from one side and the responses to the lagging sound coming from the other side, as well as the overall ISD function or the ITD tuning curve. Under this concept, localization dominance is reflected as a reliable response to the leading sound and minimal response to the lagging sound; summing localization can be demonstrated as a backward effect of the lagging sound on the response to the leading sound; perception of echoes corresponds to the case in which two separate responses occur. All three types of responses have been observed in the IC. Xia et al. (2010) tested an MSO model (Brughera et al. 1996) as well as an inferior colliculus model (Cai et al. 1998) to explain the echo threshold (~8–10 ms for humans and cats). By studying the ITD tuning, they found that the echo threshold predicted by the MSO model (no longer than 5 ms) was too short, whereas IC model responses were able to explain the human threshold. Meanwhile, Tollin et al. (2010) showed that the pinna movements also agreed with the eye movements when cats were experiencing the PE.
Because the pinna movements have very short latencies (i.e., on the order of 30 ms), it is likely that behaviors associated with the PE occur subcortically. In addition, Wang et al. (2014a) showed that enhancing GABA inhibition in the IC can introduce greater suppression of the lagging sound source. Taken together, delayed inhibition in the IC has been proposed as a major neural mechanism for the PE. However, it should be pointed out that unilateral ablation of the neocortex has been shown to significantly affect the PE in the cat, especially for long ISDs (Cranford and Oberholtzer 1976).

On the other hand, physiological, behavioral, and modeling studies (Tollin 1998; Tollin and Henning 1998; Hartung and Trahiotis 2001; Trahiotis and Hartung 2002) have suggested that some of the PE can be explained at the auditory periphery through compression and adaptation. The AN model (Zilany et al. 2009) used in our simulations has been shown to accurately replicate the peripheral compression and adaptation. It also extends the PE to longer ISDs compared with predictions from earlier versions of the AN model, but the effect it accounts for is still inadequate compared with what is predicted by central mechanisms (Brown 2012; Brown et al. 2015). More importantly, the biophysical MSO model used here contains a low-threshold potassium current, which has been thought to act as a delayed rectifier (Svirskis et al. 2004). This potassium current will have a similar suppressive effect on delayed input, although not as strong as the delayed inhibition received by IC neurons.

Here we used a novel simulation approach to demonstrate summing localization, localization dominance, and echo perception in a more intuitive way. Each MSO model cell received binaural input with various combinations of CFs to introduce interaural delays. Our model simulation showed a single fused “image” (i.e., population response) for small ISDs lying in between the responses to single sound sources, corresponding to summing localization. As the ISD increased, the fused image resembled the population response to the leading source presented alone. At |ISD| > 0.5 ms, two separate images were observed, each corresponding to one of the two sound sources. In short, this simulation approach can be compared more directly with PE-like behaviors. In agreement with the IC modeling study (Xia et al. 2010), the predicted echo threshold at the level of the MSO was shorter than the prediction at the IC. Nevertheless, our simulation did demonstrate that certain precedence behaviors have already occurred before the information reaches the IC.

This model structure is not completely physiological because MSO neurons have not been found with inputs showing vastly different CFs. Nevertheless, the most interesting model cells were the ones that did not have big differences in the input CFs (Fig. 12, cells close to the diagonals). Note that neurons that are effective in encoding the PE do not necessarily need to have the amount of mismatch in their input CFs required to generate the highest firing rates, such as those observed in Fig. 12B, bottom left. Even neurons with smaller frequency mismatches can demonstrate the PE better than neurons with identical CFs. Indeed, recent physiological recordings in the MSO have revealed evidence for small mismatches (0.1 octaves) in the input CFs (Day and Semple 2011, their Table 1). Our simulation result indicates that MSO neurons with a delayed rectifier, i.e., the low-threshold potassium current, and binaurally mismatched CFs can exhibit certain PE-like behaviors.
ACKNOWLEDGMENTS
We are pleased to acknowledge the assistance of Jane Sekulski in computer programming.

GRANTS
This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC-07177.

DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS
Author contributions: Y.G., J.L.R., and T.C.T.Y. conception and design of research; Y.G. and J.L.R. performed experiments; Y.G. and J.L.R. analyzed data; Y.G., J.L.R., and T.C.T.Y. interpreted results of experiments; Y.G. prepared figures; Y.G. and J.L.R. drafted manuscript; Y.G., J.L.R., and T.C.T.Y. edited and revised manuscript; Y.G., J.L.R., and T.C.T.Y. approved final version of manuscript.
REFERENCES
Agaeva MI. Precedence effect at horizontal and vertical plane and moving lagging signal. Fiziol Cheloveka 37: 35–40, 2011.
Benichoux V, Fontaine B, Franken TP, Karino S, Joris PX, Brette R. Neural tuning matches frequency-dependent time differences between the ears. Elife 4: 2015.
Blauert J. Localization and the law of the first wavefront in the median plane. J Acoust Soc Am 50: 466–470, 1971.
Blauert J. Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: MIT, 1997.
Braasch J. A precedence effect model to simulate localization dominance using an adaptive, stimulus parameter-based inhibition process. J Acoust Soc Am 134: 420–435, 2013.
Brown AD. Temporal Weighting of Binaural Cues for Sound Localization (PhD dissertation). Seattle, WA: Univ of Washington, 2012.
Brown AD, Stecker GC, Tollin DJ. The precedence effect in sound localization. J Assoc Res Otolaryngol 16: 1–28, 2015.
Brughera AR, Stutman E, Carney LH, Colburn HS. A model with excitation and inhibition for cells in the medial superior olive. Aud Neurosci 2: 219–233, 1996.
Cai H, Carney LH, Colburn HS. A model for binaural response properties of inferior colliculus neurons. I. A model with interaural time difference-sensitive excitatory and inhibitory inputs. J Acoust Soc Am 103: 475–493, 1998.
Clifton RK. Breakdown of echo suppression in the precedence effect. J Acoust Soc Am 82: 1834–1835, 1987.
Colburn HS, Chung Y, Zhou Y, Brughera A. Models of brainstem responses to bilateral electrical stimulation. J Assoc Res Otolaryngol 10: 91–110, 2009.
Cranford JL, Oberholtzer M. Role of neocortex in binaural hearing in the cat. II. The “precedence effect” in sound localization. Brain Res 111: 225–239, 1976.
Cranford JL. Localization of paired sound sources in cats: effects of variable arrival times. J Acoust Soc Am 72: 1309–1311, 1982.
Day ML, Semple MN. Frequency-dependent interaural delays in the medial superior olive: implications for interaural cochlear delays. J Neurophysiol 106: 1985–1999, 2011.
Dent ML, Dooling RJ. Investigations of the precedence effect in budgerigars: effects of stimulus type, intensity, duration, and location. J Acoust Soc Am 113: 2146–2158, 2003.
Dent ML, Dooling RJ. The precedence effect in three species of birds (Melopsittacus undulatus, Serinus canaria, and Taeniopygia guttata). J Comp Psychol 118: 325–331, 2004.
Dent ML, Tollin DJ, Yin TCT. Psychophysical and physiological studies of the precedence effect in cats. Acta Acust Unit Acustica 91: 463–470, 2005.
Dent ML, Tollin DJ, Yin TC. Influence of sound source location on the behavior and physiology of the precedence effect in cats. J Neurophysiol 102: 724–734, 2009.
Dizon RM, Litovsky RY. Localization dominance in the median-sagittal plane: effect of stimulus duration. J Acoust Soc Am 115: 3142–3155, 2004.
Fitzpatrick DC, Kuwada S, Batra R, Trahiotis C. Neural responses to simple simulated echoes in the auditory brain stem of the unanesthetized rabbit. J Neurophysiol 74: 2469–2486, 1995.
Gai Y, Doiron B, Kotak VC, Rinzel J. Noise-gated encoding of slow inputs by auditory brain stem neurons with a low-threshold K+ current. J Neurophysiol 102: 3447–3460, 2009.
Gai Y, Ruhland J, Yin TC, Tollin D. Behavioral and Modeling Studies of Sound Localization in Cats: Effects of Stimulus Level and Duration. J Neurophysiol 110: 607–620, 2013.
Gardner MB, Gardner RS. Problem of localization in the median plane: effect of pinnae cavity occlusion. J Acoust Soc Am 53: 400–408, 1973.
Grothe B, Pecka M, McAlpine D. Mechanisms of sound localization in mammals. Physiol Rev 90: 983–1012, 2010.
Hartmann WM, Rakerd B. Auditory spectral discrimination and the localization of clicks in the sagittal plane. J Acoust Soc Am 94: 2083–2092, 1993.
Hartung K, Trahiotis C. Peripheral auditory processing and investigations of the “precedence effect” which utilize successive transient stimuli. J Acoust Soc Am 110: 1505–1513, 2001.
Hebrank J, Wright D. Spectral cues used in the localization of sound sources on the median plane. J Acoust Soc Am 56: 1829–1834, 1974.
Jeffress LA. A place theory of sound localisation. J Comp Physiol Psychol 41: 35–39, 1948.
Jercog PE, Svirskis G, Kotak VC, Sanes DH, Rinzel J. Asymmetric excitatory synaptic dynamics underlie interaural time difference processing in the auditory system. PLoS Biol 8: e1000406, 2010.
Joris PX, Van de Sande B, Louage DH, van der Heijden M. Binaural and cochlear disparities. Proc Natl Acad Sci USA 103: 12917–12922, 2006.
Lee N, Elias DO, Mason AC. A precedence effect resolves phantom sound source illusions in the parasitoid fly Ormia ochracea. Proc Natl Acad Sci USA 106: 6357–6362, 2009.
Litovsky RY, Rakerd B, Yin TC, Hartmann WM. Psychophysical and physiological evidence for a precedence effect in the median sagittal plane. J Neurophysiol 77: 2223–2226, 1997.
Litovsky RY, Yin TC. Physiological studies of the precedence effect in the inferior colliculus of the cat. I. Correlates of psychophysics. J Neurophysiol 80: 1285–1301, 1998a.
Litovsky RY, Yin TC. Physiological studies of the precedence effect in the inferior colliculus of the cat. II. Neural mechanisms. J Neurophysiol 80: 1302–1316, 1998b.
Mickey BJ, Middlebrooks JC. Responses of auditory cortical neurons to pairs of sounds: correlates of fusion and localization. J Neurophysiol 86: 1333–1350, 2001.
Macpherson EA, Middlebrooks JC. Localization of brief sounds: effects of level and background noise. J Acoust Soc Am 108: 1834–1849, 2000.
Populin LC, Yin TC. Behavioral studies of sound localization in the cat. J Neurosci 18: 2147–2160, 1998.
Rothman JS, Manis PB. The roles potassium currents play in regulating the electrical activity of ventral cochlear nucleus neurons. J Neurophysiol 89: 3097–3113, 2003.
Ruhland JL, Yin TC, Tollin DJ. Gaze shifts to auditory and visual stimuli in cats. J Assoc Res Otolaryngol 14: 731–755, 2013.
Schwartz O, Harris JG, Principe JC. Modeling the precedence effect for speech using the gamma filter. Neural Netw 12: 409–417, 1999.
Spitzer MW, Bala AD, Takahashi TT. Auditory spatial discrimination by barn owls in simulated echoic conditions. J Acoust Soc Am 113: 1631–1645, 2003.
Spitzer MW, Bala AD, Takahashi TT. A neuronal correlate of the precedence effect is associated with spatial selectivity in the barn owl’s auditory midbrain. J Neurophysiol 92: 2051–2070, 2004.
Stecker GC, Hafter ER. Temporal weighting in sound localization. J Acoust Soc Am 112: 1046–1057, 2002.
Stecker GC, Harrington IA, Middlebrooks JC. Location coding by opponent neural populations in the auditory cortex. PLoS Biol 3: e78, 2005.
Svirskis G, Kotak V, Sanes DH, Rinzel J. Sodium along with low-threshold potassium currents enhance coincidence detection of subthreshold noisy signals in MSO neurons. J Neurophysiol 91: 2465–2473, 2004.
Tollin DJ. Computational Model of the Lateralization of Clicks and Their Echoes, edited by Greenberg S and Slaney M. Proceedings of the NATO Advanced Study Institute on Computational Hearing, 1998, p. 77–82.
Tollin DJ, Henning GB. Some aspects of the lateralization of echoed sound in man. I. The classical interaural-delay based on precedence effect. J Acoust Soc Am 104: 3030–3038, 1998.
Tollin DJ, Koka K. Postnatal development of sound pressure transformations by the head and pinnae of the cat: Binaural characteristics. J Acoust Soc Am 126: 3125–3136, 2009.
Tollin DJ, Yin TC. Spectral cues explain illusory elevation effects with stereo sounds in cats. J Neurophysiol 90: 525–530, 2003a.
Tollin DJ, Yin TC. Psychophysical investigation of an auditory spatial illusion in cats: the precedence effect. J Neurophysiol 90: 2149–2162, 2003b.
Tollin DJ, Populin LC, Yin TC. Neural correlates of the precedence effect in the inferior colliculus of behaving cats. J Neurophysiol 92: 3286–3297, 2004.
Tollin DJ, Populin LC, Moore JM, Ruhland JL, Yin TC. Sound-localization performance in the cat: the effect of restraining the head. J Neurophysiol 93: 1223–1234, 2005.
Tollin DJ, Ruhland JL, Yin TC. The vestibulo-auricular reflex. J Neurophysiol 101: 1258–1266, 2009.
Tollin DJ, Yin TC. Sound localization: neural mechanisms. In: Encyclopedia of Neuroscience, edited by Squire LR. New York, NY: Elsevier, 2009, p. 137–144.
Tollin DJ, McClaine EM, Yin TC. Short-latency, goal-directed movements of the pinnae to sounds that produce auditory spatial illusions. J Neurophysiol 103: 446–457, 2010.
Tolnai S, Litovsky RY, King AJ. 
The precedence effect and its buildup and breakdown in ferrets and humans. J Acoust Soc Am 135: 1406 –1418, 2014. Trahiotis C, Hartung K. Peripheral auditory processing, the precedence effect and responses of single units in the inferior colliculus. Hear Res 168: 55–59, 2002. Wallach H, Newman EB, Rosenzweig MR. The precedence effect in sound localization. Am J Psychol 62: 315–336, 1949.
Wang Y, Wang N, Wang D, Jia J, Liu J, Xie Y, Wen X, Li X. Local inhibition of GABA affects precedence effect in the inferior colliculus. Neural Regen Res 9: 420–429, 2014a.
Wang L, Devore S, Delgutte B, Colburn HS. Dual sensitivity of inferior colliculus neurons to ITD in the envelopes of high-frequency sounds: experimental and modeling study. J Neurophysiol 111: 164–181, 2014b.
Wolf M, Schuchmann M, Wiegrebe L. Localization dominance and the effect of frequency in the Mongolian Gerbil, Meriones unguiculatus. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 196: 463–470, 2010.
Xia J, Brughera A, Colburn HS, Shinn-Cunningham B. Physiological and psychophysical modeling of the precedence effect. J Assoc Res Otolaryngol 11: 495–513, 2010.
Yin TC. Physiological correlates of the precedence effect and summing localization in the inferior colliculus of the cat. J Neurosci 14: 5170–5186, 1994.
Yin TC. Neural mechanisms of encoding binaural localization cues in the auditory brainstem. In: Integrative Functions in the Mammalian Auditory Pathway, edited by Oertel D, Fay RR, Popper AN. New York, NY: Springer, 2002, p. 99–159.
Zilany MS, Bruce IC, Nelson PC, Carney LH. A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics. J Acoust Soc Am 126: 2390–2412, 2009.
Zurek PM. The precedence effect and its possible role in the avoidance of interaural ambiguities. J Acoust Soc Am 67: 953–964, 1980.