Different neural networks underlie temporal asynchrony and semantic congruency effects with audiovisual speech

Ryan A. Stevenson, Ross VanDerKlok, Sunah Kim, & Thomas W. James
Department of Psychological and Brain Sciences, Indiana University; Program in Neuroscience, Indiana University; Cognitive Science Program, Indiana University
PAN LAB, Indiana University, Bloomington
Background
1) A listener's ability to integrate audiovisual speech can be modulated by varying the temporal synchrony or the semantic congruency of the auditory and visual speech components.
2) As auditory and visual speech become more temporally synchronous, the perceptual fusion rate increases. This correlation makes it difficult to separate the effects of multisensory perceptual fusion from temporal synchrony effects.
3) Semantic congruency effects seen with audiovisual speech are also often confounded by temporal synchrony. When a listener is presented with different auditory and visual utterances, the waveform of the auditory speech does not temporally match the speaker's lip movements, and is thus also temporally asynchronous.
Research Goals
1) Describe the cortical networks that are modulated by changes in temporal synchrony, perceptual fusion, and semantic congruency during the integration of multisensory audiovisual speech.
2) Identify regions, or networks of regions, that are modulated by perceptual fusion beyond what is seen with temporal asynchrony.
3) Identify regions, or networks of regions, that are modulated by semantic congruency beyond what is seen with temporal asynchrony.
Methods: Behavioral
- Stimuli: 10 single-word utterances made by a single female speaker from the Hoosier Audiovisual Multi-Talker Database [6]. Words were: beach, dirt, rock, soil, rain, teeth, neck, face, back, leg.
- Behavioral Prescan: 11 participants completed a prescan behavioral experiment measuring fusion rates of A-V stimulus asynchronies ranging from 0 ms to 300 ms offsets in 33.33 ms intervals.
- Behavioral Task: 2AFC fused-unfused judgment.
- 50% threshold: the offset at which 50% of trials were fused and 50% unfused was calculated for each individual for use in the fMRI session (a sketch of one way to compute this follows below).

Results: Behavioral
[Figure: psychometric function plotting Percent Fused against Temporal Offset (ms), 0-300 ms]
- Mean 50% threshold = 167 ms offset.
- Fusion rates: 0 ms offset = 89% (SD = 8%); 400 ms offset = 7% (SD = 8%); 50% threshold = 54% (SD = 15%); Incongruent = 4% (SD = 4%).
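The poster does not state how the per-subject 50% point was derived from the prescan fusion rates; below is a minimal sketch, assuming a logistic psychometric fit. The function names and the example data are hypothetical, not from the study.

```python
# Hypothetical sketch of the per-subject 50% threshold estimation described
# above, assuming a logistic psychometric function fit with scipy.
import numpy as np
from scipy.optimize import curve_fit

def logistic(offset_ms, midpoint, slope):
    """Probability of a 'fused' response as a function of A-V offset (ms)."""
    return 1.0 / (1.0 + np.exp((offset_ms - midpoint) / slope))

# Offsets tested in the prescan: 0-300 ms in 33.33 ms steps (10 levels).
offsets = np.arange(0.0, 300.1, 100.0 / 3.0)

def fit_50pct_threshold(percent_fused):
    """Fit the psychometric curve; return the offset at 50% fusion."""
    p0 = [150.0, 30.0]  # initial guesses for midpoint and slope
    params, _ = curve_fit(logistic, offsets, percent_fused, p0=p0)
    return params[0]  # the logistic midpoint is the 50% point

# Example with made-up data for one subject (not from the poster):
example = np.array([0.95, 0.94, 0.90, 0.85, 0.70, 0.55, 0.40, 0.25, 0.15, 0.10])
print(f"50% fusion threshold: {fit_50pct_threshold(example):.0f} ms")
```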
Methods: fMRI
- Stimuli: identical to the behavioral prescan.
- Conditions (5 total):
  - Synchronous: 0 ms offset, perceptually fused
  - Asynchronous: 400 ms offset, perceptually unfused
  - Fused: 50% threshold offset, perceptually fused
  - Unfused: 50% threshold offset, perceptually unfused
  - Incongruent: 0 ms offset, perceptually unfused; auditory and visual presentations were semantically incongruent
- 50% threshold trials were split into fused and unfused conditions based on each subject's perception. Note: synchronous-unfused, asynchronous-fused, and incongruent-fused trials were excluded from further analysis (see the sketch after this section).
- Scanning Procedures: 10 fast event-related runs with variable ISI (2-6 s), with 7 trials of each condition per run, for a total of 70 trials per condition, per subject. All orders were counterbalanced.
- Preprocessing: 3-D motion correction, linear trend removal, spatial smoothing (6 mm FWHM), Talairach spatial normalization [9].
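A minimal sketch of the trial-sorting and exclusion rule described above, assuming a simple per-trial record of condition and 2AFC response. All names are illustrative, not from the study's actual analysis code.

```python
# Hypothetical sketch: 50%-threshold trials are split by the subject's
# percept; percept-inconsistent trials in the other conditions are dropped.
from dataclasses import dataclass

@dataclass
class Trial:
    condition: str   # "synchronous", "asynchronous", "threshold", "incongruent"
    fused: bool      # subject's 2AFC response on this trial

def sort_trial(trial):
    """Return the analysis condition for a trial, or None to exclude it."""
    if trial.condition == "threshold":
        # 50%-threshold trials are split by percept into Fused/Unfused.
        return "fused" if trial.fused else "unfused"
    expected = {"synchronous": True, "asynchronous": False, "incongruent": False}
    if trial.fused != expected[trial.condition]:
        return None  # e.g. synchronous-unfused trials are excluded
    return trial.condition

trials = [Trial("threshold", True), Trial("synchronous", False)]
print([sort_trial(t) for t in trials])  # ['fused', None]
```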
Results: fMRI

Perceptual fusion and Temporal Synchrony:
- A. Perceptual fusion sensitive network: Fused > Unfused
- B. Temporal synchrony sensitive network: Synchronous > Asynchronous
- C. Interaction: (Fused - Unfused) - (Synchronous - Asynchronous)
Maps thresholded at p < 0.005 and p < 0.00005.

Semantic Congruency and Temporal Synchrony:
- A. Semantic congruency sensitive network: Synchronous-congruent > Synchronous-incongruent
- B. Temporal synchrony sensitive network: Synchronous-congruent > Asynchronous-congruent
- C. Interaction: (Synchronous - Incongruent) - (Synchronous - Asynchronous)
Maps thresholded at p < 0.005 and p < 0.00005.

[Figure: activation maps with Talairach coordinates. Regions identified across contrasts include superior colliculus, insula, superior temporal cortex, lateral/inferior/middle occipital cortex, extrastriate visual cortex, fusiform gyrus, hippocampus, precuneus, inferior/middle/medial frontal gyri, dorsolateral prefrontal cortex, premotor cortex, primary motor cortex, supplementary motor cortex, and anterior cingulate cortex.]
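As a sketch of how the A, B, and C comparisons above can be expressed as weight vectors over the five condition regressors of a GLM; the regressor ordering and the helper function are assumptions for illustration, not the poster's actual analysis code.

```python
# Hypothetical sketch: interaction contrasts as weight vectors over the five
# condition regressors of a GLM. The ordering below is an assumption.
import numpy as np

conditions = ["synchronous", "asynchronous", "fused", "unfused", "incongruent"]

def contrast(weights):
    """Build a contrast vector from a {condition: weight} mapping."""
    return np.array([weights.get(c, 0) for c in conditions])

# (Fused - Unfused) - (Synchronous - Asynchronous)
fusion_x_synchrony = contrast(
    {"fused": 1, "unfused": -1, "synchronous": -1, "asynchronous": 1})

# (Synchronous - Incongruent) - (Synchronous - Asynchronous): the Synchronous
# terms cancel algebraically, leaving Asynchronous - Incongruent.
congruency_x_synchrony = (contrast({"synchronous": 1, "incongruent": -1})
                          - contrast({"synchronous": 1, "asynchronous": -1}))

print(fusion_x_synchrony)      # [-1  1  1 -1  0]
print(congruency_x_synchrony)  # [ 0  1  0  0 -1]
```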
Discussion: Perceptual Fusion
- Audiovisual multisensory integration is modulated by the temporal synchrony of the auditory and visual components of a stimulus presentation.
- Perceptual fusion, another effect seen with audiovisual speech integration, is often confounded with temporal synchrony effects.
- Here, we measured perceptual fusion effects in BOLD fMRI while controlling for concurrent synchrony effects.
- Both the fusion-sensitive and synchrony-sensitive networks included regions previously shown to be involved in multisensory integration and/or speech processing. To identify regions with BOLD responses modulated by perceptual fusion while controlling for synchrony effects, an interaction contrast was used (as used in [1,7,8]): (Fused - Unfused) - (Synchronous - Asynchronous).
- Results showed a network of brain regions including middle and dorsolateral frontal regions as well as anterior cingulate cortex. This network is theorized to be involved in error detection [10], error likelihood [11], and response conflict [12].
- These effects were driven primarily by the response to the unfused condition, the only condition in which a significant response was seen.
Discussion: Semantic Congruency
- Semantic congruency is also known to be an important factor modulating audiovisual integration. Most semantically incongruent presentations of audiovisual speech include a speaker's lip movements that do not temporally coincide with the auditory speech signal, and thus contain temporal asynchrony.
- Here, we measured semantic congruency effects while controlling for concurrent synchrony effects.
- Both the congruency-sensitive and synchrony-sensitive networks included regions previously shown to be involved in multisensory integration and/or speech processing. To identify regions with BOLD responses modulated by semantic congruency while controlling for synchrony effects, an interaction contrast was used: (Synchronous - Incongruent) - (Synchronous - Asynchronous).
- The majority of semantic congruency effects seen in BOLD activation were eliminated when temporal synchrony was accounted for, with the exception of right motor cortex.
- This null finding suggests that at least some of the commonly found effects seen with semantic congruency may be attributed to temporal synchrony as opposed to semantic congruency.
Conclusions
1) Multisensory networks were identified that responded to stimulus synchrony, perceptual fusion, and semantic incongruency of audiovisual speech stimuli.
2) When temporal synchrony was accounted for, these effects were reduced, particularly in sensory regions.
3) The ability of temporal synchrony to account for much of the effects seen with perceptual fusion and semantic congruency suggests that these effects may be a by-product of temporal asynchrony.
References
1. Stevenson, Altieri, Kim, Pisoni, & James (under review). Neural processing of asynchronous audiovisual speech perception. NeuroImage.
2. Miller & D'Esposito (2005). Perceptual fusion and stimulus coincidence in the cross-modal integration of speech. Journal of Neuroscience, 25.
3. McGurk & MacDonald (1976). Hearing lips and seeing voices. Nature, 264.
4. Calvert, Campbell, & Brammer (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10.
5. Conrey & Pisoni (2006). Auditory-visual speech perception and synchrony detection for speech and nonspeech signals. Journal of the Acoustical Society of America, 119.
6. Sheffert, Lachs, & Hernandez (1996). The Hoosier Multi-Talker Database (No. 21). Bloomington, IN: Speech Research Laboratory, Indiana University. Ed., Pisoni, D. B.
7. Stevenson, Kim, & James (2009). An additive-factors design to disambiguate neuronal and areal convergence: Measuring multisensory interactions between audio, visual, and haptic sensory streams using fMRI. Experimental Brain Research, 198.
8. Stevenson & James (2009). Audiovisual integration in human superior temporal sulcus: Inverse effectiveness and the neural processing of speech and object recognition. NeuroImage, 44.
9. Talairach & Tournoux (1988). Co-planar stereotaxic atlas of the human brain. Thieme Medical Publishers, New York.
10. Ito, Stuphorn, Brown, & Schall (2003). Performance monitoring by the anterior cingulate cortex during saccade countermanding. Science, 302.
11. Brown & Braver (2005). Learned predictions of error likelihood in the anterior cingulate cortex. Science, 307.
12. Carter et al. (1998). Anterior cingulate cortex, error detection, and the online monitoring of performance. Science, 280.
Contact: [email protected]