| old_uid | 3945 |
|---|
| title | Audio-visual speech perception and intelligibility in noise: first results |
|---|
| start_date | 2008/01/28 |
|---|
| schedule | 11h-12h30 |
|---|
| online | no |
|---|
| summary | Jacqueline Leybaert 1 & Frédéric Berthommier 2
Normally-hearing individuals show better speech intelligibility in noise when the noise is fluctuating than when it is steady (the auditory masking release, or MR, effect). Adults with a cochlear deficit do not show a masking release effect, even if they are fitted with a cochlear implant. A question that has not yet been explored is how the MR effect is modified by the addition of visual information in normally-hearing and hearing-impaired individuals. This question was the starting point of our experimental project.
We designed a first experiment to measure the intelligibility of 16 French consonants in /aCa/ context, presented in stationary noise or in frequency-modulated noise (2 Hz, 8 Hz, 32 Hz, or 128 Hz), in three conditions: audio alone (AO), audiovisual (AV), and visual alone (VO). The signal-to-noise ratio was -23 dB, leading to a performance of 20% correct responses in stationary noise. The results from a group of normally-hearing adults show a benefit from adding visual information to the degraded speech signal. This benefit was larger (i) when auditory information was highly degraded (i.e., in stationary noise); (ii) for place of articulation and manner of articulation than for voicing or nasality; (iii) for the consonants that are visually salient. These results are highly consistent with the literature on audio-visual speech perception.
In a second step, we designed a simplified version of the experiment, better suited to children. It consisted of two sessions (one with voiced, the other with voiceless consonants), each comprising 6 phonemes corresponding to 6 visemes (e.g., b, d, g, v, z, j). We used three noise conditions: stationary noise, 128 Hz, and 8 Hz. We collected data from children in 2nd, 4th, and 6th grade, and from adults. The results show that the benefit from adding visual speech appeared at each age level and did not vary with age. The presentation of McGurk stimuli (e.g., an auditory /b/ coupled with a visual /g/) led to an interesting demasking effect: in stationary noise, most of the responses were visual, while fusions appeared in fluctuating noise. The amount of integration responses did not seem to vary with age level or language experience.
In a third step, the "unvoiced" version of the experiment was administered to a group of deaf children fitted with a cochlear implant (CI), and to a group of children with specific language impairment (SLI). As expected, the children with CI did not show the auditory masking release effect. They showed a large benefit from the addition of visual cues. Finally, they did not show fusion responses to the McGurk stimuli. Children with SLI, on the other hand, did show the auditory masking release effect. Some of them had difficulties in speechreading and, relatedly, did not experience fusion percepts in response to McGurk stimuli.
The discussion of all these data will center on the notion of audio-visual balance, which seems necessary to experience audio-visual integration.
1 Université libre de Bruxelles (U.L.B.), LAPSE, 50, avenue F.D. Roosevelt, 1050 Brussels, Belgium
2 Institut de la Communication Parlée, Gipsa-Lab, UPRESA CNRS 5009, INPG, 46, avenue Viallet, 38031 Grenoble, France |
|---|
| responsibles | Information not available |
|---|
| |