Chapter title | Sources of Variability in Consonant Perception and Implications for Speech Perception Modeling |
---|---|
Chapter number | 46 |
Book title | Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing |
Published in | Advances in experimental medicine and biology, April 2016 |
DOI | 10.1007/978-3-319-25474-6_46 |
Pubmed ID | |
Book ISBNs | 978-3-31-925472-2, 978-3-31-925474-6 |
Authors | Johannes Zaar, Torsten Dau |
Editors | Pim van Dijk, Deniz Başkent, Etienne Gaudrain, Emile de Kleine, Anita Wagner, Cris Lanting |
Abstract | The present study investigated the influence of various sources of response variability in consonant perception. A distinction was made between source-induced variability and receiver-related variability. The former refers to perceptual differences induced by differences in the speech tokens and/or the masking noise tokens; the latter describes perceptual differences caused by within- and across-listener uncertainty. Consonant-vowel combinations (CVs) were presented to normal-hearing listeners in white noise at six different signal-to-noise ratios. The obtained responses were analyzed with respect to the considered sources of variability using a measure of the perceptual distance between responses. The largest effect was found across different CVs. For stimuli of the same phonetic identity, the speech-induced variability across and within talkers and the across-listener variability were substantial and of similar magnitude. Even time-shifts in the waveforms of white masking noise produced a significant effect, which was well above the within-listener variability (the smallest effect). Two auditory-inspired models in combination with a template-matching back end were considered to predict the perceptual data. In particular, an energy-based and a modulation-based approach were compared. The suitability of the two models was evaluated with respect to the source-induced perceptual distance and in terms of consonant recognition rates and consonant confusions. Both models captured the source-induced perceptual distance remarkably well. However, the modulation-based approach showed a better agreement with the data in terms of consonant recognition and confusions. The results indicate that low-frequency modulations up to 16 Hz play a crucial role in consonant perception. |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 13 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Master | 6 | 46% |
Professor | 3 | 23% |
Student > Ph.D. Student | 3 | 23% |
Researcher | 1 | 8% |
Readers by discipline | Count | As % |
---|---|---|
Engineering | 5 | 38% |
Psychology | 2 | 15% |
Agricultural and Biological Sciences | 2 | 15% |
Neuroscience | 2 | 15% |
Medicine and Dentistry | 2 | 15% |
Other | 0 | 0% |