12th Speech in Noise Workshop, 9-10 January 2020, Toulouse, FR

Perceptual adaptation to speech variation when listening to vocoded speech: Preliminary results

Olivier Crouzet(a)
LLING UMR6310 - Université de Nantes / CNRS, France

Etienne Gaudrain(b)
CNRS, Lyon Neuroscience Research Center, France | University of Groningen, University Medical Center Groningen, Netherlands

Deniz Başkent(b)
University of Groningen, University Medical Center Groningen, Netherlands

(a) Presenting
(b) Attending

Background: When processing speech signals, listeners adapt to several sources of variation. Some are associated with 'voice' information (vocal characteristics, speaker identity, gender, emotional state...) while others relate to phonological categories (phonemic classes / distinctive features). It has been shown that variations in acoustic context or individual voices can significantly influence speech identification performance in normal-hearing listeners. Alterations in acoustic context can shift phonological boundaries (Ladefoged & Broadbent, 1957, JASA 29:98; Sjerps, McQueen & Mitterer, 2013, AP&P 75:576), while increasing variation in multi-speaker conditions can hinder speech identification performance (Goldinger, 1996, JEP:LMC 22:1166). Chang & Fu (2006, JSLHR 49:1331), comparing normal-hearing (NH) listeners processing vocoded speech with cochlear-implanted (CI) listeners, showed that CI participants experienced consistent difficulties in multi-speaker conditions, whereas the effect was more variable in NH listeners, depending on vocoder parameters. Therefore, some of the speech recognition difficulties that CI listeners experience on a daily basis may originate in a limited ability to appropriately 'separate' 'voice' properties from 'speech' cues. As stated by Chang & Fu (2006), 'acoustic variations among talkers may be confused with acoustic differences between phonemes' as a consequence of the low spectral resolution of the CI. In order to better understand perceptual adaptation effects with vocoded speech, a replication of the seminal work by Ladefoged & Broadbent (1957) was designed.

Method: 15 Dutch monosyllabic word pairs were selected (e.g. [kip] 'tilted' vs. [kIp] 'chicken'). For each vowel contrast (word pair), an acoustic continuum was generated on the basis of the actual formant frequencies. Each of these monosyllabic words was concatenated at the end of a short fixed carrier sentence (English translation: 'Please say what this word is:'), which was then submitted to various acoustic manipulations: either changes in the F1~F2~F3 formant space along the vowel-continuum dimension, or changes in vocal-tract length (VTL), which also affect formant frequencies. The aim of this study was to investigate how mechanisms of adaptation to these two sources of voice variation influence phonological classification (specifically vowel identification) when speech signals are processed through channel vocoding.
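For illustration only, the following is a minimal Python/SciPy sketch of a generic noise-band channel vocoder of the kind referred to above; the channel count, frequency range and envelope cutoff are placeholder values and do not correspond to the parameters of the present experiment.

import numpy as np
from scipy.signal import butter, sosfiltfilt

def channel_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=6000.0,
                   env_cutoff=300.0):
    """Replace spectral fine structure in each band with envelope-modulated noise."""
    # Logarithmically spaced band edges (fs is assumed to exceed 2 * f_hi).
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    env_sos = butter(2, env_cutoff, btype='low', fs=fs, output='sos')
    rng = np.random.default_rng(0)
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype='band', fs=fs, output='sos')
        band = sosfiltfilt(band_sos, signal)
        # Envelope extraction: rectification followed by low-pass filtering.
        envelope = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0.0, None)
        # Carrier: band-limited white noise modulated by the channel envelope.
        noise = sosfiltfilt(band_sos, rng.standard_normal(len(signal)))
        out += envelope * noise
    # Match the overall RMS level of the original signal.
    out *= np.sqrt(np.mean(signal ** 2) / np.mean(out ** 2))
    return out

Such noise-band vocoding removes spectral detail while preserving the temporal envelope in each channel, which is the degradation that motivates the present comparison between NH listeners and CI users.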

Results: Preliminary results have been collected from an online experiment in order to estimate effect sizes, to test statistical modelling approaches, and to perform simulation-based power analyses for the final data collection. These preliminary results, along with the power-analysis simulations, will be discussed at the conference.
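As an illustration of the simulation-based power analysis mentioned above, a sketch along the following lines (Python with statsmodels) could be used; the boundary shift, slope, trial counts and the simplified logistic model pooling over participants are illustrative assumptions, not the actual design or the preliminary estimates of the study.

import numpy as np
import statsmodels.api as sm

def simulate_power(n_subjects=20, n_steps=7, n_reps=10,
                   boundary_shift=0.5, slope=1.5, n_sims=200, alpha=0.05):
    """Estimate power to detect a context-induced shift of the vowel boundary."""
    rng = np.random.default_rng(1)
    steps = np.linspace(-3, 3, n_steps)        # vowel continuum (arbitrary units)
    detected = 0
    for _ in range(n_sims):
        rows = []
        for _subj in range(n_subjects):
            for context in (0, 1):             # two carrier-sentence manipulations
                # The context shifts the category boundary along the continuum.
                p = 1.0 / (1.0 + np.exp(-slope * (steps - context * boundary_shift)))
                k = rng.binomial(n_reps, p)    # identification responses per step
                rows += [(s, context, kk, n_reps - kk) for s, kk in zip(steps, k)]
        rows = np.asarray(rows, dtype=float)
        X = sm.add_constant(rows[:, :2])       # intercept, continuum step, context
        y = rows[:, 2:4]                       # (successes, failures) per cell
        fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
        if fit.pvalues[2] < alpha:             # p-value of the context term
            detected += 1
    return detected / n_sims

Repeating such simulations over a grid of assumed effect sizes and participant numbers gives an estimate of the sample size required for the final data collection; a mixed-effects formulation with by-participant random effects would be the more realistic (and more conservative) choice for the actual analysis.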
