Relating the speech-derived frequency-following response to speech intelligibility in noise
The envelope-following response (EFR) is a brainstem auditory evoked potential (AEP) to modulated sound, and its strength has been shown to relate to an individual's ability to detect amplitude modulation perceptually. Recent work from our research group has shown that the EFR to a modulated square-wave stimulus is sensitive to age-related temporal-envelope processing deficits associated with cochlear synaptopathy. This raised our interest in using AEPs as an objective tool to detect problems in speech perception.
Here, we study whether the AEP evoked by a speech token can also predict individual speech recognition performance. Brainstem AEPs contain information phase-locked to the harmonics of the sound, including the fundamental frequency (F0) of the speaker. Because simulations using a model of the auditory periphery suggest that phase-locking to F0 is reduced with cochlear synaptopathy, we hypothesized that the phase-locked brainstem response to the harmonic content of the presented speech token might relate to speech intelligibility. We further investigated whether adding stationary background noise would affect this relationship.
Speech AEPs to 3000 repetitions of the consonant-vowel (CV) syllable /da/, spoken by a male speaker and presented in quiet and in speech-weighted noise, were analyzed in groups of young normal-hearing (yNH), elderly normal-hearing (oNH) and elderly hearing-impaired (oHI) participants. A linear regression analysis was performed with the EFR (adding the responses to + and - polarities) and the spectral FFR (subtracting the responses to + and - polarities) as predictor variables for the speech-reception threshold for broadband and filtered (low-pass, < 1.5 kHz; high-pass, > 1.65 kHz) sentences from the German OLSA test.
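The polarity-based separation described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function names, the sampling rate, the 100-Hz F0, the 700-Hz harmonic and the synthetic responses are all assumptions chosen for illustration, and the sum and difference are divided by two (an assumed scaling convention) so that amplitudes stay comparable to the single-polarity averages.

```python
import numpy as np

def polarity_components(resp_pos, resp_neg):
    """Split averaged responses into envelope and fine-structure parts.

    resp_pos, resp_neg: arrays of shape (n_trials, n_samples) holding the
    responses to the + and - stimulus polarities (hypothetical layout).
    """
    mean_pos = resp_pos.mean(axis=0)
    mean_neg = resp_neg.mean(axis=0)
    efr = (mean_pos + mean_neg) / 2.0   # adding polarities: envelope-related part
    ffr = (mean_pos - mean_neg) / 2.0   # subtracting polarities: fine-structure part
    return efr, ffr

def spectral_amplitude(signal, fs, freq):
    """Single-sided FFT amplitude at the bin closest to `freq` (Hz)."""
    n = signal.size
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]

# Synthetic demonstration (all values are illustrative assumptions).
fs = 10_000                       # sampling rate in Hz
t = np.arange(1000) / fs          # 100-ms analysis window
f0, harmonic = 100.0, 700.0       # assumed F0 and one resolved harmonic
rng = np.random.default_rng(0)

envelope_part = 1.0 * np.sin(2 * np.pi * f0 * t)       # does not invert with polarity
fine_part = 0.5 * np.sin(2 * np.pi * harmonic * t)     # inverts with polarity
noise = lambda: 0.5 * rng.standard_normal((200, t.size))

resp_pos = envelope_part + fine_part + noise()
resp_neg = envelope_part - fine_part + noise()

efr, ffr = polarity_components(resp_pos, resp_neg)
efr_f0 = spectral_amplitude(efr, fs, f0)
ffr_harmonic = spectral_amplitude(ffr, fs, harmonic)
print(f"EFR amplitude at F0: {efr_f0:.2f}")
print(f"Spectral FFR amplitude at harmonic: {ffr_harmonic:.2f}")
```

In this sketch the polarity-invariant component survives the addition and cancels in the subtraction, while the polarity-inverting component does the opposite. Spectral amplitudes extracted in this way could then serve as per-listener predictor variables in a regression against the speech-reception threshold.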
Our results confirm that the amplitude of the speech-derived EFR closely matched that of the EFR to amplitude-modulated stimuli in the same listeners. The EFR was strongest in the yNH group, followed by the oNH group and the oHI group. The group ranking of the EFR explained group differences in the speech-reception threshold for high-pass filtered speech (> 1.65 kHz) in noise. Significant within-group correlations were found when using either the EFR or the spectral FFR to predict speech performance. However, no major differences were found between the responses recorded in the quiet and noise conditions. Our results suggest that the brainstem AEP to a short CV is to some extent informative about the speech-reception threshold.