The influence of a physiologically inspired complex compression scheme on perceived listening effort for speech in noise
Objective: The primary goal of this study was to assess whether the listening effort required for speech recognition can be decreased by processing a clean speech signal with a complex compression scheme consisting of an instantaneous feed-forward component and a delayed feedback component, mimicking the early stages of the healthy human auditory system ("Mimi-processing"). Listening effort measures were compared between processed and unprocessed sentences. A global equal-RMS constraint was imposed to rule out any influence of a level boost.
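As an illustration of the global equal-RMS constraint, the following minimal sketch rescales a processed signal so that its RMS over the whole signal matches that of the unprocessed reference. The function names are hypothetical, and the `tanh` nonlinearity is only a stand-in for some level-changing processing; it is not the Mimi-processing algorithm, whose details are not given here.

```python
import math


def rms(signal):
    """Root-mean-square value computed over the entire signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def match_global_rms(processed, reference):
    """Scale `processed` by one global gain so its RMS equals that of `reference`."""
    processed_rms = rms(processed)
    if processed_rms == 0.0:
        raise ValueError("processed signal is silent; cannot match RMS")
    gain = rms(reference) / processed_rms
    return [gain * s for s in processed]


# Toy reference signal: 5 cycles of a sine wave over 1000 samples.
reference = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]

# Placeholder nonlinearity that changes the level (NOT Mimi-processing).
processed = [math.tanh(3.0 * s) for s in reference]

# After matching, both signals have the same overall RMS level.
matched = match_global_rms(processed, reference)
```

Because a single gain is applied to the whole signal, the processing can still redistribute level over time; only the overall level is held equal.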
Methods: Perceived listening effort was assessed in 30 participants aged 21 to 60 years (mean = 32.5 years, SD = 10.7 years) with the ACALES procedure (Krueger et al., 2017). Participants had an average PTA4 of 9.4 dB HL (SD = 5.5 dB HL) in their better ear. The ACALES method applies a rating scale in an adaptive procedure to measure perceived listening effort over a wide range of (individualised) SNRs without producing ceiling effects. Participants rated their effort from 1 = ‘(almost) no effort’ to 13 = ‘Extremely effortful’, or 14 = ‘Only Noise’. Sounds were presented binaurally via Etymotic ER-1 insert earphones. The cohort was divided into three groups, each assessed with a different noise type: speech-shaped noise (SSN), multi-talker babble (MTB) or cafeteria noise.
For each condition and participant, a two-slope function was fitted to the data points. The measure of interest is the SNR-distance between the fitted functions of two listening conditions at equal ratings. The mean SNR-distance across ratings was calculated per participant; it indicates how much the SNR can differ between two conditions while still yielding equal average effort ratings.
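The SNR-distance measure can be sketched as follows, assuming the fitted two-slope function is piecewise linear with a single breakpoint and strictly negative slopes (effort falls as SNR rises). The parametrisation, the class and function names, the parameter values, and the choice to average over ratings 1 to 13 are all assumptions made for illustration; the fitting step itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class TwoSlope:
    """Hypothetical two-slope effort function: rating as a function of SNR."""
    snr0: float        # breakpoint SNR in dB
    rating0: float     # effort rating at the breakpoint
    slope_low: float   # rating change per dB below the breakpoint (negative)
    slope_high: float  # rating change per dB above the breakpoint (negative)

    def rating(self, snr):
        """Effort rating predicted at a given SNR."""
        slope = self.slope_low if snr < self.snr0 else self.slope_high
        return self.rating0 + slope * (snr - self.snr0)

    def snr_at(self, rating):
        """Invert the function: SNR at which a given rating is predicted."""
        # With negative slopes, ratings above rating0 lie on the low-SNR branch.
        slope = self.slope_low if rating > self.rating0 else self.slope_high
        return self.snr0 + (rating - self.rating0) / slope


def mean_snr_distance(unprocessed, processed, ratings=range(1, 14)):
    """Mean SNR difference (dB) between two conditions at equal ratings.

    Positive values mean the processed condition reaches the same rating
    at a lower SNR than the unprocessed condition.
    """
    distances = [unprocessed.snr_at(r) - processed.snr_at(r) for r in ratings]
    return sum(distances) / len(distances)


# Made-up parameters: the processed curve is the unprocessed curve
# shifted by 2 dB towards lower SNRs.
unprocessed_fit = TwoSlope(snr0=-4.0, rating0=7.0, slope_low=-1.0, slope_high=-0.5)
processed_fit = TwoSlope(snr0=-6.0, rating0=7.0, slope_low=-1.0, slope_high=-0.5)

benefit_db = mean_snr_distance(unprocessed_fit, processed_fit)
```

In this toy case the two curves differ only by a horizontal shift, so the distance is the same at every rating; with real fits the distance generally varies with rating, which is why it is averaged.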
Results: SNR-differences between processed and unprocessed stimuli were significantly different from zero (p < 0.001), with mean Mimi-processing benefits of 2.55 dB (SSN), 2.31 dB (MTB) and 2.22 dB (cafeteria noise). This means that, after Mimi-processing, speech stimuli presented at SNRs 2.22 to 2.55 dB lower (depending on noise type) are rated by the average listener as equally effortful as unprocessed stimuli.
Conclusions: These results indicate that the Mimi-processing algorithm can decrease the perceived listening effort for speech presented in noise. This work provides a promising foundation upon which further improvements of the processing parameters may be implemented to increase speech intelligibility in noise.