12th Speech in Noise Workshop, 9-10 January 2020, Toulouse, FR

The influence of a physiologically inspired complex compression scheme on speech intelligibility in noise

Saskia M. Waechter(a), Vinzenz H. Schönfelder, Sarah Voice, Nicholas R. Clark
Mimi Hearing Technologies GmbH, Research Department, Berlin, Germany

(a) Presenting

Objective: The primary goal of this study was to assess whether speech intelligibility in speech-shaped background noise can be improved by processing the clean speech signal with a complex compression scheme ("Mimi-processing"). The scheme combines an instantaneous feed-forward component with a delayed feedback component and mimics the early stages of the healthy human auditory system. Speech intelligibility measures were compared between processed and unprocessed sentences. A global equal-RMS constraint was imposed so that any benefit could not be attributed to a simple level boost.
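
The equal-RMS constraint can be illustrated with a minimal sketch. It assumes that the constraint amounts to rescaling each processed sentence so that its overall RMS matches that of the unprocessed sentence; the function names and sample values are hypothetical and are not taken from the study.

```python
import math

def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def match_rms(processed, reference):
    """Scale `processed` so that its global RMS equals that of `reference`."""
    gain = rms(reference) / rms(processed)
    return [gain * x for x in processed]

# Hypothetical toy samples: the "processed" signal came out louder.
reference = [0.1, -0.2, 0.3, -0.1]
processed = [0.4, -0.5, 0.9, -0.2]
normalised = match_rms(processed, reference)
```

After rescaling, both signals carry the same overall energy, so a difference in intelligibility cannot be explained by one of them simply being louder.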

Methods: Speech intelligibility was assessed for 35 native German speakers (24-68 years old) with an adaptive speech reception threshold (SRT) test, which estimates the signal-to-noise ratio required to achieve 50% correct word identification. Participants had a mean PTA4 of 9.9 dB HL (SD=7.1 dB HL). SRTs were measured with the German Oldenburg Matrix sentence test (OLSA). Sounds were presented monaurally via Etymotic ER-1 insert earphones. For a sub-cohort of 12 participants, a third condition was assessed in which speech was processed with an ‘equivalent-equaliser’ (equivalent-EQ). For each condition, a psychometric function was fitted to each participant's data with the psignifit toolbox, which implements a maximum-likelihood method for estimating psychometric parameters. In this way, SRTs were estimated for all participants and conditions. The results were analysed as the difference-SRT (in dB) between the unprocessed condition and the respective test condition, averaged across participants.
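
The SRT estimation step can be sketched as follows. This is not psignifit; it is a simplified stand-in that assumes a two-parameter logistic psychometric function and a coarse grid-search maximum-likelihood fit. The trial counts, grid ranges and the sign convention for the difference-SRT (unprocessed minus processed, so that positive values indicate an improvement) are illustrative assumptions.

```python
import math

def logistic(snr_db, srt_db, slope):
    """Proportion of words correct at a given SNR; equals 0.5 when snr_db == srt_db."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - srt_db)))

def neg_log_likelihood(data, srt_db, slope):
    """Binomial negative log-likelihood; data = [(snr_db, n_correct, n_total), ...]."""
    nll = 0.0
    for snr_db, n_correct, n_total in data:
        # Clamp probabilities to keep the logarithms finite.
        p = min(max(logistic(snr_db, srt_db, slope), 1e-9), 1 - 1e-9)
        nll -= n_correct * math.log(p) + (n_total - n_correct) * math.log(1 - p)
    return nll

def fit_srt(data):
    """Grid-search maximum-likelihood fit; returns (srt_db, slope)."""
    srt_grid = [-15 + 0.05 * i for i in range(301)]    # -15 to 0 dB SNR
    slope_grid = [0.1 + 0.05 * j for j in range(59)]   # 0.1 to 3.0 per dB
    return min(((s, k) for s in srt_grid for k in slope_grid),
               key=lambda params: neg_log_likelihood(data, *params))

# Hypothetical data for one participant: (SNR in dB, words correct, words presented).
unprocessed = [(-12, 1, 20), (-10, 4, 20), (-8, 9, 20), (-6, 15, 20), (-4, 19, 20)]
processed = [(-14, 1, 20), (-12, 5, 20), (-10, 11, 20), (-8, 16, 20), (-6, 19, 20)]

srt_unprocessed, _ = fit_srt(unprocessed)
srt_processed, _ = fit_srt(processed)
difference_srt = srt_unprocessed - srt_processed   # assumed sign convention
```

A dedicated toolbox additionally models guess and lapse rates and provides confidence intervals, which this sketch omits.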

Results: In the full cohort, Mimi-processing significantly improved SRTs relative to unprocessed speech, with a mean SRT improvement of 2.77 dB [t(34)=19.78, p<0.0001; two-sided one-sample t-test on the paired differences]. In the sub-cohort (n=12), SRTs differed significantly between the unprocessed, Mimi-processed and equivalent-EQ conditions [ANOVA: F(2,22)=219.73, p<0.0001]. Post-hoc t-tests with Bonferroni correction for multiple comparisons revealed that SRTs in the equivalent-EQ condition were significantly worse than unprocessed SRTs (difference-SRT = -1.31 dB), whereas SRTs in the Mimi-processed condition were significantly better than unprocessed SRTs.
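
The test statistic reported for the full cohort has the form of a one-sample t-test on per-participant difference-SRTs. The sketch below shows that computation together with a Bonferroni-adjusted significance threshold. The difference values and the number of comparisons are hypothetical; the study's individual data are not given here.

```python
import math
import statistics

def paired_t_statistic(differences):
    """One-sample t statistic for paired differences tested against zero.

    Returns (t, degrees_of_freedom).
    """
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)   # sample SD (n - 1 denominator)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons

# Hypothetical difference-SRTs in dB (unprocessed minus processed).
diff_srt_db = [2.1, 3.0, 2.6, 3.4, 2.9, 2.2]
t_value, dof = paired_t_statistic(diff_srt_db)

# Assumed here: three pairwise post-hoc comparisons at a family-wise alpha of 0.05.
alpha_per_test = bonferroni_alpha(0.05, 3)
```

The p-value would then be read from a t distribution with `dof` degrees of freedom and compared against `alpha_per_test` for each post-hoc comparison.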

Conclusions: These results indicate that the Mimi-processing algorithm can improve speech intelligibility for speech presented in noise. This benefit seems to result from its unique compression scheme and does not arise solely from frequency-dependent energy shifting, as represented by the equivalent-EQ condition. This work provides a promising foundation for further refinement of the processing parameters to increase speech intelligibility in noise.

Last modified 2020-01-06 19:23:55