SpiN 2020 :: program

Open challenges for driving hearing device processing: lessons learnt from automatic speech recognition

Jon P. Barker^(a)
University of Sheffield, United Kingdom

Michael A Akeroyd^(b)
University of Nottingham, United Kingdom

Trevor Cox^(b)
University of Salford, United Kingdom

John Culling^(b)
University of Cardiff, United Kingdom

Simone Graetzer^(b)
University of Salford, United Kingdom

Graham Naylor, Eszter Porter^(b)
University of Nottingham, United Kingdom

(a) Presenting
(b) Attending

Recent advances in machine learning raise the prospect of a new generation of hearing devices able to address the speech-in-noise problem. However, the exact path to this goal remains unclear. In contrast, in the field of automatic speech recognition, new machine learning techniques are transforming speech-in-noise performance. The speed of this progress has been enabled, in part, by a research tradition of 'open challenges'. This talk explains how such challenges operate to drive speech recognition research and how a similar methodology could benefit hearing device development.

To motivate the challenge methodology, the talk will first present the recent CHiME-5 [1] (and ongoing CHiME-6) speech recognition challenges. These have focused on conversational speech in a dinner party scenario, with audio captured by multiple microphone array devices and in-ear binaural microphones. The talk will look at how these challenges have fostered collaboration between research groups specialising in different aspects of the problem, and how they have encouraged system components to be shared, leading to further advances. Some components of these automatic speech recognition (ASR) solutions, such as de-reverberation, multi-channel signal processing and source separation, may be directly relevant for hearing device processing, but they remain unevaluated in a hearing context.

The talk will then present our new project, Clarity [2], which will deliver open challenges specifically designed for hearing aid signal processing and hearing aid speech quality/intelligibility prediction. These tasks are very different from speech recognition, but share common features that motivate a challenge-driven approach. The talk will outline our initial plans, which have also been inspired by other listener-directed challenges such as Blizzard [3], Hurricane [4], REVERB [5] and SiSEC [6]. Our plans are at a very early stage and we are actively seeking input and feedback from the speech-in-noise community.

References:
[1] J. Barker, S. Watanabe, E. Vincent, J. Trmal, "The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines", Interspeech, 2018.
[2] M. Akeroyd, J. Barker, T. Cox, J. Culling, S. Graetzer, G. Naylor, E. Porter, www.claritychallenge.org
[3] S. King, "Measuring a decade of progress in Text-to-Speech", Loquens, Vol. 1, No. 1, 2014.
[4] M. Cooke, C. Mayo, C. Valentini-Botinhao, "Intelligibility-enhancing speech modifications: the Hurricane Challenge", Interspeech, 2013.
[5] K. Kinoshita et al., "A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research", EURASIP Journal on Advances in Signal Processing, doi:10.1186/s13634-016-0306-6, 2016.
[6] F.-R. Stöter, A. Liutkus, N. Ito, "The 2018 Signal Separation Evaluation Campaign", LVA/ICA, Surrey, UK, 2018.

Last modified 2020-01-06 19:23:55