Listening effort of natural speaking styles
According to the hyper-hypo model of speech communication, two principles govern human speech production: efficacy and economy. On the speaker's side, there is a constant negotiation between maximizing clarity and minimizing production effort. This leads to natural speech adaptations, from effortless to effortful and vice versa, that aim to maintain intelligibility in dynamically changing listening environments. On the listener's side, speech-in-noise studies have shown that speakers' adaptations benefit the listener by reducing listening effort.
In this work, we aim to objectively measure listeners' effort in processing natural speaking styles. Unlike other studies that vary intelligibility levels by degrading speech (e.g. vocoded speech) or modifying its properties (spectral boosting, duration transformations), our main focus is on the cognitive demands of processing natural speech. Physiological responses to clear, casual and Lombard speaking styles in quiet, reverberation and cafeteria noise were measured using pupil dilation metrics. A total of 40 normal-hearing native Spanish speakers were tested on combinations of these speaking styles and masker types. For the cafeteria noise and reverberation conditions, the signal-to-noise ratio was fixed at a level estimated to yield 70% intelligibility for the speaking style considered most effortful, namely casual speech. To evaluate the performance of the pupil dilation metrics, subjective evaluations were also collected: speech intelligibility through oral responses to the attended stimuli, and listening effort through questionnaires. A mixed-effects model with subjective listening effort and intelligibility scores as fixed factors and participant as a random factor revealed significant effects and interactions on peak pupil dilation. Both masker type and speaking style contributed to this significance. Interestingly, the peak pupil dilation metric follows the pattern of subjective listening effort across speaking styles, which may suggest that it reflects the cognitive effort of processing speech.
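As an illustration of the peak pupil dilation metric mentioned above, the following is a minimal sketch and not the procedure used in this study. It assumes that peak dilation is taken relative to a mean pre-stimulus baseline, which is a common convention; the function name, the baseline duration and the example trace are hypothetical.

```python
from statistics import mean


def peak_pupil_dilation(trace, sample_rate_hz, baseline_s=1.0):
    """Return the baseline-corrected peak pupil dilation for one trial.

    Assumption (not stated in the abstract): `trace` holds pupil diameter
    samples for a single trial, starting `baseline_s` seconds before
    stimulus onset, and the baseline is the mean of that pre-stimulus window.
    """
    n_baseline = int(round(baseline_s * sample_rate_hz))
    if n_baseline <= 0 or n_baseline >= len(trace):
        raise ValueError("trace must contain both baseline and response samples")

    # Mean pupil diameter before stimulus onset.
    baseline = mean(trace[:n_baseline])

    # Largest deviation above baseline after stimulus onset.
    return max(sample - baseline for sample in trace[n_baseline:])


# Hypothetical trial: 2 Hz sampling, 1 s baseline (first two samples).
example_trace = [4.0, 4.0, 4.2, 4.5, 4.3]
ppd = peak_pupil_dilation(example_trace, sample_rate_hz=2, baseline_s=1.0)
```

In practice, blink interpolation and smoothing would normally precede this step; they are omitted here to keep the sketch short.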
Finally, we draw on phonological awareness theory to explain between-participant variability in listening effort and physiological responses. Phonological awareness is a participant's ability to manipulate and process speech sounds. It has been found that for participants with impaired phonological processing, listening effort increases significantly in noisy conditions. To estimate participants' phonological awareness, we designed a battery of phonological tasks for adults. Surprisingly, correlation analyses between the phonological scores and the listening effort evaluations, and between the phonological scores and the physiological metrics, did not show any significant relationships.
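The correlation analysis described above can be sketched as follows. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here (a rank-based coefficient such as Spearman's would be an equally plausible choice). The function name and the toy values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy, made-up values: one phonological score and one effort rating per participant.
phonological_scores = [1.0, 2.0, 3.0]
effort_ratings = [3.0, 2.0, 1.0]
r = pearson_r(phonological_scores, effort_ratings)
```

A coefficient near zero across participants would correspond to the absence of a relationship reported above; a significance test would be needed on top of this to support that conclusion.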