A novel Speech intelligibility improvement method using maximizing Mutual Information measure
Regular paper
Imam Khomeini International University
Tuesday 2 June, 2015, 17:40 - 18:00
0.3 Copenhagen (49)
Abstract:
We propose a novel speech pre-processing algorithm for speech
intelligibility improvement in noisy environments. Speech
intelligibility improvement algorithms are often employed in Public Address
Systems (PAS), where a clean audio message is played over loudspeakers
throughout a public place such as a train station, a stadium, or an
airport. These algorithms modify the clean signal so that it is more
intelligible for the listener in the presence of additive background noise.
Our proposed method uses an Objective Intelligibility Measure (OIM) to
obtain optimal parameters for redistributing the energy of the clean signal
in the sub-band domain under an energy constraint. It has recently been
shown that
Mutual Information as an OIM can successfully predict speech intelligibility
[1]. Hence, our algorithm redistributes energy by maximizing the mutual
information between the spectral envelopes of the clean speech and the noisy
modified speech. The optimal parameters, which are the energy gains of the
different frequency bands, are obtained from this maximization. It is also
possible to obtain these parameters adaptively over short blocks of speech. We
compare our method with a reference method [2] that uses a perceptual
distortion measure to optimally redistribute speech energy over time and
frequency. The intelligibility scores obtained with the STOI and CSII
measures in four noise conditions (babble, train, white, and factory noise)
show that our proposed algorithm provides a significant gain over the
unprocessed speech signal and achieves higher scores than the reference
method.
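
The energy redistribution described above can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that each sub-band behaves like an independent Gaussian channel, so that the mutual information in band k is 0.5 * log2(1 + g_k * S_k / N_k), where S_k and N_k are the speech and noise energies and g_k is the energy gain. The abstract does not state this model. Under that assumption, maximizing the summed mutual information with the total speech energy held fixed reduces to classical water-filling. The function names `waterfill_gains` and `mutual_information` and the example band energies are hypothetical.

```python
import numpy as np


def mutual_information(speech_energy, noise_energy, gains):
    """Summed per-band mutual information in bits (assumed Gaussian model)."""
    S = np.asarray(speech_energy, dtype=float)
    N = np.asarray(noise_energy, dtype=float)
    g = np.asarray(gains, dtype=float)
    return 0.5 * np.sum(np.log2(1.0 + g * S / N))


def waterfill_gains(speech_energy, noise_energy, iters=100):
    """Hypothetical helper: per-band energy gains g_k >= 0 that maximize
    mutual_information(...) subject to sum(g_k * S_k) == sum(S_k).

    Requires strictly positive speech and noise energies in every band.
    """
    S = np.asarray(speech_energy, dtype=float)
    N = np.asarray(noise_energy, dtype=float)
    total = S.sum()

    # Water-filling: the modified energy in band k is max(mu - N_k, 0).
    # Find the water level mu by bisection so the energies sum to `total`.
    lo, hi = N.min(), N.max() + total
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - N, 0.0).sum() > total:
            hi = mu
        else:
            lo = mu

    modified_energy = np.maximum(0.5 * (lo + hi) - N, 0.0)
    return modified_energy / S


# Illustrative (made-up) band energies: equal speech energy, varying noise.
S = np.array([1.0, 1.0, 1.0, 1.0])
N = np.array([0.1, 0.5, 1.0, 4.0])
g = waterfill_gains(S, N)
print("gains:", np.round(g, 3))
print("MI unprocessed:", mutual_information(S, N, np.ones_like(S)))
print("MI redistributed:", mutual_information(S, N, g))
```

Under this simplified model the gains move energy toward the less noisy bands and can set the gain of a very noisy band to zero. To mimic the adaptive variant mentioned in the abstract, the same routine could be applied to the band energies of each short block of speech.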
[1] J. Taghia, R. Martin, and R. C. Hendriks, "On mutual information as a
measure of speech intelligibility," ICASSP, 2012, pp. 65–68.
[2] C. H. Taal, R. C. Hendriks, and R. Heusdens, "A speech preprocessing
strategy for intelligibility improvement in noise based on a perceptual
distortion measure," ICASSP, 2012, pp. 4061–4064.