Embedding enhancement information in the speech signal

  • Position identifier: ESR5
  • Host partner: UEDIN

Objectives

Speech becomes harder to understand in the presence of noise and other distortions, such as telephone channels. This is especially true for people with a hearing impairment. It is difficult to enhance the intelligibility of a received speech+noise mixture, or of distorted speech, even with the relatively sophisticated enhancement algorithms that modern hearing aids are capable of running. A clever way around this problem might be for the sender to add extra information to the original speech signal, before noise or distortion is added. The receiver (e.g., a hearing aid) would use this to assist speech enhancement.

The objectives of this project were to:

  • Discover what additional information would be most effective to assist enhancement algorithms running on hearing aids, e.g., a highly robust voice activity signal;
  • Compare technological interventions in simulated real-life environments for a variety of listeners, using the speech signal to carry this information overtly (‘pre-enhanced speech’) or covertly (using audio watermarking techniques to send side information).

The methodology and speech materials for making these comparisons were provided by other researchers in the project. The target receiving device is a hearing aid, but the techniques could be applied to other situations where a clean speech signal is mixed with noise; e.g., a television set could perform intelligibility enhancement of a speech + music mixture, assisted by side information carried within the audio signal (without requiring changes in broadcast standards), or use that side information to render visual cues to the listener.
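To make the covert option more concrete, the following is a minimal sketch of carrying a voice activity flag inside the audio samples themselves. It is an illustration only, not the method used in the project: it uses least-significant-bit (LSB) embedding, which is simple to show but would not survive noise or channel distortion, so a practical system would need a robust watermarking scheme. The function names, the frame length of 160 samples and the use of integer PCM samples are all assumptions made for this example.

```python
# Illustrative sketch only: LSB embedding of one voice-activity bit per frame.
# FRAME, embed_vad and extract_vad are hypothetical names, not from the project.

FRAME = 160  # assumed frame length in samples (10 ms at 16 kHz)


def embed_vad(samples, vad_flags, frame=FRAME):
    """Return a copy of integer PCM samples with each frame's
    voice-activity flag written into the LSB of every sample in that frame."""
    out = list(samples)
    for i, flag in enumerate(vad_flags):
        bit = 1 if flag else 0
        start = i * frame
        stop = min(start + frame, len(out))
        for n in range(start, stop):
            # Clear the lowest bit, then set it to the flag value.
            out[n] = (out[n] & ~1) | bit
    return out


def extract_vad(samples, frame=FRAME):
    """Recover one flag per complete frame by majority vote over the LSBs."""
    flags = []
    for start in range(0, len(samples) - frame + 1, frame):
        ones = sum(s & 1 for s in samples[start:start + frame])
        flags.append(ones * 2 > frame)
    return flags


if __name__ == "__main__":
    # Synthetic integer samples standing in for 16-bit PCM speech.
    speech = [((n * 37) % 2001) - 1000 for n in range(3 * FRAME)]
    marked = embed_vad(speech, [True, False, True])
    print(extract_vad(marked))
```

Each sample changes by at most one quantisation step, so the carried information is effectively inaudible, and the receiver reads it back without any change to the transmission format. The majority vote tolerates a few flipped bits per frame, but not the levels of noise a hearing aid encounters, which is why more robust embedding would be needed in practice.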

The project was supervised by Prof. Simon King and was carried out in collaboration with partners Hoerzentrum (Germany), Fraunhofer Institute for Digital Media (Germany) and Voxygen (France).