Embedding enhancement information in the speech signal

  • Position identifier: ESR5
  • Host partner: UEDIN


Speech becomes harder to understand in the presence of noise and other distortions, such as those introduced by telephone channels. This is especially true for people with a hearing impairment. It is difficult to enhance the intelligibility of a received speech+noise mixture, or of distorted speech, even with the relatively sophisticated enhancement algorithms that modern hearing aids are capable of running. A clever way around this problem might be for the sender to add extra information to the original speech signal, before noise or distortion is added. The receiver (e.g., a hearing aid) would then use this information to assist speech enhancement.

The objectives of this project are to:

  • Discover what additional information would be most effective in assisting enhancement algorithms running on hearing aids, e.g., a highly robust voice activity signal;
  • Compare technological interventions in simulated real-life environments for a variety of listeners, using the speech signal to carry this information overtly (‘pre-enhanced speech’) or covertly (using audio watermarking techniques to send side information).

The methodology and speech materials for making these comparisons will be provided by other researchers in the project. The target receiving device is a hearing aid, but the techniques could be applied to other situations where a clean speech signal is mixed with noise; e.g., a television set could perform intelligibility enhancement of a speech + music mixture, assisted by side information carried within the audio signal (without requiring changes in broadcast standards), or use that side information to render visual cues to the listener.

The project will be supervised by Prof. Simon King and will be carried out in collaboration with partners Hoerzentrum (Germany), Fraunhofer Institute for Digital Media (Germany) and Voxygen (France).

Essential requirements

  • An undergraduate or master's degree in a relevant discipline (e.g., Speech and Language Processing, Engineering, Computer Science)
  • Programming skills (Python, Matlab, or another appropriate language)
  • Basic knowledge of speech signal processing

Desirable requirements

  • Knowledge of speech perception, including the effects of hearing impairment
  • Knowledge of speech synthesis, especially statistical parametric techniques
  • Advanced speech signal processing skills
  • Knowledge and experience of statistical analysis techniques and the ability to implement them (e.g., in R)

This position will be re-advertised in 2017, for a start date before the end of September 2017. Once it is advertised, you will need to do two things to apply:

  1. Apply for this job. The position lasts for 36 months from the actual start date.
  2. Apply for entry to our PhD programme, following these instructions