Speaker and Noise Independent Online Single Channel Speech Enhancement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Publication date: April 19, 2015

Francois Germain, Gautham Mysore

Desirable properties of real-world speech enhancement methods include online operation, single-channel operation, operation in the presence of a variety of noise types, including non-stationary noise, and no requirement for isolated training examples of the specific speaker and noise type at hand. Methods in the literature typically possess only a subset of these properties. Source separation methods, in particular, rarely possess the first and last properties simultaneously. We extend universal speech model based speech enhancement to learn a noise model adaptively and online. Before deployment, we learn the speech model from a general corpus of speech in place of speaker-dependent training examples. This setup provides all of these desirable properties while explicitly modeling speech, making it easy to deploy in real-world systems without the need for additional training examples. Our experimental results show that our method matches the performance obtained when speaker-dependent training data is available.

Research Areas: AI & Machine Learning, Audio