Convolutive Speech Bases and Their Application to Speech Separation

IEEE Transactions on Audio, Speech and Language Processing, 15, 1–12

Published March 20, 2007

Paris Smaragdis

In this paper we present a convolutive basis decomposition method and its application to separating simultaneous speakers in monophonic recordings. The model we propose is a convolutive version of the non-negative matrix factorization algorithm. Because of the non-negativity constraint, this type of coding is well suited to representing magnitude spectra intuitively and efficiently. We present results that reveal the nature of these basis functions, and we show their utility in separating monophonic mixtures of known speakers.
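
To make the idea of a convolutive factorization concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation. It assumes a magnitude spectrogram V of shape (M, N) is approximated by a sum over T time lags of W[t] @ shift(H, t), and it uses KL-divergence-style multiplicative updates of the kind commonly used for non-negative factorizations. The function names (`shift_right`, `shift_left`, `reconstruct`, `conv_nmf`) and all parameter values are hypothetical.

```python
import numpy as np

def shift_right(X, t):
    """Shift the columns of X right by t frames, zero-filling on the left."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, t:] = X[:, :-t]
    return out

def shift_left(X, t):
    """Shift the columns of X left by t frames, zero-filling on the right."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, :-t] = X[:, t:]
    return out

def reconstruct(W, H):
    """W: (T, M, R) stack of lagged bases, H: (R, N) activations -> (M, N)."""
    return sum(W[t] @ shift_right(H, t) for t in range(W.shape[0]))

def conv_nmf(V, R, T, n_iter=50, seed=0, eps=1e-9):
    """Assumed multiplicative-update scheme; all factors stay non-negative."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((T, M, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update the activations, averaging the estimate from each lag.
        ratio = V / (reconstruct(W, H) + eps)
        H_new = np.zeros_like(H)
        for t in range(T):
            H_new += H * (W[t].T @ shift_left(ratio, t)) / (W[t].T @ ones + eps)
        H = H_new / T
        # Update each lagged basis matrix against the new activations.
        ratio = V / (reconstruct(W, H) + eps)
        for t in range(T):
            Ht = shift_right(H, t)
            W[t] *= (ratio @ Ht.T) / (ones @ Ht.T + eps)
    return W, H
```

In this sketch each basis is a short spectro-temporal patch of T frames rather than a single spectrum. One plausible way to use it for known speakers would be to learn a separate set of bases per speaker and then fit only the activations on the mixture, but that step is not shown here.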


Research Area: Audio