Publications

Sparse Overcomplete Decomposition for Single Channel Speaker Separation

In Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Publication date: November 17, 2007

M. Shashanka, B. Raj, Paris Smaragdis

We present an algorithm for separating multiple speakers from a mixed single channel recording. The algorithm is based on a model proposed by Raj and Smaragdis [6]. The idea is to extract certain characteristic spectro-temporal basis functions from training data for individual speakers and decompose the mixed signals as linear combinations of these learned bases. In other words, their model extracts a compact code of basis functions that can explain the space spanned by spectral vectors of a speaker. In our model, we generate a sparse-distributed code where we have more basis functions than the dimensionality of the space. We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in data and hence leads to better separation.
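The general recipe in the abstract (learn per-speaker bases, then explain the mixture as a combination of them) can be illustrated with a minimal sketch. This is not the authors' probabilistic model: it swaps in KL-divergence non-negative matrix factorization with an L1 penalty on the activations as a stand-in for their sparsity mechanism. The function names (`kl_nmf`, `separate`), the `sparsity` parameter, and the random stand-in spectrograms are all assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def kl_nmf(V, W=None, n_bases=None, sparsity=0.0, n_iter=200, seed=0):
    """Approximate V by W @ H using KL-divergence multiplicative updates.

    If W is supplied it is held fixed and only the activations H are
    estimated. `sparsity` is an L1 weight on H (an illustrative stand-in,
    not the sparsity prior used in the paper).
    """
    rng = np.random.default_rng(seed)
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], n_bases)) + EPS
        W /= W.sum(axis=0, keepdims=True)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        R = V / (W @ H + EPS)
        # The L1 penalty shows up as an extra term in the denominator.
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + sparsity)
        if learn_W:
            R = V / (W @ H + EPS)
            W *= (R @ H.T) / (H.sum(axis=1)[None, :] + EPS)
            # Keep each basis on the simplex; move the scale into H.
            scale = W.sum(axis=0) + EPS
            W /= scale
            H *= scale[:, None]
    return W, H


def separate(V_mix, W_a, W_b, sparsity=0.0, n_iter=200):
    """Split a mixture magnitude spectrogram using two fixed basis sets."""
    W = np.hstack([W_a, W_b])
    _, H = kl_nmf(V_mix, W=W, sparsity=sparsity, n_iter=n_iter)
    k = W_a.shape[1]
    est_a = W_a @ H[:k]
    est_b = W_b @ H[k:]
    # Soft mask: share of the modeled energy attributed to speaker A.
    mask_a = est_a / (est_a + est_b + EPS)
    return mask_a * V_mix, (1.0 - mask_a) * V_mix


# Random non-negative matrices stand in for magnitude spectrograms.
rng = np.random.default_rng(1)
n_freq, n_frames, n_bases = 8, 40, 12  # overcomplete: 12 bases > 8 bins
V_a = rng.random((n_freq, n_frames))
V_b = rng.random((n_freq, n_frames)) ** 3
W_a, _ = kl_nmf(V_a, n_bases=n_bases, sparsity=0.1, seed=2)
W_b, _ = kl_nmf(V_b, n_bases=n_bases, sparsity=0.1, seed=3)
V_mix = V_a + V_b
S_a, S_b = separate(V_mix, W_a, W_b, sparsity=0.1)
```

Here each speaker gets more bases than there are frequency bins, which is the overcomplete setting the abstract describes. Without some sparsity pressure on the activations such a basis set could explain almost any spectrum, so the penalty is what keeps the code speaker-specific in this sketch.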


Research Areas: AI & Machine Learning, Audio