Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

Proceedings of the International Conference on Machine Learning (ICML)

Publication date: June 26, 2012

Gautham Mysore, Maneesh Sahani

The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a temporal dimension and improves source separation performance. However, the factorial nature of this model makes the complexity of inference exponential in the number of sound sources. Here, we present a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources. Our algorithm performs comparably to exact inference in the original N-FHMM but is significantly faster. In typical configurations of the N-FHMM, our method achieves around a 30x increase in speed.

Research Areas: AI & Machine Learning, Audio