Noise-Robust Dynamic Time Warping Using PLCA Features

ICASSP - IEEE International Conference on Acoustics, Speech, and Signal Processing, March 2012

Published March 25, 2012

B. King, Paris Smaragdis, Gautham Mysore

Conventional speech features, such as mel-frequency cepstral coefficients, tend to perform well in template matching systems, such as dynamic time warping, in low-noise conditions. However, their performance tends to degrade in noisy environments. We propose a method of calculating features using the probabilistic latent component analysis (PLCA) framework. This framework models the speech and noise separately, leading to higher performance in noisy conditions than conventional methods. In this work, we compare our PLCA-based features with conventional features on the task of aligning a high-fidelity speech recording to a noisy speech recording, a scenario common in automatic dialogue replacement.
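For readers unfamiliar with the template matching step the abstract refers to, the following is a minimal sketch of classic dynamic time warping over two sequences of per-frame feature vectors. It is not the authors' implementation and does not compute PLCA features; the function names `dtw_align` and `euclidean`, the Euclidean frame distance, and the toy one-dimensional features are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed frame distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_align(a, b, dist=euclidean):
    """Align two non-empty sequences of feature vectors with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching frame i of `a` to frame j of `b`.
    """
    n, m = len(a), len(b)
    inf = math.inf
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example: the second sequence repeats its first frame once.
clean = [[0.0], [1.0], [2.0]]
noisy = [[0.0], [0.0], [1.0], [2.0]]
total, path = dtw_align(clean, noisy)
```

In an alignment scenario like the one described above, `clean` and `noisy` would hold the per-frame features of the high-fidelity and noisy recordings, and the choice of feature (conventional or PLCA-based) determines how reliable the frame distances are under noise.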


Research Area: Audio