Kernel partial least squares framework for speaker recognition

12th Annual Conference of the International Speech Communication Association (INTERSPEECH)

Published August 27, 2011

Balaji Vasan Srinivasan, D. Garcia-Romero, D. Zotkin, R. Duraiswami

I-vectors are a concise representation of speaker characteristics. Recent advances in speaker recognition have exploited their ability to capture speaker and channel variability to build efficient recognition engines. Inter-speaker relationships in the i-vector space are non-linear. Effective speaker recognition requires good modeling of these non-linearities, which can be cast as a machine learning problem. In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vector space. The resulting recognition system is tested across several conditions of the NIST SRE 2010 extended core data set and compared against state-of-the-art systems: Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), and Cosine Distance Scoring (CDS) classifiers. Improvements over these systems are shown.
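
For readers unfamiliar with the CDS baseline named in the abstract, the following is a minimal sketch of cosine scoring between two i-vectors. It is not the authors' implementation: the function names `cosine_score` and `accept`, the default threshold of 0.5, and the use of plain Python lists for i-vectors are all assumptions made for illustration, and any channel compensation applied before scoring is omitted.

```python
import math


def cosine_score(w_target, w_test):
    """Cosine similarity between two i-vectors (equal-length sequences of floats)."""
    if len(w_target) != len(w_test):
        raise ValueError("i-vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm_target = math.sqrt(sum(a * a for a in w_target))
    norm_test = math.sqrt(sum(b * b for b in w_test))
    if norm_target == 0.0 or norm_test == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / (norm_target * norm_test)


def accept(w_target, w_test, threshold=0.5):
    """Accept the trial if the score reaches the threshold (hypothetical value)."""
    return cosine_score(w_target, w_test) >= threshold
```

The score depends only on the angle between the two vectors, not on their lengths, so it is a linear similarity measure. The abstract's premise is that inter-speaker relationships in the i-vector space are non-linear, which is the motivation for a kernel method such as KPLS.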

Research Areas: AI & Machine Learning, Audio