Kernel partial least squares framework for speaker recognition

I-vectors are a concise representation of speaker characteristics. Recent advances in speaker recognition have exploited their ability to capture speaker and channel variability to build efficient recognition engines. Inter-speaker relationships in the i-vector space are non-linear. Effective speaker recognition requires good modeling of these non-linearities and can be cast as a machine learning problem. In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vector space. The resulting recognition system is tested across several conditions of the NIST SRE 2010 extended core data set and compared against state-of-the-art systems: Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), and Cosine Distance Scoring (CDS) classifiers. Improvements over these baselines are shown.
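
To make the idea of kernel PLS concrete, the following is a minimal sketch of one common formulation (kernel NIPALS with a single response and an RBF kernel), not the implementation evaluated in the paper. The toy 2-D rings stand in for i-vectors, and every name here (`rbf_kernel`, `kpls_fit`, `kpls_score`, `toy_rings`), the kernel width, and the number of components are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpls_fit(K, y, n_components=2):
    """Fit kernel PLS for one response vector y (e.g. target vs. non-target)."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = J @ K @ J                             # kernel centered in feature space
    Y = (y - y.mean()).reshape(n, 1)
    Kd, Yd = Kc.copy(), Y.copy()               # deflated copies
    T = np.zeros((n, n_components))
    U = np.zeros((n, n_components))
    for a in range(n_components):
        # With a single response, the NIPALS inner loop converges in one step:
        # u is the normalized (deflated) response and t is the kernel score.
        u = Yd[:, 0] / np.linalg.norm(Yd[:, 0])
        t = Kd @ u
        t /= np.linalg.norm(t)
        T[:, a], U[:, a] = t, u
        P = np.eye(n) - np.outer(t, t)         # project out the new component
        Kd = P @ Kd @ P
        Yd = P @ Yd
    # Dual regression coefficients: B = U (T' Kc U)^{-1} T' Y
    B = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ Y)
    return B[:, 0], y.mean()

def kpls_score(K_test, K_train, coef, y_mean):
    """Score test points given K_test (m x n) and the uncentered K_train (n x n)."""
    n = K_train.shape[0]
    ones_mn = np.ones((K_test.shape[0], n)) / n
    J = np.eye(n) - np.ones((n, n)) / n
    Kt = (K_test - ones_mn @ K_train) @ J      # center test kernel consistently
    return Kt @ coef + y_mean

def toy_rings(rng, n, radius, noise=0.1):
    """Hypothetical stand-in for i-vectors: noisy points on a circle."""
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    pts = radius * np.column_stack([np.cos(theta), np.sin(theta)])
    return pts + noise * rng.standard_normal((n, 2))

rng = np.random.default_rng(0)
# "Target speaker" on the inner ring, "non-target" on the outer ring:
# the two classes are not linearly separable in the input space.
X_train = np.vstack([toy_rings(rng, 40, 1.0), toy_rings(rng, 40, 3.0)])
y_train = np.r_[np.ones(40), -np.ones(40)]
K_train = rbf_kernel(X_train, X_train)
coef, y_mean = kpls_fit(K_train, y_train, n_components=2)

X_test = np.vstack([toy_rings(rng, 10, 1.0), toy_rings(rng, 10, 3.0)])
scores = kpls_score(rbf_kernel(X_test, X_train), K_train, coef, y_mean)
print("mean target score:", scores[:10].mean())
print("mean non-target score:", scores[10:].mean())
```

In a one-vs-rest speaker setup, a score like this would be thresholded to accept or reject a trial. How speakers are enrolled and how the kernel is chosen in the actual system is not stated in this abstract.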

12th Annual Conference of the International Speech Communication Association (INTERSPEECH)

Publication date: August 27, 2011

Balaji Vasan Srinivasan, D. Garcia-Romero, D. Zotkin, R. Duraiswami

Research Areas: AI & Machine Learning, Audio