A symmetric kernel partial least squares framework for speaker recognition

IEEE Transactions on Audio, Speech and Language Processing

Published March 20, 2013

Balaji Vasan Srinivasan, Y. Luo, D. Garcia-Romero, D. Zotkin, R. Duraiswami

I-vectors are concise representations of speaker characteristics. Recent progress in i-vectors related research has utilized their ability to capture speaker and channel variability to develop efficient automatic speaker verification (ASV) systems. Inter-speaker relationships in the i-vector space are non-linear. Accomplishing effective speaker verification requires a good modeling of these non-linearities and can be cast as a machine learning problem. Kernel partial least squares (KPLS) can be used for discriminative training in the i-vector space. However, this framework suffers from training data imbalance and asymmetric scoring. We use “one shot similarity scoring” (OSS) to address this. The resulting ASV system (OSS-KPLS) is tested across several conditions of the NIST SRE 2010 extended core data set and compared against state-of-the-art systems: Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), and Cosine Distance Scoring (CDS) classifiers. Improvements are shown.

Learn More

Research Areas:  AI & Machine Learning Audio