I-vector transformation using k-nearest neighbors for speaker verification

Author
Khan, U.; India, M.; Hernando, J.
Type of activity
Presentation of work at congresses
Name of edition
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing
Date of publication
2020
Presentation's date
2020-05-08
Book of congress proceedings
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing: proceedings: May 4-8, 2020: Centre de Convencions Internacional de Barcelona (CCIB) Barcelona, Spain
First page
7574
Last page
7578
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
DOI
10.1109/ICASSP40776.2020.9053504
Project funding
Deep learning technologies for speech and audio processing
Repository
http://hdl.handle.net/2117/192485
URL
https://ieeexplore.ieee.org/abstract/document/9053504
Abstract
Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data, which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids speaker labels at the cost of degraded performance. In this work, we propose a post-processing of i-vectors using a Deep Neural Network (DNN) to transform i-vectors into a new speaker vector representation. The DNN will be trained using i-vectors that are similar to the tra...
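As a rough illustration of the two building blocks named in the abstract and title, the sketch below computes a cosine score between two vectors and picks the k vectors in a pool that are most similar to a query. It is not the paper's method: the abstract is truncated here, so the DNN and its training scheme are left out, and the function names `cosine_score` and `k_nearest_by_cosine`, along with the toy 2-dimensional vectors, are assumptions made for illustration only.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest_by_cosine(query, pool, k):
    """Indices of the k pool vectors with the highest cosine score to `query`.

    Assumption: similarity is measured with cosine scoring; the record does
    not state which metric the paper uses for neighbor selection.
    """
    scored = [(cosine_score(query, vec), idx) for idx, vec in enumerate(pool)]
    # Python's sort is stable, so ties keep their original pool order.
    scored.sort(key=lambda pair: -pair[0])
    return [idx for _, idx in scored[:k]]


# Toy stand-ins for i-vectors (real i-vectors have hundreds of dimensions).
pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
neighbors = k_nearest_by_cosine([1.0, 0.0], pool, k=2)
```

Neither function uses speaker labels, which is the property the abstract attributes to cosine scoring.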
Citation
Khan, U.; India, M.; Hernando, J. I-vector transformation using k-nearest neighbors for speaker verification. In: IEEE International Conference on Acoustics, Speech, and Signal Processing. "2020 IEEE International Conference on Acoustics, Speech, and Signal Processing: proceedings: May 4-8, 2020: Centre de Convencions Internacional de Barcelona (CCIB) Barcelona, Spain". Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 7574-7578.
Keywords
Deep learning, I-vectors, K-nearest neighbors, Speaker verification
Group of research
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications
VEU - Speech Processing Group

Participants