Deep learning backend for single and multisession i-vector speaker recognition

Author
Ghahabi, O.; Hernando, J.
Type of activity
Journal article
Journal
IEEE-ACM Transactions on Audio Speech and Language Processing
Date of publication
2017-04-01
Volume
25
Number
4
First page
807
Last page
817
DOI
https://doi.org/10.1109/TASLP.2017.2661705
Repository
http://hdl.handle.net/2117/104282
URL
http://ieeexplore.ieee.org/document/7847321/?reload=true
Abstract
The lack of labeled background data creates a large performance gap between the cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring baseline techniques for i-vectors in speaker recognition. Although some unsupervised clustering techniques exist to estimate the labels, they cannot accurately predict the true labels, and they also assume that the background data contain several samples from the same speaker, which may not be true in practice. In this paper, the authors make use of ...
Citation
Ghahabi, O., Hernando, J. Deep learning backend for single and multisession i-vector speaker recognition. "IEEE-ACM Transactions on Audio Speech and Language Processing", 1 April 2017, vol. 25, no. 4, pp. 807-817.
Keywords
Deep belief network, Deep learning, Deep neural network, I-vector, Speaker recognition
Research groups
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications
VEU - Speech Processing Group
