Self multi-head attention for speaker recognition

Author
India, M.; Safari, P.; Hernando, J.
Type of activity
Presentation of work at congresses
Name of edition
20th Annual Conference of the International Speech Communication Association
Date of publication
2019
Presentation's date
2019-09-16
Book of congress proceedings
Interspeech 2019: the 20th Annual Conference of the International Speech Communication Association: 15-19 September 2019: Graz, Austria
First page
4305
Last page
4309
Publisher
International Speech Communication Association (ISCA)
DOI
10.21437/Interspeech.2019-2616
Project funding
Deep learning technologies for speech and audio processing
Repository
http://hdl.handle.net/2117/178623
URL
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2616.pdf
Abstract
Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a short utterance level. Given the speech signal, these algorithms extract a sequence of speaker embeddings from short segments and those are averaged to obtain an utterance level speaker representation. In this work we propose the use of an attention mechanism to obtain a discriminative speaker embedding given non fixed length speech utterances. Our system is based on a Convolutional Neural Network (CNN) that enco...
Citation
India, M.; Safari, P.; Hernando, J. Self multi-head attention for speaker recognition. In: Annual Conference of the International Speech Communication Association. "Interspeech 2019: the 20th Annual Conference of the International Speech Communication Association: 15-19 September 2019: Graz, Austria". Baixas: International Speech Communication Association (ISCA), 2019, p. 4305-4309.
Keywords
Attention models, Multi-head self attention, Speaker embeddings, Speaker verification
Group of research
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications
VEU - Speech Processing Group