Advanced deep learning architectures for speech, audio and language processing

Total activity: 10
Type of activity
Competitive project
Acronym
ADAVOICE
Funding entity
AGENCIA ESTATAL DE INVESTIGACION
Funding entity code
PID2019-107579RB-I00
Amount
271.282,00 €
Start date
2020-06-01
End date
2024-05-31
Keywords
acoustic event detection, deep learning, deep neural networks, machine translation, speaker recognition, speech recognition, speech technology, text to speech
Abstract
Speech, audio and language processing benefit greatly from deep learning architectures, which achieve high levels of performance.
Deep learning is a set of algorithms that learn different levels of abstraction from given data. These algorithms have achieved
great success in supervised settings, that is, settings in which labelled data is available for training. Deep learning algorithms
typically need large quantities of labelled data to perform a task. Different architectures are combined and
concatenated depending on the goal of the task. These can include recurrent neural networks, which excel at modeling variable-length
sequences, and convolutional neural networks, which have typically been used to extract patterns from images. However, more
complex architectures such as the Transformer, which combines attention mechanisms and feed-forward networks, are so versatile that they
succeed in multiple tasks.
The purpose of this project is to address the remaining challenges of advanced deep learning architectures in the context of speech, audio
and language processing, continuing the intense research of our group. The project proposes to tackle major challenges in multilingual and
multimodal machine translation, speaker recognition, natural language processing and speech regeneration.
In machine translation, the project is dedicated to unsupervised and multilingual machine translation. On the one hand, while machine
translation has classically been trained on sentence-level parallel data, it is possible to train using only monolingual data. On
the other hand, multilingual machine translation can be computationally very expensive if handled pairwise. This project proposes to use
language-independent encoders and decoders that can be trained on monolingual data. For this purpose, we need to work towards an
automatically extracted language-independent representation, which has typically been identified as an interlingua. Beyond machine
translation, we propose to address speaker recognition, speech and audio translation in an end-to-end fashion.
Regarding natural language processing, the project is oriented towards two general limitations of deep learning: fairness and
generalization. The high performance of deep learning is overshadowed by issues such as unfair behaviours, which typically arise from
demographic biases (e.g., "she is a doctor", when translated to Turkish and then back-translated to English, becomes "he is a doctor"). These
biases are often learned from the training data and amplified. Similarly, a lack of generalization is observed when our proposed systems only
learn compositions that appear in the training data. One way to tackle these problems is to make proper use of linguistic
information. This project proposes to use dictionaries, dependency trees and semantic resources (e.g., WordNet, BabelNet) to
retrofit deep learning algorithms.
Finally, the work in speech regeneration includes unsupervised, problem-agnostic speech representations, as well as the introduction of generative
adversarial networks to improve the quality of the synthetic voice in voice conversion and speech enhancement applications.
Given the high interest in both academia and industry, this project will share results with, and benefit from, multiple collaborations
with universities and companies, as supported by the corresponding letters of interest.
Scope
State Administration
Plan
PLAN ESTATAL DE INVESTIGACIÓN CIENTÍFICA Y TÉCNICA Y DE INNOVACIÓN 2017-2020
Resolution year
2020
Funding program
PROGRAMA ESTATAL DE I+D+I ORIENTADA A LOS RETOS DE LA SOCIEDAD
Funding call
RETOS DE INVESTIGACIÓN: PROYECTOS DE I+D+I
Grant institution
Agencia Estatal De Investigacion

Participants

Scientific and technological production
