
A machine-learned universal language representation

Total activity: 2
Type of activity
Competitive project
Funding entity
Funding entity code
74.850,00 €
Start date
End date
deep learning, linguistic information, machine translation, universal language representation
Why is machine translation between English and Portuguese significantly better than machine translation between Dutch and Spanish? Why do speech
recognizers work better in German than in Finnish? In both cases, the main problem is an insufficient amount of labelled training data. Although the
world is multimodal and highly multilingual, speech and language technology is not keeping up with demand in all languages. We need better
learning methods that exploit advances in a few modalities and languages for the benefit of the others. This proposal addresses the low-resource
problem and the high cost of multilingual machine translation, which requires a separate system for every translation pair.
AMALEU proposes to jointly learn a multilingual and multimodal model that builds upon a universal language representation. This model will compensate
for the lack of supervised data and, given the unconventional variety of resources employed, significantly increase the system's capacity to generalize
from training data. It will also reduce the number of required translation systems from quadratic to linear in the number of languages, which will
have a high impact in multilingual environments.
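
The quadratic-to-linear reduction can be illustrated with a simple count. The sketch below is not taken from the proposal: it assumes each translation direction needs its own pairwise system, and that the shared-representation design uses one encoder and one decoder per language. The function names are hypothetical.

```python
def pairwise_systems(n_languages: int) -> int:
    """Systems needed when every ordered language pair has its own model (assumption)."""
    return n_languages * (n_languages - 1)


def shared_representation_modules(n_languages: int) -> int:
    """Modules needed with one encoder and one decoder per language (assumption)."""
    return 2 * n_languages


# Compare the two counts as the number of languages grows.
for n in (2, 4, 10, 24):
    print(n, pairwise_systems(n), shared_representation_modules(n))
```

Under these assumptions, 10 languages need 90 pairwise systems but only 20 encoder and decoder modules, and the gap widens as languages are added.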
The high-risk/high-gain aspect of the proposal lies in automatically training a universal language with specifically designed deep learning algorithms.
AMALEU will employ an encoder-decoder architecture. The encoder produces an abstraction of the input by reducing its dimensionality, and this abstraction
will become the proposed universal language; from it, the decoder generates the output. The internal encoder-decoder architecture will be explicitly
designed for learning the universal language, and this learning will be integrated into the architecture as an objective function.
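
The data flow through such an architecture can be sketched as follows. This is only an illustrative toy, not the AMALEU model: the weights are random and untrained, the layers are plain linear maps, and the names (`Encoder`, `Decoder`, `translate`, `LATENT_DIM`) and the feature sizes are all assumptions made for the example.

```python
import random

LATENT_DIM = 2  # assumed size of the shared, lower-dimensional representation


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights (untrained placeholder)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Encoder:
    """Maps language-specific input features down to LATENT_DIM values."""

    def __init__(self, input_dim, rng):
        self.weights = random_matrix(LATENT_DIM, input_dim, rng)

    def __call__(self, features):
        return matvec(self.weights, features)


class Decoder:
    """Maps the shared representation back up to language-specific output features."""

    def __init__(self, output_dim, rng):
        self.weights = random_matrix(output_dim, LATENT_DIM, rng)

    def __call__(self, latent):
        return matvec(self.weights, latent)


rng = random.Random(0)
# One encoder and one decoder per language; feature sizes are arbitrary.
encoders = {"nl": Encoder(6, rng), "es": Encoder(5, rng)}
decoders = {"nl": Decoder(6, rng), "es": Decoder(5, rng)}


def translate(features, source, target):
    """Encode with the source-language encoder, decode with the target-language decoder."""
    latent = encoders[source](features)
    return decoders[target](latent)


dutch_input = [1.0, 0.0, 0.5, 0.0, 0.0, 1.0]
spanish_output = translate(dutch_input, "nl", "es")
```

Because every encoder writes to the same representation and every decoder reads from it, any source language can be paired with any target language without a dedicated pairwise system. Training the weights so that the shared representation is actually language-independent is the part the proposal identifies as high-risk.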
AMALEU will impact highly multidisciplinary communities of specialists in computer science, mathematics, engineering and linguistics who work on
natural language understanding and on natural language and speech processing applications.
Adm. Estat
Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020
Call year
Funding program
Programa Estatal de Fomento de la Investigación Científica y Técnica de Excelencia
Funding subprogram
Subprograma Estatal de Generación de Conocimiento
Funding call
Acciones de dinamización "Proyectos Europa Excelencia"
Grant institution
Agencia Estatal de Investigación


Scientific and technological production
