A large Spanish-Catalan parallel corpus release for machine translation

Authors
Ruiz, M.; Fonollosa, José A. R.; Mariño, J.B.; Poch, M.; Farrus, M.
Activity type
Journal article
Journal
Computing and Informatics
Publication date
2014-01-01
Volume
33
Issue
4
First page
907
Last page
920
URL
http://cai.type.sk/content/2014/4/a-large-spanish-catalan-parallel-corpus-release-for-machine-translation/
Abstract
We present a large Spanish-Catalan parallel corpus extracted from ten years of the paper edition of a bilingual Catalan newspaper. The produced corpus of 7.5 M parallel sentences (around 180 M words per language) is useful for many natural language applications. We report excellent results when building a statistical machine translation system trained on this parallel corpus. The Spanish-Catalan corpus is partially available via ELDA (Evaluations and Language Resources Distribution Agency) in ca...
Keywords
Catalan-Spanish parallel corpus, machine translation
Research groups
IDEAI-UPC Intelligent Data Science and Artificial Intelligence
TALP - Centre de Tecnologies i Aplicacions del Llenguatge i la Parla
VEU - Grup de Tractament de la Parla

Participants