TY  - MGZN
AU  - Costa-jussà, Marta R.
AU  - Fonollosa, José A. R.
AU  - Mariño, J.B.
AU  - Poch, M.
AU  - Farrus, M.
T2  - Computing and informatics
Y1  - 2014
VL  - 33
IS  - 4
SP  - 907
EP  - 920
UR  - http://cai.type.sk/content/2014/4/a-large-spanish-catalan-parallel-corpus-release-for-machine-translation/
AB  - We present a large Spanish-Catalan parallel corpus extracted from ten years of the paper edition of a bilingual Catalan newspaper. The produced corpus of 7.5 M parallel sentences (around 180 M words per language) is useful for many natural language applications. We report excellent results when building a statistical machine translation system trained on this parallel corpus. The Spanish-Catalan corpus is partially available via ELDA (Evaluations and Language Resources Distribution Agency) in catalog number ELRA-W0053.
TI  - A large Spanish-Catalan parallel corpus release for machine translation
ER  - 