Dealing with input noise in statistical machine translation

Author
Formiga, L.; Fonollosa, José A. R.
Type of activity
Presentation of work at congresses
Name of edition
24th International Conference on Computational Linguistics
Date of publication
2012
Presentation's date
2012-12-13
Book of congress proceedings
Proceedings of COLING 2012: Technical Papers : 8-15 December 2012, Mumbai, India
First page
319
Last page
328
Project funding
BUCEADOR
Feedback Analysis for User adaptive Statistical Translation
Repository
http://hdl.handle.net/2117/18279
URL
http://aclweb.org/anthology-new/C/C12/C12-2032.pdf
Abstract
Misspelled words have a direct impact on the final quality obtained by Statistical Machine Translation (SMT) systems, as the input becomes noisy and unpredictable. This paper presents strategies for improving the translation of real-life noisy input. The proposed strategies are based on a preprocessing step consisting of a character-based machine translator (MT) that converts noisy text into clean text. The use of a character-level translator allows us to provide various spelling alternatives in a lattice format to ...
Citation
Formiga, L.; Fonollosa, José A. R. Dealing with input noise in statistical machine translation. In: International Conference on Computational Linguistics. "Proceedings of COLING 2012: Technical Papers : 8-15 December 2012, Mumbai, India". Mumbai: 2012, p. 319-328.
Group of research
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications
VEU - Speech Processing Group
