Scientific and technological production

  • FAUST. Open source release of the final Asiya Suite

     Gonzalez Bermudez, Meritxell; Mascarell, Laura; Màrquez Villodre, Lluís
    Date: 2013-02-18
    Report

  • Selectional preferences for semantic role classification

     Zapirain, Beñat; Agirre, Eneko; Màrquez Villodre, Lluís; Surdeanu, Mihai
    COMPUTATIONAL LINGUISTICS
    Date of publication: 2013-09
    Journal article

    This paper focuses on a well-known open issue in Semantic Role Classification (SRC) research: the limited influence and sparseness of lexical features. We mitigate this problem using models that integrate automatically learned selectional preferences (SP). We explore a range of models based on WordNet and distributional-similarity SPs. Furthermore, we demonstrate that the SRC task is better modeled by SP models centered on both verbs and prepositions, rather than verbs alone. Our experiments with SP-based models in isolation indicate that they outperform a lexical baseline by 20 F1 points in domain and almost 40 F1 points out of domain. Furthermore, we show that a state-of-the-art SRC system extended with features based on selectional preferences performs significantly better, both in domain (17% error reduction) and out of domain (13% error reduction). Finally, we show that in an end-to-end semantic role labeling system we obtain small but statistically significant improvements, even though our modified SRC model affects only approximately 4% of the argument candidates. Our post hoc error analysis indicates that the SP-based features help mostly in situations where syntactic information is either incorrect or insufficient to disambiguate the correct role.

  • Coreference resolution: an empirical study based on SemEval-2010 shared Task 1

     Màrquez Villodre, Lluís; Recasens Ruiz, Marta; Sapena Masip, Emilio
    Language resources and evaluation
    Date of publication: 2013
    Journal article

    This paper presents an empirical evaluation of coreference resolution that covers several interrelated dimensions. The main goal is to complete the comparative analysis from the SemEval-2010 task on Coreference Resolution in Multiple Languages. To do so, the study restricts the number of languages and systems involved, but extends and deepens the analysis of the system outputs, including a more qualitative discussion. The paper compares three automatic coreference resolution systems for three languages (English, Catalan and Spanish) in four evaluation settings, and using four evaluation measures. Given that our main goal is not to provide a comparison between resolution algorithms, these are merely used as tools to shed light on the different conditions under which coreference resolution is evaluated. Although the dimensions are strongly interdependent, making it very difficult to extract general principles, the study reveals a series of interesting issues in relation to coreference resolution: the portability of systems across languages, the influence of the type and quality of input annotations, and the behavior of the scoring measures.

  • Joint arc-factored parsing of syntactic and semantic dependencies

     Lluis Martorell, Xavier; Carreras Perez, Xavier; Màrquez Villodre, Lluís
    Transactions of the Association for Computational Linguistics (TACL)
    Date of publication: 2013-05
    Journal article

    In this paper we introduce a joint arc-factored model for syntactic and semantic dependency parsing. The semantic role labeler predicts the full syntactic paths that connect predicates with their arguments. This process is framed as a linear assignment task, which allows us to control some well-formedness constraints. For the syntactic part, we define a standard arc-factored dependency model that predicts the full syntactic tree. Finally, we employ dual decomposition techniques to produce consistent syntactic and predicate-argument structures while searching over a large space of syntactic configurations. In experiments on the CoNLL-2009 English benchmark we observe very competitive results.

  • MOLTO. Patent MT and retrieval. Final report.

     Mateva, Maria; Gonzalez Bermudez, Meritxell; Enache, Ramona; España Bonet, Cristina; Màrquez Villodre, Lluís; Popov, Borislav; Ranta, Aarne
    Date: 2013-03-01
    Report

  • MOLTO. Final report: statistical and robust MT

     España Bonet, Cristina; Enache, Ramona; Angelov, Krasimir; Virk, Shafqat; Galgoczy, Érzsebet; Gonzalez Bermudez, Meritxell; Ranta, Aarne; Màrquez Villodre, Lluís
    Date: 2013-04-08
    Report

  • Traducción automática en contexto y aumentada con recursos dinámicos de Internet UPC (TACARDI)

     Màrquez Villodre, Lluís; Castell Ariño, Nuria
    Participation in a competitive project

  • TACARDI: Traducción Automática en Contexto y Aumentada con Recursos Dinámicos de Internet

     Màrquez Villodre, Lluís; Castell Ariño, Nuria
    Participation in a competitive project

  • tSEARCH: flexible and fast search over automatic translations for improved quality/error analysis (open access)

     Gonzalez Bermudez, Meritxell; Mascarell, Laura; Màrquez Villodre, Lluís
    Annual Meeting of the Association for Computational Linguistics
    Presentation's date: 2013-08-06
    Presentation of work at congresses

    This work presents tSEARCH, a web-based application that provides mechanisms for doing complex searches over a collection of translation cases evaluated with a large set of diverse measures. tSEARCH uses the evaluation results obtained with the ASIYA toolkit for MT evaluation and is connected to its on-line GUI, which enables graphical visualization of, and interactive access to, the evaluation results. The search engine offers a flexible query language for finding translation examples that match a combination of numerical and structural features associated with the calculation of the quality metrics. Its database design permits a fast response time for all queries supported on realistic-size test beds. In summary, tSEARCH, used with ASIYA, offers developers of MT systems and evaluation metrics a powerful tool for helping translation and error analysis.

  • Real-life translation quality estimation for MT system selection (open access)

     Formiga Fanals, Lluis; Màrquez Villodre, Lluís; Pujantell Traserra, Jaume
    Machine Translation Summit
    Presentation's date: 2013-09-04
    Presentation of work at congresses

    Research on translation quality annotation and estimation usually makes use of standard language, sometimes related to a specific language genre or domain. However, real-life machine translation (MT), performed for instance by on-line translation services, has to cope with some extra difficulties related to the usage of open, non-standard and noisy language. In this paper we study the learning of quality estimation (QE) models able to rank translations from real-life input according to their goodness without the need of translation references. For that, we work with a corpus collected from the 24/7 Reverso.net MT service, translated by 5 different MT systems, and manually annotated with quality scores. We define several families of features and train QE predictors in the form of regressors or direct rankers. The predictors show a remarkable correlation with gold standard rankings and prove to be useful in a system combination scenario, obtaining better results than any individual translation system.

  • MT techniques in a retrieval system of semantically enriched patents (open access)

     Gonzalez Bermudez, Meritxell; Mateva, Maria; Enache, Ramona; España Bonet, Cristina; Màrquez Villodre, Lluís; Popov, Borislav; Ranta, Aarne
    Machine Translation Summit
    Presentation's date: 2013-09-04
    Presentation of work at congresses

    This paper focuses on how automatic translation techniques integrated in a patent retrieval system increase its capabilities and make possible extended features and functionalities. We describe 1) a novel methodology for natural language to SPARQL translation based on a grammar–ontology interoperability automation and a query grammar for the patents domain; 2) a devised strategy for statistical-based translation of patents that allows semantic annotations to be transferred to the target language; 3) a built-in knowledge representation infrastructure that uses multilingual semantic annotations; and 4) an online application that offers a multilingual search interface over structural knowledge databases (domain ontologies) and multilingual documents (biomedical patents) that have been automatically translated.

  • Identifying useful human correction feedback from an on-line machine translation service (open access)

     Barron Cedeño, Luis Alberto; Màrquez Villodre, Lluís; Henriquez, Carlos A; Formiga Fanals, Lluis; Romero Merino, Enrique; May, Jonathan
    International Joint Conference on Artificial Intelligence
    Presentation's date: 2013-08
    Presentation of work at congresses

    Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual users is very noisy, and must be automatically filtered in order to identify the potentially useful cases. We present a study on automatic feedback filtering in a real weblog collected from Reverso.net. We extend and re-annotate a training corpus, define an extended set of simple features and approach the problem as a binary classification task, experimenting with linear and kernel-based classifiers and feature selection. Results on the feedback filtering task show a significant improvement over the majority class, but also a precision ceiling around 70-80%. This reflects the inherent difficulty of the problem and indicates that shallow features cannot fully capture the semantic nature of the problem. Despite the modest results on the filtering task, the classifiers are proven effective in an application-based evaluation. The incorporation of a filtered set of feedback instances selected from a larger corpus significantly improves the performance of a phrase-based SMT system, according to a set of standard evaluation metrics.

  • The TALP-UPC phrase-based translation systems for WMT13: system combination with morphology generation, domain adaptation and corpus filtering

     Formiga Fanals, Lluis; Ruiz Costa-Jussà, Marta; Mariño Acebal, Jose Bernardo; Rodríguez Fonollosa, José Adrián; Barron Cedeño, Luis Alberto; Màrquez Villodre, Lluís
    Workshop on Statistical Machine Translation
    Presentation's date: 2013-08-08
    Presentation of work at congresses

    This paper describes the TALP participation in the WMT13 evaluation campaign. Our participation is based on the combination of several statistical machine translation systems built on standard phrase-based Moses systems. Variations include techniques such as morphology generation, training sentence filtering, and domain adaptation through unit derivation. The results show a coherent improvement on TER, METEOR, NIST, and BLEU scores when compared to our baseline system.

  • UPC-CORE: What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity? (open access)

     Barron Cedeño, Luis Alberto; Màrquez Villodre, Lluís; Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Turmo Borras, Jorge
    Joint Conference on Lexical and Computational Semantics
    Presentation's date: 2013-06-13
    Presentation of work at congresses

    In this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation, originally intended to evaluate automatic against reference translations and (ii) an instance of explicit semantic analysis, built upon opening paragraphs of Wikipedia 2010 articles. Our similarity estimator relies on a support vector regressor with RBF kernel. Our best approach required 13 machine translation metrics + explicit semantic analysis and ranked 65th in the competition. Our post-competition analysis shows that the features have a good expression level, but overfitting and, mainly, normalization issues caused our correlation values to decrease.

  • The TALP-UPC approach to system selection: ASIYA features and pairwise classification using random forests

     Formiga Fanals, Lluis; Gonzalez Bermudez, Meritxell; Barron Cedeño, Luis Alberto; Rodríguez Fonollosa, José Adrián; Màrquez Villodre, Lluís
    Workshop on Statistical Machine Translation
    Presentation's date: 2013-08-08
    Presentation of work at congresses

    This paper describes the TALP-UPC participation in the WMT’13 Shared Task on Quality Estimation (QE). Our participation is restricted to task 1.2 on System Selection. We used a broad set of features (86 for German-to-English and 97 for English-to-Spanish) ranging from standard QE features to features based on pseudo-references and semantic similarity. We approached system selection by means of pairwise ranking decisions. For that, we learned Random Forest classifiers especially tailored for the problem. Evaluation at development time showed considerably good results in a cross-validation experiment, with Kendall’s tau values around 0.30. The results on the test set dropped significantly, raising several issues to be taken into account.

  • Identifying useful human feedback from an on-line translation service (open access)

     Barron Cedeño, Luis Alberto; Màrquez Villodre, Lluís; Henriquez Quintana, Carlos Alberto; Formiga Fanals, Lluis; Romero Merino, Enrique; May, Jonathan
    International Joint Conference on Artificial Intelligence
    Presentation's date: 2013-08-07
    Presentation of work at congresses

    Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual users is very noisy, and must be automatically filtered in order to identify the potentially useful cases. We present a study on automatic feedback filtering in a real weblog collected from Reverso.net. We extend and re-annotate a training corpus, define an extended set of simple features and approach the problem as a binary classification task, experimenting with linear and kernel-based classifiers and feature selection. Results on the feedback filtering task show a significant improvement over the majority class, but also a precision ceiling around 70-80%. This reflects the inherent difficulty of the problem and indicates that shallow features cannot fully capture the semantic nature of the problem. Despite the modest results on the filtering task, the classifiers are proven effective in an application-based evaluation. The incorporation of a filtered set of feedback instances selected from a larger corpus significantly improves the performance of a phrase-based SMT system, according to a set of standard evaluation metrics.

  • Sibyl, a factoid question answering system for spoken documents

     Comas Umbert, Pere Ramon; Turmo Borras, Jorge; Màrquez Villodre, Lluís
    ACM transactions on information systems
    Date of publication: 2012
    Journal article

    In this article, we present a factoid question-answering system, Sibyl, specifically tailored for question answering (QA) on spoken-word documents. This work explores, for the first time, which techniques can be robustly adapted from the usual QA on written documents to the more difficult spoken document scenario. More specifically, we study new information retrieval (IR) techniques designed for speech, and utilize several levels of linguistic information for the speech-based QA task. These include named-entity detection with phonetic information, syntactic parsing applied to speech transcripts, and the use of coreference resolution. Sibyl is largely based on supervised machine-learning techniques, with special focus on the answer extraction step, and makes little use of handcrafted knowledge. Consequently, it should be easily adaptable to other domains and languages. Sibyl and all its modules are extensively evaluated on the European Parliament Plenary Sessions English corpus, comparing manual with automatic transcripts obtained by three different automatic speech recognition (ASR) systems that exhibit significantly different word error rates. This data belongs to the CLEF 2009 track for QA on speech transcripts. The main results confirm that syntactic information is very useful for learning to rank question candidates, improving results on both manual and automatic transcripts, unless the ASR quality is very low. At the same time, our experiments on coreference resolution reveal that the state-of-the-art technology is not mature enough to be effectively exploited for QA with spoken documents. Overall, the performance of Sibyl is comparable to or better than the state of the art on this corpus, confirming the validity of our approach.

  • Special issue on statistical learning of natural language structured input and output

     Màrquez Villodre, Lluís; Moschitti, Alessandro
    Natural language engineering (Print)
    Date of publication: 2012-04
    Journal article

  • MOLTO. Patent MT and retrieval prototype

     Gonzalez Bermudez, Meritxell; Chechev, Milen; Damova, Mariana; Enache, Ramona; España Bonet, Cristina; Màrquez Villodre, Lluís; Mateva, Maria; Ranta, Aarne; Tolosi, Laura
    Date: 2012-09-01
    Report

  • ERCIM fellowships: Alain Bensoussan Fellowship programme (grant for Luís Alberto Barrón Cedeño)

     Larrosa Bondia, Francisco Javier; Màrquez Villodre, Lluís
    Participation in a competitive project

  • Deep evaluation of hybrid architectures: use of different metrics in MERT weight optimization (open access)

     España Bonet, Cristina; Labaka, Gorka; Díaz de Ilarraza Sánchez, Arantza; Màrquez Villodre, Lluís; Sarasola, Kepa
    Free/Open-Source Rule-Based Machine Translation
    Presentation's date: 2012-06-14
    Presentation of work at congresses

    The process of developing hybrid MT systems is usually guided by an evaluation method used to compare different combinations of basic subsystems. This work presents a deep evaluation experiment of a hybrid architecture, which combines rule-based and statistical translation approaches. Differences between the results obtained from automatic and human evaluations corroborate the inappropriateness of pure lexical automatic evaluation metrics to compare the outputs of systems that use very different translation approaches. An examination of sentences with controversial results suggested that linguistic well-formedness should be considered in the evaluation of output translations. Following this idea, we have experimented with a new simple automatic evaluation metric, which combines lexical and PoS information. This measure showed higher agreement with human assessments than BLEU in a previous study (Labaka et al., 2011). In this paper we have extended its usage throughout the system development cycle, focusing on its ability to improve parameter optimization. Results are not totally conclusive. Manual evaluation reflects a slight improvement, compared to BLEU, when using the proposed measure in system optimization. However, the improvement is too small to draw any clear conclusion. We believe that we should first focus on integrating more linguistically representative features in the developing of the hybrid system, and then go deeper into the development of automatic evaluation metrics.

  • A graphical interface for MT evaluation and error analysis (open access)

     Gonzalez Bermudez, Meritxell; Giménez Lucas, Judit; Màrquez Villodre, Lluís
    Annual Meeting of the Association for Computational Linguistics
    Presentation's date: 2012-07-10
    Presentation of work at congresses

    Error analysis in machine translation is a necessary step in order to investigate the strengths and weaknesses of the MT systems under development and allow fair comparisons among them. This work presents an application that shows how a set of heterogeneous automatic metrics can be used to evaluate a test bed of automatic translations. To do so, we have set up an online graphical interface for the ASIYA toolkit, a rich repository of evaluation measures working at different linguistic levels. The current implementation of the interface shows constituency and dependency trees as well as shallow syntactic and semantic annotations, and word alignments. The intelligent visualization of the linguistic structures used by the metrics, as well as a set of navigational functionalities, may lead towards advanced methods for automatic error analysis.

  • A graph-based strategy to streamline translation quality assessments  Open access

     Pighin, Daniele; Formiga Fanals, Lluis; Màrquez Villodre, Lluís
    Conference of the Association for Machine Translation in the Americas
    Presentation's date: 2012-10-29
    Presentation of work at congresses

    We present a detailed analysis of a graph-based annotation strategy that we employed to annotate a corpus of 11,292 real-world English to Spanish automatic translations with relative (ranking) and absolute (adequate/non-adequate) quality assessments. The proposed approach, inspired by previous work in Interactive Evolutionary Computation and Interactive Genetic Algorithms, results in a simpler and faster annotation process. We empirically compare the method against a traditional, explicit ranking approach, and show that the graph-based strategy: 1) is considerably faster, and 2) produces consistently more reliable annotations.

  • Context-aware machine translation for software localization  Open access

     Muntés Mulero, Víctor; Paladini Adell, Patricia; España Bonet, Cristina; Màrquez Villodre, Lluís
    Conference of the European Association for Machine Translation
    Presentation's date: 2012-05-28
    Presentation of work at congresses

    Software localization requires translating short text strings appearing in user interfaces (UI) into several languages. These strings are usually unrelated to the other strings in the UI. Due to the lack of semantic context, many ambiguity problems cannot be solved during translation. However, UIs are composed of several visual components with which text strings are associated. Although this association might be very valuable for word disambiguation, it has not been exploited. In this paper, we present the problem of lack of context awareness in UI localization, providing real examples and identifying the main research challenges.

  • The patents retrieval prototype in the MOLTO project

     Chechev, Milen; Gonzalez Bermudez, Meritxell; Màrquez Villodre, Lluís; España Bonet, Cristina
    International World Wide Web Conference
    Presentation's date: 2012-04-16
    Presentation of work at congresses

    This paper describes the patents retrieval prototype developed within the MOLTO project. The prototype aims to provide a multilingual natural language interface for querying the content of patent documents. The developed system is focused on the biomedical and pharmaceutical domain and includes the translation of the patent claims and abstracts into English, French and German. To obtain the best retrieval results for patent information and text content, patent documents are preprocessed and semantically annotated. Then, the annotations are stored and indexed in an OWLIM semantic repository, which contains a patent-specific ontology and others from different domains. The prototype, accessible online at http://molto-patents.ontotext.com, presents a multilingual natural language interface to query the retrieval system. In MOLTO, the multilingualism of the queries is addressed by means of the GF Tool, which provides an easy way to build and maintain controlled language grammars for interlingual translation in limited domains. The abstract representation obtained from GF is used to retrieve both the matched RDF instances and the list of patents semantically related to the user's search criteria. The online interface allows users to browse the retrieved patents and shows, in the text, the semantic annotations that explain why a particular patent matched the user's criteria.

  • Factoid Question Answering for Spoken Documents  Open access

     Comas Umbert, Pere Ramon
    Defense's date: 2012-06-12
    Department of Computer Science, Universitat Politècnica de Catalunya
    Theses

    In this dissertation, we present a factoid question answering system, specifically tailored for Question Answering (QA) on spoken documents. This work explores, for the first time, which techniques can be robustly adapted from the usual QA on written documents to the more difficult spoken documents scenario. More specifically, we study new information retrieval (IR) techniques designed for speech, and utilize several levels of linguistic information for the speech-based QA task. These include named-entity detection with phonetic information, syntactic parsing applied to speech transcripts, and the use of coreference resolution. Our approach is largely based on supervised machine learning techniques, with special focus on the answer extraction step, and makes little use of handcrafted knowledge. Consequently, it should be easily adaptable to other domains and languages. In the work resulting from this thesis, we have promoted and coordinated the creation of an evaluation framework for the task of QA on spoken documents. The framework, named QAst, provides multilingual corpora, evaluation questions, and answer keys. These corpora have been used in the QAst evaluation that was held in the CLEF workshop for the years 2007, 2008 and 2009, thus helping the development of state-of-the-art techniques for this particular topic. The presented QA system and all its modules are extensively evaluated on the European Parliament Plenary Sessions English corpus, composed of manual transcripts and automatic transcripts obtained by three different Automatic Speech Recognition (ASR) systems that exhibit significantly different word error rates. This data belongs to the CLEF 2009 track for QA on speech transcripts. The main results confirm that syntactic information is very useful for learning to rank question candidates, improving results on both manual and automatic transcripts unless the ASR quality is very low. Overall, the performance of our system is comparable to or better than the state of the art on this corpus, confirming the validity of our approach.

    In this thesis, we present a factoid Question Answering (QA) system specifically tuned to work with spoken documents. During its development we explore, for the first time, which of the techniques commonly used in QA on written documents are robust enough to work in the more difficult scenario of spoken documents. More specifically, we study new Information Retrieval (IR) methods designed to deal with speech, and we use several levels of linguistic information. These include named-entity detection using phonetic information, syntactic parsing applied to speech transcripts, and the use of a subsystem for coreference detection and resolution. Our approach relies largely on supervised machine learning techniques, focused especially on the answer extraction step, and uses as little human-created knowledge as possible. Consequently, the whole QA process can be adapted to other domains or languages with relative ease. One additional outcome of the work behind this thesis is that we have promoted and coordinated the creation of an evaluation framework for the task of QA on spoken documents. This framework, named QAst (Question Answering on Speech Transcripts), provides a multilingual corpus of spoken documents, sets of evaluation questions, and their correct answers. These data were used in the QAst evaluations held within the CLEF conferences in 2007, 2008 and 2009, thus promoting and supporting the creation of state-of-the-art techniques for this particular problem. The QA system we present and all of its submodules have been evaluated extensively on the English EPPS corpus (transcripts of the European Parliament Plenary Sessions), which contains manual transcripts of all the speeches as well as automatic transcripts obtained with three different automatic speech recognizers (ASR). The recognizers have different characteristics and results, which allows a quantitative and qualitative evaluation of the task. These data belong to the 2009 QAst evaluation. The main results of our work confirm that syntactic information is very useful for automatically learning to assess the plausibility of candidate answers, improving on previous results on both manual and automatic transcripts, unless the ASR quality is very low. Overall, the performance of our system is comparable to or better than that of other state-of-the-art systems, confirming the validity of our approach.

  • The UPC submission to the WMT 2012 shared task on quality estimation  Open access

     Pighin, Daniele; Gonzalez Bermudez, Meritxell; Màrquez Villodre, Lluís
    Workshop on Statistical Machine Translation
    Presentation's date: 2012-06-07
    Presentation of work at congresses

    In this paper, we describe the UPC system that participated in the WMT 2012 shared task on Quality Estimation for Machine Translation. Based on the empirical evidence that fluency-related features have a very high correlation with post-editing effort, we present a set of features for the assessment of quality estimation for machine translation designed around different kinds of n-gram language models, plus another set of features that model the quality of dependency parses automatically projected from source sentences to translations. We report the results on the shared task dataset, obtained by combining the features that we designed with the baseline features provided by the task organizers.

  • A hybrid system for patent translation  Open access

     Enache, Ramona; España Bonet, Cristina; Ranta, Aarne; Màrquez Villodre, Lluís
    Conference of the European Association for Machine Translation
    Presentation's date: 2012-05-30
    Presentation of work at congresses

    This work presents a hybrid machine translation (HMT) system for patent translation. The system exploits the high coverage of SMT and the high precision of an RBMT system based on GF to deal with specific issues of the language. The translator is specifically developed to translate patents and it is evaluated on the English-French language pair. Although the number of issues tackled by the grammar is not yet large, both manual and automatic evaluations consistently prefer the hybrid system over the two individual translators.

  • MOLTO. Description of the final collection of corpora

     España Bonet, Cristina; Gonzalez Bermudez, Meritxell; Màrquez Villodre, Lluís
    Date: 2011-09-12
    Report

  • MOLTO. Patent MT and retrieval prototype beta

     Chechev, Milen; Enache, Ramona; España Bonet, Cristina; Gonzalez Bermudez, Meritxell; Màrquez Villodre, Lluís; Popov, Borislav; Ranta, Aarne
    Date: 2011-12-01
    Report

  • OpenMT-2 International Workshop on Using Linguistic Information for Hybrid Machine Translation

     Màrquez Villodre, Lluís
    Participation in a competitive project

  • Deep evaluation of hybrid architectures: simple metrics correlated with human judgments  Open access

     Labaka, Gorka; Sarasola, Kepa; Díaz de Ilarraza Sánchez, Arantza; España Bonet, Cristina; Màrquez Villodre, Lluís
    International Workshop on Using Linguistic Information for Hybrid Machine Translation
    Presentation's date: 2011-11-18
    Presentation of work at congresses

    The process of developing hybrid MT systems is guided by the evaluation method used to compare different combinations of basic subsystems. This work presents a deep evaluation experiment of a hybrid architecture that tries to get the best of both worlds, rule-based and statistical. In a first evaluation, human assessments were used to compare only the single statistical system and the hybrid one; the rule-based system was not evaluated by hand because the automatic evaluation showed it at a clear disadvantage. However, a second and wider evaluation experiment surprisingly showed that, according to human evaluation, the best system was the rule-based one, which had achieved the worst results in automatic evaluation. An examination of sentences with controversial results suggested that linguistic well-formedness of the output should be considered in evaluation. After experimenting with 6 possible metrics, we conclude that a simple arithmetic mean of BLEU and BLEU calculated on the parts of speech of words agrees more closely with human judgments than lexical metrics alone.

    Postprint (author’s final draft)

  • Hybrid machine translation guided by a rule-based system  Open access

     España Bonet, Cristina; Màrquez Villodre, Lluís; Labaka, Gorka; Sarasola, Kepa; Díaz de Ilarraza Sánchez, Arantza
    Machine Translation Summit
    Presentation's date: 2011-09-22
    Presentation of work at congresses

    This paper presents a machine translation architecture which hybridizes Matxin, a rule-based system, with regular phrase-based Statistical Machine Translation. In short, the hybrid translation process is guided by the rule-based engine and, before transference, a set of partial candidate translations provided by SMT subsystems is used to enrich the tree-based representation. The final hybrid translation is created by choosing the most probable combination among the available fragments with a statistical decoder in a monotonic way. We have applied the hybrid model to a pair of distant languages, Spanish and Basque, and according to our evaluation (both automatic and manual) the hybrid approach significantly outperforms the best SMT system on out-of-domain data.

    Postprint (author’s final draft)

  • Patent translation within the MOLTO project  Open access

     España Bonet, Cristina; Enache, Ramona; Slaski, Adam; Ranta, Aarne; Màrquez Villodre, Lluís; Gonzalez Bermudez, Meritxell
    Workshop on Patent Translation
    Presentation's date: 2011-09-23
    Presentation of work at congresses

    MOLTO is an FP7 European project whose goal is to translate texts between multiple languages in real time with high quality. Patent translation is a case study in which research focuses on obtaining large coverage without losing translation quality. This is achieved by hybridising a grammar-based multilingual translation system, GF, and a specialised statistical machine translation system. Moreover, both individual systems by themselves already represent a step forward in the translation of patents in the biomedical domain, for which the systems have been trained.

  • Linguistic measures for automatic machine translation evaluation

     Gimenez Linares, Jesús Ángel; Màrquez Villodre, Lluís
    Machine translation
    Date of publication: 2010-12
    Journal article

    Assessing the quality of candidate translations involves diverse linguistic facets. However, most automatic evaluation methods in use today rely on limited quality assumptions, such as lexical similarity. This introduces a bias in the development cycle which in some cases has been reported to carry very negative consequences. In order to tackle this methodological problem, we explore a novel path towards heterogeneous automatic Machine Translation evaluation. We have compiled a rich set of specialized similarity measures operating at different linguistic dimensions and analyzed their individual and collective behaviour over a wide range of evaluation scenarios. Results show that measures based on syntactic and semantic information are able to provide more reliable system rankings than lexical measures, especially when the systems under evaluation are based on different paradigms. At the sentence level, while some linguistic measures perform better than most lexical measures, some others perform substantially worse, mainly due to parsing problems. Their scores are, however, suitable for combination, yielding a substantially improved evaluation quality.

  • Feedback Analysis for User adaptive Statistical Translation

     Màrquez Villodre, Lluís; Formiga Fanals, Lluis; Mariño Acebal, Jose Bernardo; Gonzalez Bermudez, Meritxell; Rodríguez Fonollosa, José Adrián; Monte Moreno, Enrique; Barron Cedeño, Luis Alberto
    Participation in a competitive project

  • TIN2009-14675-C03-03

     Màrquez Villodre, Lluís; Martin Escofet, Carme
    Participation in a competitive project

  • Feedback Analysis for User adaptive Statistical Translation

     Màrquez Villodre, Lluís
    Participation in a competitive project

  • TRADUCCION AUTOMATICA HIBRIDA Y EVALUACION AVANZADA (UPC)

     Castell Ariño, Nuria; Farwell, David Loring; Gimenez Linares, Jesús Ángel; Ferrer Cancho, Ramon; Martin Escofet, Carme; Daude Ventura, Jorge; Abad Soriano, Maria Teresa; Gonzalez Bermudez, Meritxell; Màrquez Villodre, Lluís
    Participation in a competitive project

  • Multilingual On-Line Translation

     Màrquez Villodre, Lluís
    Participation in a competitive project

  • Multilingual On-Line Translation

     Rodriguez Hontoria, Horacio; Gonzalez Bermudez, Meritxell; España Bonet, Cristina; Farwell, David Loring; Carreras Perez, Xavier; Xambó Descamps, Sebastian; Màrquez Villodre, Lluís; Padró Cirera, Lluís; Saludes Closa, Jordi
    Participation in a competitive project

  • OPENMT-2: Traducción automática híbrida y evaluación avanzada

     Màrquez Villodre, Lluís
    Participation in a competitive project

  • Robust estimation of feature weights in statistical machine translation  Open access

     España Bonet, Cristina; Màrquez Villodre, Lluís
    Annual Conference of the European Association for Machine Translation
    Presentation's date: 2010
    Presentation of work at congresses

    Weights of the various components in a standard Statistical Machine Translation model are usually estimated via Minimum Error Rate Training. With this, one finds their optimum value on a development set with the expectation that these optimal weights generalise well to other test sets. However, this is not always the case when domains differ. This work uses a perceptron algorithm to learn more robust weights to be used on out-of-domain corpora without the need for specialised data. For an Arabic-to-English translation system, the generalisation of weights represents an improvement of more than 2 points of BLEU with respect to the MERT baseline using the same information.

  • Improving semantic role classification with selectional preferences

     Zapirain, Beñat; Agirre, Eneko; Màrquez Villodre, Lluís; Surdeanu, Mihai
    Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics
    Presentation's date: 2010
    Presentation of work at congresses

  • Using dependency parsing and machine learning for factoid question answering on spoken documents  Open access

     Comas Umbert, Pere Ramon; Turmo Borras, Jorge; Màrquez Villodre, Lluís
    Annual Conference of the International Speech Communication Association
    Presentation's date: 2010-09-29
    Presentation of work at congresses

    This paper presents our experiments in question answering for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two approaches to answer extraction in question answering for speech corpora that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score robust similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which used only hand-made ranking rules with little syntactic information. Moreover, these results also show that a dependency parser can be useful for speech transcripts even if it was trained on written text data from a news collection. We evaluate the system on manual transcripts of speech from the EPPS English corpus and a set of questions transcribed from spontaneous oral questions. This data belongs to the CLEF 2009 track on QA on speech transcripts (QAst).

    Postprint (author’s final draft)

  • Discriminative Phrase-Based Models for Arabic Machine Translation

     España Bonet, Cristina; Gimenez Linares, Jesús Ángel; Màrquez Villodre, Lluís
    ACM transactions on asian language information processing
    Date of publication: 2009
    Journal article

  • ORGANIZACIÓN DEL 13º CONGRESO ANUAL DE LA ASOCIACIÓN EUROPEA PARA LA TRADUCCIÓN AUTOMATICA, EAMT-2009

     Farwell, David Loring; Mariño Acebal, Jose Bernardo; Màrquez Villodre, Lluís; Rodríguez Fonollosa, José Adrián
    Participation in a competitive project

  • Enriching Statistical Translation Models Using a Domain-Independent Multilingual Lexical Knowledge Base

     Garcia, M; Gimenez Linares, Jesús Ángel; Màrquez Villodre, Lluís
    Lecture notes in computer science
    Date of publication: 2009-01
    Journal article

  • Computational semantic analysis of language: SemEval-2007 and beyond

     Agirre, E; Màrquez Villodre, Lluís; Wicentowski, R
    Language resources and evaluation
    Date of publication: 2009-06
    Journal article
