Fuentes Fort, Maria
Total activity: 27
Department
Department of Computer Science
E-mail
mfuentescs.upc.edu


Scientific and technological production
  • Deliverable 6.1 Infrastructure for Extractive Summarization  Open access

     Saggion, Horacio; Padró Cirera, Lluís; Fuentes Fort, Maria
    Date: 2013-12-30
    Report

    This report describes the infrastructure for extractive summarization to be used in SKATER. The summarization components can be applied to the output of FreeLing, which makes them suitable for integration in SKATER and for developing different summarization solutions for the languages addressed by the project. The infrastructure is easily extensible to accommodate new relevance computation features, including those relying on deeper linguistic processing (e.g. semantic interpretation, word sense disambiguation).

    SKATER Internal Report: software of infrastructure for extractive Summarization (work carried out until December 2013)

  • TIN2012-38584-C06-01 - Adquisición de escenarios de conocimiento a través de la lectura de textos: inferencia de relaciones entre eventos (SKATeR)

     Rodriguez Hontoria, Horacio; Abad Soriano, Maria Teresa; Ageno Pulido, Alicia; Catala Roig, Neus; Comas Umbert, Pere Ramon; Farreres De La Morena, Javier; Fuentes Fort, Maria; Gatius Vila, Marta; Mehdizadeh Naderi, Ali; Padró Cirera, Lluís; Turmo Borras, Jorge
    Competitive project

  • UPC-CORE: What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?  Open access

     Barron Cedeño, Luis Alberto; Màrquez Villodre, Lluís; Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Turmo Borras, Jorge
    Joint Conference on Lexical and Computational Semantics
    p. 1-5
    Presentation's date: 2013-06-13
    Presentation of work at congresses

    In this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation, originally intended to evaluate automatic translations against reference translations, and (ii) an instance of explicit semantic analysis, built upon the opening paragraphs of Wikipedia 2010 articles. Our similarity estimator relies on a support vector regressor with an RBF kernel. Our best approach required 13 machine translation metrics plus explicit semantic analysis and ranked 65th in the competition. Our post-competition analysis shows that the features have a good expression level, but that overfitting and, mainly, normalization issues caused our correlation values to decrease.

  • Spell-checking in Spanish: the case of diacritic accents  Open access

     Atserias, Jordi; Fuentes Fort, Maria; Nazar, Rogelio; Renau, Irene
    International Conference on Language Resources and Evaluation
    p. 1-6
    Presentation's date: 2012-05-24
    Presentation of work at congresses

    This article presents the problem of diacritic restoration (or diacritization) in the context of spell-checking, with the focus on an orthographically rich language such as Spanish. We argue that despite the large volume of work published on the topic of diacritization, currently available spell-checking tools have still not found a proper solution to the problem in those cases where both forms of a word are listed in the checker’s dictionary. This is the case, for instance, when a word form exists with and without diacritics, such as continuo ‘continuous’ and continuó ‘he/she/it continued’, or when different diacritics make other word distinctions, as in continúo ‘I continue’. We propose a very simple solution based on a word bigram model derived from correctly typed Spanish texts and evaluate the ability of this model to restore diacritics in artificial as well as real errors. The case of diacritics is only meant to be an example of the possible applications for this idea, yet we believe that the same method could be applied to other kinds of orthographic or even grammatical errors. Moreover, given that no explicit linguistic knowledge is required, the proposed model can be used with other languages provided that a large normative corpus is available.

    Postprint (author’s final draft)

  • Summarizing a multimodal set of documents in a smart room  Open access

     Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Turmo Borras, Jorge
    International Conference on Language Resources and Evaluation
    p. 1-6
    Presentation's date: 2012-05-23
    Presentation of work at congresses

    This article reports an intrinsic automatic summarization evaluation in the scientific lecture domain. The lecture takes place in a Smart Room that has access to different types of documents produced from different media. An evaluation framework is presented to analyze the performance of systems producing summaries answering a user need. Several ROUGE metrics are used and a manual content responsiveness evaluation was carried out in order to analyze the performance of the evaluated approaches. Various multilingual summarization approaches are analyzed showing that the use of different types of documents outperforms the use of transcripts. In fact, not using any part of the spontaneous speech transcription in the summary improves the performance of automatic summaries. Moreover, the use of semantic information represented in the different textual documents coming from different media helps to improve summary quality.

    Postprint (author’s final draft)

  • ALICE: Acquisition of Language through an Interactive Comprehension Environment  Open access

     Fuentes Fort, Maria; Gonzalez Bermudez, Meritxell
    Procesamiento del lenguaje natural
    num. 47, p. 331-332
    Date of publication: 2011-09-05
    Journal article

    Integration of several state-of-the-art technologies related to spoken language and natural language processing used in Intelligent Computer Assisted Language Learning (ICALL) systems. We aim to show that the technology has reached a level of maturity that suggests the time may be right to apply it to second language learning in high schools.

  • Hacia la interacción en lenguaje natural  Open access

     Fuentes Fort, Maria; Gonzalez Bermudez, Meritxell
    CEUR Workshop proceedings
    Vol. 697, num. 74, p. 51-54
    Date of publication: 2011-02-06
    Journal article

    This document presents the research being carried out in the Natural Language Processing Group (GPLN) of the Universitat Politècnica de Catalunya (UPC). Specifically, we have organized the presentation of the different lines of work around their application in a virtual assistant. We believe that the use and deployment of such assistants will increase over the next ten years, hence the importance of the state of natural language technologies and, even more so, of the new challenges that this type of application poses.

  • Aprendizaje y asistencia virtual en red / Aprenentatge i assistència virtual en xarxa  Open access

     Jofre Roca, Luis; Romeu Robert, Jordi; Vallverdu Bayes, Francisco; Fuentes Fort, Maria; Guardiola Garcia, Marta; Gonzalez Bermudez, Meritxell
    Date of publication: 2011-01
    Book

  • Aprendizaje y asistencia virtual en red. La prueba Piloto.

     Fuentes Fort, Maria; Gonzalez Bermudez, Meritxell; Guardiola Garcia, Marta; Jofre Roca, Luis; Romeu Robert, Jordi; Vallverdu Bayes, Francisco
    Date: 2011-07-29
    Report

  • Aprendizaje y asistencia virtual en red  Open access

     Fuentes Fort, Maria; Gonzalez Bermudez, Meritxell; Guardiola Garcia, Marta; Jofre Roca, Luis; Romeu Robert, Jordi; Vallverdu Bayes, Francisco
    Date: 2011-01-20
    Report

    Promoting the use of Information and Communication Technologies (ICT) in educational settings can cover several aspects of the learning process. Incorporating ICT into the classroom helps, on the one hand, to reduce the digital divide for both teachers and students and, on the other, should help improve teaching methodology. ICT can undoubtedly be of great help in achieving pedagogical goals. Specifically, this study aimed to analyze how to incorporate speech technologies and natural language processing to improve the language learning process. To identify the characteristics of this process, which aspects will be modified by incorporating these technologies, and how the modifications will affect each aspect of the process, a workshop was organized in December 2010 that brought together different social sectors directly involved in the field. As a result of this workshop, a pilot test will be carried out in April 2011 in which the proposed technologies will be incorporated into English learning, and work has already begun on defining a medium-term project to seek funding at the national or European level.

  • English language learning activity using spoken language and intelligent computer-assisted technologies  Open access

     Fuentes Fort, Maria; Gonzalez Bermudez, Meritxell
    Speech and Language Technology in Education
    p. 1-4
    Presentation's date: 2011-09-24
    Presentation of work at congresses

    This paper presents work in progress on language technologies applied to secondary school education. The application presented integrates several state-of-the-art technologies related to spoken language and intelligent computer-assisted language learning. We aim to show that the technology has reached a level of maturity that suggests the time may be right to apply it to second language learning. To achieve this objective, an activity was designed to be tested at several Spanish high schools. The aim was to carry out a proof of concept in real conditions and to obtain feedback from the students through a questionnaire as well as from the teachers by means of an interview. The activity was designed in collaboration with some of the teachers at the secondary schools.

  • Sistema de recomendación para un uso inclusivo del lenguaje

     Fuentes Fort, Maria; Padró Cirera, Lluís; Padró Cirera, Muntsa; Turmo Borras, Jorge; Carrera, Jordi
    Procesamiento del lenguaje natural
    num. 42, p. 17-24
    Date of publication: 2009-03
    Journal article

    A system that processes text written in Spanish and detects non-inclusive uses of language. For each suspicious noun phrase, the system proposes a set of alternatives. The system also supports the automatic acquisition of positive examples from documents that use inclusive language. These examples, together with their context, are used when presenting suggestions.

  • A new lexical chain algorithm used for automatic summarization

     González Pellicer, Edgar; Fuentes Fort, Maria
    International Conference of the Catalan Association for Artificial Intelligence
    p. 329-338
    DOI: 10.3233/978-1-60750-061-2-329
    Presentation's date: 2009-10
    Presentation of work at congresses

  • A Flexible Multitask Summarizer for Documents from Different Media, Domain and Language  Open access

     Fuentes Fort, Maria
    Department of Computer Science, Universitat Politècnica de Catalunya
    Theses

    Automatic summarization is probably crucial given the growing volume of documents generated daily, particularly now that retrieving, managing and processing information have become decisive tasks. However, one should not expect perfect systems able to substitute for human summaries. The automatic summarization process depends strongly not only on the characteristics of the documents, but also on the different needs of users. Thus, several aspects have to be taken into account when designing an information system for summarization because, depending on the characteristics of the input documents and the desired results, several techniques can be applied. To support this process, the final goal of the thesis is to provide a flexible multitask summarizer architecture. This goal is decomposed into three main research purposes. First, to study the process of porting systems to different summarization tasks, processing documents in different languages, domains or media (audio and text), with the aim of designing a generic architecture that permits the easy addition of new tasks by reusing existing tools. Second, to develop prototypes for some tasks involving aspects related to the language, the media and the domain of the document or documents to be summarized, as well as aspects related to the summary content: generic summaries, novelty summaries, or summaries that answer a specific user need. Third, to create an evaluation framework to analyze the performance of several approaches in the written news and scientific oral presentation domains, focusing mainly on intrinsic evaluation.

  • FEMsum at DUC 2007  Open access

     Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Ferrés Domènech, Daniel
    Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics
    p. 1-7
    Presentation's date: 2007-06-26
    Presentation of work at congresses

    This paper describes and analyzes how the FEMsum system deals with the DUC 2007 tasks of providing summary-length answers to complex questions, as both background and just-the-news summaries. We participated in producing background summaries for the main task with the FEMsum approach that obtained the better results in our participation the previous year. The FEMsum semantic-based approach was adapted to deal with the update pilot task with the aim of producing just-the-news summaries.

    Postprint (author’s final draft)

  • Support vector machines for query-focused summarization trained and evaluated on pyramid data  Open access

     Fuentes Fort, Maria; Alfonseca, Enrique; Rodriguez Hontoria, Horacio
    Annual Meeting of the Association for Computational Linguistics
    p. 57-60
    Presentation's date: 2007-06-25
    Presentation of work at congresses

    This paper presents the use of Support Vector Machines (SVM) to detect relevant information to be included in a query-focused summary. Several SVMs are trained using information from pyramids of summary content units. Their performance is compared with the best performing systems in DUC-2005, using both ROUGE and autoPan, an automatic scoring method for pyramid evaluation.

  • TextMess - SAMiT (Intelligent, Interactive and Multilingual Text Mining based on Human Language Technologies)

     Ageno Pulido, Alicia; Turmo Borras, Jorge; Rodriguez Hontoria, Horacio; Catala Roig, Neus; Comas Umbert, Pere Ramon; González Pellicer, Edgar; Fuentes Fort, Maria; Kanaan Izquierdo, Samir; Ferrés Domènech, Daniel; Sapena Masip, Emili
    Competitive project

  • FEMsum at DUC 2006: Semantic-based approach integrated in a flexible eclectic multitask summarizer architecture  Open access

     Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Turmo Borras, Jorge; Ferrés Domènech, Daniel
    Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics
    p. 1-8
    Presentation's date: 2006-06-08
    Presentation of work at congresses

    To meet different requirements at the TALP Research Center, we have built a highly parameterized environment that allows specific summarizers to be instantiated for different summarization tasks in different languages. This paper describes and analyzes how our system deals with the DUC 2006 task of providing summary-length answers to complex questions. The given query is used to detect relevant passages. After that, semantic similarities between these relevant sentences are detected and then used as input to an iterative graph-based algorithm to avoid redundancy and obtain a cohesive text. NIST human evaluations are used to analyze several aspects of our system, and a specific analysis for each of the three different kinds of submitted summaries is reported.

    Postprint (author’s final draft)

  • TALP-UPC at TREC 2005: Experiments using voting scheme among three heterogeneous QA systems  Open access

     Ferrés, Daniel; Kanaan Izquierdo, Samir; Gonzàlez Pellicer, Edgar; Ageno Pulido, Alicia; Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Surdeanu, Mihai; Turmo Borras, Jorge
    Text Retrieval Conference
    Presentation's date: 2006
    Presentation of work at congresses

    This paper describes the experiments of the TALP-UPC group for factoid and 'other' (definitional) questions at the TREC 2005 Main Question Answering (QA) task. Our current approach for factoid questions is based on a voting scheme among three QA systems: TALP-QA (our previous QA system), Sibyl (a new QA system developed at DAMA-UPC and TALP-UPC), and Aranea (a web-based data-driven approach). For definitional questions, we used two different systems: the TALP-QA Definitional system and LCSUM (a summarization-based system). Our results for factoid questions indicate that the voting strategy improves the accuracy from 7.5% to 17.1%. While these numbers are low (due to technical problems in the Answer Extraction phase of the TALP-QA system), they indicate that voting is a successful approach for boosting the performance of QA systems. The answer to definitional questions is produced by selecting phrases using a set of patterns associated with definitions. The best configuration of the TALP-QA Definitional system obtains an F-score of 17.2%.

  • TALP-UPC at TREC 2005: Experiments Using Voting Scheme Among Three Heterogeneous QA Systems

     Ferrés Domènech, Daniel; Kanaan Izquierdo, Samir; Domínguez-Sal, D; Dominguez Sala, David; Gonzàlez, E; Ageno Pulido, Alicia; Fuentes Fort, Maria; Rodriguez Hontoria, Horacio; Surdeanu, Mihai; Turmo Borras, Jorge
    XIV TREC Conference
    p. 1-2
    Presentation of work at congresses

  • Summarizing Spontaneous Speech Using General Text Properties

     Fuentes Fort, Maria; González Pellicer, Edgar; Rodriguez Hontoria, Horacio; Turmo Borras, Jorge; Alonso, L
    Crossing Barriers in Text Summarization Research Workshop
    p. 10-17
    Presentation of work at congresses

  • QASUM-TALP at DUC 2005 Automatically Evaluated with a Pyramid based Metric

     Rodriguez Hontoria, Horacio; Ferrés Domènech, Daniel; González Pellicer, Edgar; Fuentes Fort, Maria
    Document Understanding Workshop
    Presentation of work at congresses

  • CHIL: Computers in the Human Interaction Loop (LSI department)

     Rodriguez Hontoria, Horacio; Turmo Borras, Jorge; Ageno Pulido, Alicia; Fuentes Fort, Maria; González Pellicer, Edgar; Comas Umbert, Pere Ramon; Surdeanu, Mihai
    Competitive project

  • Re-using high-quality resources for continued evaluation of automated summarization systems

     Alonso, Laura; Fuentes Fort, Maria; Massot, Marc; Rodriguez Hontoria, Horacio
    4th International Conference on Languages Resources and Evaluation (LREC 2004)
    p. 1033-1036
    Presentation of work at congresses

  • Resumidor de noticies en català del projecte Hermes

     Fuentes Fort, Maria; González, E; Rodriguez Hontoria, Horacio
    Congrés d'Enginyeria en Llengua Catalana
    p. 102
    Presentation's date: 2004
    Presentation of work at congresses

  • Approaches to Text Summarization: Questions and Answers

     Alonso, Laura; Castellón Masalles, Irene; Climent, Salvador; Fuentes Fort, Maria; Padró Cirera, Lluís; Rodriguez Hontoria, Horacio
    Revista iberoamericana de inteligencia artificial
    num. 20, p. 34-52
    Date of publication: 2003-10
    Journal article

  • Mixed Approach to Headline Extraction for DUC 2003

     Fuentes Fort, Maria; Massot, Marc; Rodriguez Hontoria, Horacio; Alonso, Laura
    Document Understanding Workshop (DUC 2003)
    p. 89-96
    Presentation of work at congresses
