Scientific and technological production

1 to 50 of 295 results
  • Building synthetic voices in the METANET framework  Open access

     Garcia Casademont, Emília; Bonafonte Cavez, Antonio Jesus; Moreno Bilbao, M. Asuncion
    International Conference on Language Resources and Evaluation
    Presentation's date: 2012-05-25
    Presentation of work at congresses

    METANET4U is a European project aimed at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects fostering the mission of META, the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society. This paper describes the resources produced at our lab to provide synthetic voices. Using existing 10-hour corpora for a male and a female Spanish speaker, voices have been developed for use in Festival, both with unit-selection and with statistical technologies. Furthermore, using data produced to support research on intra- and inter-lingual voice conversion, four bilingual (English/Spanish) voices have been developed. The paper describes these resources, which are available through META. An evaluation is also presented comparing different synthesis techniques, the influence of the amount of data in statistical speech synthesis, and the effect of sharing data in bilingual voices.
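
    As a rough illustration of how voices like these are typically used once installed, the sketch below calls Festival's text2wave utility from Python to render a sentence to a WAV file. The voice identifier voice_upc_es_f_hts is a hypothetical placeholder, not necessarily the name under which the META voices are distributed.

        import subprocess

        def synthesize(text: str, wav_path: str, voice: str = "voice_upc_es_f_hts") -> None:
            """Render `text` to `wav_path` with Festival's text2wave tool.

            The voice identifier is an assumed placeholder; substitute the name
            of an installed Festival voice (e.g. one of the unit-selection or
            HTS voices described above).
            """
            subprocess.run(
                ["text2wave", "-o", wav_path, "-eval", f"({voice})"],
                input=text.encode("utf-8"),
                check=True,
            )

        if __name__ == "__main__":
            synthesize("Hola, esto es una prueba de síntesis.", "prueba.wav")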

  • Search engine for multilingual audiovisual contents

     Pérez, José David; Bonafonte Cavez, Antonio Jesus; Cardenal, Antonio; Ruiz Costajussà, Marta; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asuncion; Navas, Eva; Rodríguez Banga, Eduardo
    Jornadas en Tecnología del Habla and III Iberian SLTech Workshop
    Presentation's date: 2012-11
    Presentation of work at congresses

  • META-NET. Official languages of Spain in the digital age

     Moreno Bilbao, M. Asuncion; Bel, Nùria; Melero, Maite; García-Mateo, Carmen; Hernáez, Inma; Oller Moreno, Sergio; Burchardt, Aljoscha; Eichler, Kathrin; Rehm, Georg; Uszkoreit, Hans
    Jornadas en Tecnología del Habla and III Iberian SLTech Workshop
    Presentation's date: 2012
    Presentation of work at congresses

  • Building synthetic voices in the META-NET framework

     Garcia Casademont, Emília; Bonafonte Cavez, Antonio Jesus; Moreno Bilbao, M. Asuncion
    International Conference on Language Resources and Evaluation
    Presentation's date: 2012-05
    Presentation of work at congresses

    METANET4U is a European project aimed at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects fostering the mission of META, the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society. This paper describes the resources produced at our lab to provide synthetic voices. Using existing 10-hour corpora for a male and a female Spanish speaker, voices have been developed for use in Festival, both with unit-selection and with statistical technologies. Furthermore, using data produced to support research on intra- and inter-lingual voice conversion, four bilingual (English/Spanish) voices have been developed. The paper describes these resources, which are available through META. An evaluation is also presented comparing different synthesis techniques, the influence of the amount of data in statistical speech synthesis, and the effect of sharing data in bilingual voices.

  • The BUCEADOR multi-language search engine for digital libraries  Open access

     Adell, Jordi; Bonafonte Cavez, Antonio Jesus; Cardenal, Antonio; Ruiz Costajussà, Marta; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asuncion; Navas, Eva; Rodríguez Banga, Eduardo
    International Conference on Language Resources and Evaluation
    Presentation's date: 2012-05-24
    Presentation of work at congresses

    This paper presents a web-based multimedia search engine built within the BUCEADOR (www.buceador.org) research project. A proof-of-concept tool has been implemented that is able to retrieve information from a digital library made up of multimedia documents in the four official languages of Spain (Spanish, Basque, Catalan and Galician). The retrieved documents are presented in the user's language after translation and dubbing (the four languages above plus English). The paper presents the tool's functionality, its architecture and the digital library, and provides some information about the technology involved in the fields of automatic speech recognition, statistical machine translation, text-to-speech synthesis and information retrieval. Each technology has been adapted to the purposes of the tool as well as to interact with the rest of the technologies involved.
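
    The architecture described above chains several independent engines. A minimal sketch of that kind of pipeline is given below, with stand-in functions for each technology; the function names and signatures are illustrative and are not the project's actual API.

        from dataclasses import dataclass

        @dataclass
        class Document:
            doc_id: str
            audio_path: str   # original multimedia document
            language: str     # one of: es, eu, ca, gl

        def transcribe(doc: Document) -> str:
            """Automatic speech recognition (stub)."""
            raise NotImplementedError

        def translate(text: str, source: str, target: str) -> str:
            """Statistical machine translation (stub)."""
            raise NotImplementedError

        def synthesize(text: str, language: str) -> bytes:
            """Text-to-speech synthesis (stub); returns audio for dubbing."""
            raise NotImplementedError

        def retrieve(query: str, index: dict) -> list:
            """Very naive information retrieval over transcribed documents."""
            return [doc_id for doc_id, text in index.items() if query.lower() in text.lower()]

        def present(doc: Document, user_language: str) -> bytes:
            """Deliver a retrieved document in the user's language (translate + dub)."""
            transcript = transcribe(doc)
            translated = translate(transcript, doc.language, user_language)
            return synthesize(translated, user_language)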

  • METANET4U: aumentar la infraestructura lingüística europea  Open access

     Bel, N.; Moreno Bilbao, M. Asuncion
    Procesamiento del lenguaje natural
    Date of publication: 2012
    Journal article

    The METANET4U project is contributing to the creation of a pan-European digital platform that will support the distribution and exchange of language resources and services, with the ultimate goal of supporting the development of applications based on language technologies.

  • Síntesis de voz aplicada a la traducción voz a voz  Open access

     Agüero, Pablo Daniel
    Defense's date: 2012-10-23
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

    In the field of speech technologies, text-to-speech conversion is the automatic generation of artificial voices that sound like a human voice reading a text aloud. Inside a text-to-speech system, the prosody module produces the prosodic information needed to generate a natural voice: intonational phrases, sentence intonation, phoneme durations and energies, etc. The correct generation of this information directly impacts the naturalness and expressiveness of the system. The main goal of this thesis is the development of new algorithms to train prosody generation models for use in a text-to-speech system, and their application in the framework of speech-to-speech translation. Several alternatives were studied for intonation modelling; they combine parameterization and intonation model generation as a single integrated process. This approach was judged successful in both objective and subjective evaluations. The influence of segmental and suprasegmental factors on duration modelling was also studied, and several algorithms were proposed that combine segmental and suprasegmental information, in line with other publications in this field. Finally, an analysis of various phrase break models was performed, both over words and over accent groups: classification trees (CART), language models (LM) and finite-state transducers (FST). Using the same data set in all the experiments made it possible to draw relevant conclusions about the differences between these models. One of the main goals of this thesis was to improve the naturalness, expressiveness and consistency with the style of the source speaker in text-to-speech systems. This can be done by using the prosody of the source speaker as an additional information source in the framework of speech-to-speech translation. Several prosody generation algorithms were developed that integrate this additional information for the prediction of intonation, phoneme duration and phrase breaks. In that direction, several approaches were studied to transfer intonation from one language to the other. The chosen approach was an automatic clustering algorithm that finds tonal movements that are related between languages, without any limitation on their number; this coding can then be used for intonation modelling of the target language. Experimental results show an improvement that is more relevant for close languages, such as Spanish and Catalan. Although no segmental duration transfer was performed between languages, this thesis proposes the transfer of rhythm from one language to the other, through a method that combines rhythm transfer and audio synchronization; synchronization is included because of its importance for speech-to-speech translation when video is also used. Lastly, this thesis also proposes a pause transfer technique for speech-to-speech translation based on alignment information. Studies on the training data have shown the advantage of tuples for this task. To predict any pause that cannot be transferred with this method, conventional pause prediction algorithms (CART, CART+LM, FST) are used, taking into account the already transferred pauses.

    Within speech technologies, text-to-speech conversion consists of the automatic generation of an artificial voice that produces the same sound a person would produce when reading a text aloud. In short, text-to-speech converters are systems that convert text into synthetic speech. The text-to-speech conversion process is divided into three basic modules: text processing, prosody generation and synthetic voice generation. The first module performs text normalization (expanding abbreviations, converting numbers and dates into text, etc.) and, in some cases, morphosyntactic tagging. Next, graphemes are converted into phonemes and syllabification is applied to obtain the phoneme sequence needed to reproduce the text. The prosody module then generates the prosodic information needed to produce the voice: intonational phrases and sentence intonation are predicted, as well as phoneme durations and energies, etc. The correct generation of this information directly affects the naturalness and expressiveness of the system. The last module, voice generation, produces the voice from the information provided by the text-processing and prosody modules. The goal of this thesis is the development of new algorithms for training prosody generation models for text-to-speech conversion, and their application in the framework of speech-to-speech translation. For intonation modelling, the literature generally proposes approaches that include a stylization step prior to parameterization. In this thesis, alternatives were studied to avoid that stylization, combining parameterization and intonation model generation in a single integrated process. This approach proved successful both in objective evaluations (using measures such as the mean squared error or the Pearson correlation coefficient) and in subjective evaluations. Evaluators rated the proposed approach higher in quality and naturalness than other algorithms from the literature included in the evaluations, reaching a naturalness MOS of 3.55 (4.63 for the original voice) and a quality MOS of 3.78 (4.78 for the original voice).
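
    To make the phrase-break component concrete, a classification tree (CART) for break prediction can be trained from simple word-boundary features, as in the scikit-learn sketch below; the feature set and the toy data are invented for the example and are not those used in the thesis.

        # Toy CART phrase-break predictor: for each word junction, predict break / no-break.
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.feature_extraction import DictVectorizer

        # Hypothetical training samples: features observed at a word boundary.
        samples = [
            {"pos_left": "NOUN", "pos_right": "CONJ", "dist_from_start": 4, "dist_to_end": 6, "punct": 1},
            {"pos_left": "DET",  "pos_right": "NOUN", "dist_from_start": 1, "dist_to_end": 9, "punct": 0},
            {"pos_left": "VERB", "pos_right": "DET",  "dist_from_start": 2, "dist_to_end": 8, "punct": 0},
            {"pos_left": "NOUN", "pos_right": "PREP", "dist_from_start": 7, "dist_to_end": 3, "punct": 1},
        ]
        labels = ["break", "no-break", "no-break", "break"]

        vec = DictVectorizer(sparse=False)   # one-hot encodes the POS strings
        X = vec.fit_transform(samples)
        tree = DecisionTreeClassifier(max_depth=3).fit(X, labels)

        # Predict whether a break should be inserted at an unseen boundary.
        test = {"pos_left": "NOUN", "pos_right": "CONJ", "dist_from_start": 5, "dist_to_end": 5, "punct": 0}
        print(tree.predict(vec.transform([test]))[0])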

  • La lengua española en la era digital : The Spanish language in the digital age

     Melero, Maite; Badía, Toni; Moreno Bilbao, M. Asuncion
    Date of publication: 2012
    Book

  • La llengua catalana a l'era digital : The Catalan language in the digital age

     Moreno Bilbao, M. Asuncion; Bel, Nùria; García, Emília; Vallverdu Bayes, Francisco; Revilla, Eva
    Date of publication: 2012
    Book

  • A multilingual corpus for rich audio-visual scene description in a meeting-room environment

     Butko, Taras; Nadeu Camprubí, Climent; Moreno Bilbao, M. Asuncion
    ICMI Workshop on Multimodal Corpora For Machine Learning
    Presentation's date: 2011-11-18
    Presentation of work at congresses

    In this paper, we present a multilingual database specifically designed to develop technologies for rich audio-visual scene description in meeting-room environments. Part of the database consists of the already existing CHIL audio-visual recordings, whose annotations have been extended. A relevant objective in the newly recorded sessions was to include situations in which the semantic content cannot be extracted from a single modality. The presented database, which includes five hours of rather spontaneously generated scientific presentations, was manually annotated using standard or previously reported annotation schemes, and will be publicly available for research purposes.

  • FEATURE SELECTION FOR MULTIMODAL ACOUSTIC EVENT DETECTION  Open access

     Butko, Taras
    Defense's date: 2011-07-08
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

    The detection of the acoustic events (AEs) naturally produced in a meeting room may help to describe the human and social activity. The automatic description of interactions between humans and the environment can be useful for providing implicit assistance to the people inside the room, context-aware and content-aware information requiring a minimum of human attention or interruptions, support for high-level analysis of the underlying acoustic scene, etc. On the other hand, the recent fast growth of available audio and audiovisual content strongly demands tools for analyzing, indexing, searching and retrieving the available documents. Given an audio document, the first processing step is usually audio segmentation (AS), i.e. the partitioning of the input audio stream into acoustically homogeneous regions which are labelled according to a predefined broad set of classes such as speech, music, noise, etc. Acoustic event detection (AED) is the objective of this thesis. A variety of features coming not only from audio but also from the video modality is proposed to deal with the detection problem in meeting-room and broadcast-news domains. Two basic detection approaches are investigated: joint segmentation and classification using hidden Markov models (HMMs) with Gaussian mixture densities (GMMs), and detection-by-classification using discriminative support vector machines (SVMs). For the first approach, a fast one-pass-training feature selection algorithm is developed in this thesis to select, for each AE class, the subset of multimodal features that gives the best detection rate. AED in meeting-room environments aims at processing the signals collected by distant microphones and video cameras in order to obtain the temporal sequence of (possibly overlapped) AEs produced in the room. When applied to interactive seminars with a certain degree of spontaneity, detection of acoustic events from the audio modality alone shows a large number of errors, mostly due to temporal overlaps of sounds. This thesis includes several novelties regarding the task of multimodal AED. Firstly, the use of video features: since in the video modality the acoustic sources do not overlap (except for occlusions), the proposed features improve AED in such rather spontaneous recordings. Secondly, the inclusion of acoustic localization features, which, in combination with the usual spectro-temporal audio features, yield a further improvement in recognition rate. Thirdly, the comparison of feature-level and decision-level fusion strategies for combining the audio and video modalities; in the latter case, the system output scores are combined using two statistical approaches: the weighted arithmetic mean and the fuzzy integral. Finally, due to the scarcity of annotated multimodal data, and in particular of data with temporal sound overlaps, a new multimodal database with a rich variety of meeting-room AEs has been recorded and manually annotated, and has been made publicly available for research purposes.
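
    As a small illustration of the decision-level fusion mentioned above, the weighted arithmetic mean of audio and video classifier scores can be written in a few lines; the class names, scores and weights below are invented for the example (in practice the weights would be tuned on development data).

        import numpy as np

        # Per-class scores from two modality-specific detectors (hypothetical values).
        classes = ["applause", "chair_moving", "door_slam", "speech"]
        audio_scores = np.array([0.10, 0.55, 0.25, 0.10])
        video_scores = np.array([0.05, 0.70, 0.15, 0.10])

        # Decision-level fusion by weighted arithmetic mean of the two score vectors.
        w_audio, w_video = 0.6, 0.4
        fused = w_audio * audio_scores + w_video * video_scores

        print(classes[int(np.argmax(fused))])  # detected acoustic event class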

  • Enhancing the European Linguistic Infrastructure

     Bonafonte Cavez, Antonio Jesus; Nadeu Camprubí, Climent; Vallverdu Bayes, Francisco; Butko, Taras; Rodríguez Fonollosa, José Adrián; Wolf, Martin; Moreno Bilbao, M. Asuncion
    Participation in a competitive project

  • Ogmios: the UPC entry for the Albayzin 2010 TTS Evaluation

     Bonafonte Cavez, Antonio Jesus; Esquerra Llucià, Ignasi; Moreno Bilbao, M. Asuncion; Agüero, Pablo Daniel
    Jornadas en Tecnología del Habla and Iberian SLTech Workshop
    Presentation's date: 2010-11-10
    Presentation of work at congresses

  • Synthesis using speaker adaptation from speech recognition DB

     Oller, Sergio; Moreno Bilbao, M. Asuncion; Bonafonte Cavez, Antonio Jesus
    Jornadas en Tecnología del Habla and Iberian SLTech Workshop
    Presentation's date: 2010-11
    Presentation of work at congresses

  • La evaluación de competencias en los trabajos Fin de Estudios

     Valderrama, Elena; Rullán, Mercedes; Sanchez Carracedo, Fermin; Pons, Jordi; Mans, Claudi; Giné, Francesc; Vilanova, Ramon; Seco, Gonzalo; Jiménez, Laureano; Peig, Enric; Carrera, Julián; Moreno Bilbao, M. Asuncion; Garcia Almiñana, Jordi; Pérez, Julio; Cores, Fernando; Renau, Josep Maria; Tejero, Javier; Bisbal Riera, Jesús
    IEEE - RITA (Revista iberoamericana de tecnologías del aprendizaje)
    Date of publication: 2010-08
    Journal article

  • Voice Conversion Based on Weighted Frequency Warping

     Erro Eslava, Daniel; Moreno Bilbao, M. Asuncion; Bonafonte Cavez, Antonio Jesus
    IEEE transactions on audio speech and language processing
    Date of publication: 2010-07
    Journal article

  • BUSQUEDA DE INFORMACIÓN EN CONTENIDOS AUDIOVISUALES PLURILINGUES

     Esquerra Llucià, Ignasi; Monte Moreno, Enrique; Rodríguez Fonollosa, José Adrián; Bonafonte Cavez, Antonio Jesus; Polyakova, Tatyana; Mariño Acebal, Jose Bernardo; Ruiz Costa-jussa, Marta; Adell Mercado, Jordi; Moreno Bilbao, M. Asuncion
    Participation in a competitive project

  • INCA Algorithm for Training Voice Conversion Systems From Nonparallel Corpora

     Erro Eslava, Daniel; Moreno Bilbao, M. Asuncion; Bonafonte Cavez, Antonio Jesus
    IEEE transactions on audio speech and language processing
    Date of publication: 2010-07
    Journal article

  • Prosodic analysis and modelling of conversational elements for speech synthesis

     Adell Mercado, Jordi
    Defense's date: 2009-07-22
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

  • Recent work on the FESTCAT database for speech synthesis  Open access

     Bonafonte Cavez, Antonio Jesus; Esquerra Llucià, Ignasi; Aguilar, Lourdes; Oller, Sergio; Moreno Bilbao, M. Asuncion
    Iberian SLTech
    Presentation's date: 2009-09-04
    Presentation of work at congresses

    This paper presents our work within the FESTCAT project, whose main goal was the development of voices in Catalan for the Festival suite. In the first year, we produced the corpus and the speech data needed to build 10 voices using the Clunits (unit selection) and HTS (hidden Markov model) methods. The resulting voices are freely available on the project web page and are included in Linkat, a Catalan distribution of Linux. More recently, we have updated the voices using new versions of HTS and another technology (Multisyn), and we have produced a child voice. Furthermore, we have performed a prosodic labelling and analysis of the database using the break index labels proposed in the ToBI system, aimed at improving the intonation of the synthetic speech.

  • RECONOCIMIENTO AUTOMÁTICO DEL HABLA PARA LOS DIALECTOS DEL ESPAÑOL.

     Caballero Galeote, Monica
    Defense's date: 2009-02-18
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

  • VEU: GRUP DE TRACTAMENT DE LA PARLA

     Bonafonte Cavez, Antonio Jesus; Casar Lopez, Marta; Ruiz Costa-jussa, Marta; Nogueiras Rodriguez, Albino; Esquerra Llucià, Ignasi; Salavedra Moli, Josep; Farrús Cabecerán, Mireia; Hernando Pericas, Francisco Javier; Rodríguez Fonollosa, José Adrián; Monte Moreno, Enrique; Mariño Acebal, Jose Bernardo; Nadeu Camprubí, Climent; Moreno Bilbao, M. Asuncion; Vallverdu Bayes, Francisco
    Participation in a competitive project

  • Multidialectal Spanish acoustic modeling for speech recognition

     Caballero Galeote, Monica; Moreno Bilbao, M. Asuncion; Nogueiras Rodriguez, Albino
    Speech communication
    Date of publication: 2009-03
    Journal article

  • Flexible harmonic/stochastic modeling for HMM-based speech synthesis

     Banos, Eleftherios; Erro, Daniel; Bonafonte Cavez, Antonio Jesus; Moreno Bilbao, M. Asuncion
    Jornadas en Tecnología del Habla
    Presentation of work at congresses

  • INTRA-LINGUAL AND CROSS-LINGUAL VOICE CONVERSION USING HARMONIC PLUS STOCHASTIC MODELS

     Erro Eslava, Daniel
    Defense's date: 2008-06-16
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

  • Corpus and Voices for Catalan Speech Synthesis

     Bonafonte Cavez, Antonio Jesus; Adell Mercado, Jordi; Esquerra Llucià, Ignasi; Gallego, Silvia; Moreno Bilbao, M. Asuncion; Pérez Mayos, Javier
    Language Resources and Evaluation Conference
    Presentation of work at congresses

  • Si us plau, cal informar el títol

     Moreno Bilbao, M. Asuncion
    Participation in a competitive project

  • 4.1.1 Descripción de las Técnicas Desarrolladas

     Bonafonte Cavez, Antonio Jesus; Hernando Pericas, Francisco Javier; Mariño Acebal, Jose Bernardo; Moreno Bilbao, M. Asuncion; Nadeu Camprubí, Climent
    Date: 2008-09
    Report

  • The UPC TTS system description for the 2008 Blizzard Challenge

     Bonafonte Cavez, Antonio Jesus; Moreno Bilbao, M. Asuncion; Adell Mercado, Jordi; Agüero, Pablo D; Banos, Eleftherios; Erro, Daniel; Esquerra Llucià, Ignasi; Pérez, Javier; Polyakova, Tatyana
    Blizzard Challenge 2008
    Presentation of work at congresses

  • MULTILINGUAL AND CROSSLINGUAL ACOUSTIC MODELLING FOR AUTOMATIC SPEECH RECOGNITION

     Diehl, Frank
    Defense's date: 2007-05-18
    Department of Signal Theory and Communications, Universitat Politècnica de Catalunya
    Theses

  • The UPC TTS system description for the 2007 Blizzard Challenge

     Bonafonte Cavez, Antonio Jesus; Adell Mercado, Jordi; Esquerra Llucià, Ignasi; Moreno Bilbao, M. Asuncion
    Sixth ISCA Tutorial and Research Workshop on Speech Synthesis (SSW6)
    Presentation of work at congresses

  • Constraint induction of phonetic-acoustic decision trees for crosslingual acoustic modelling

     Diehl, Frank; Moreno Bilbao, M. Asuncion; Monte Moreno, Enrique
    IEEE International Conference on Acoustics, Speech, and Signal Processing
    Presentation of work at congresses

  • Crosslingual acoustic model development for automatic speech

     Diehl, Frank; Moreno Bilbao, M. Asuncion; Monte Moreno, Enrique
    2007 IEEE Workshop on Automatic Speech Recognition and Understanding
    Presentation of work at congresses

  • Flexible harmonic/stochastic speech synthesis

     Erro, Daniel; Moreno Bilbao, M. Asuncion; Bonafonte Cavez, Antonio Jesus
    Sixth ISCA Tutorial and Research Workshop on Speech Synthesis (SSW6)
    Presentation of work at congresses

  • Multidialectal acoustic modeling: a comparative study

     Caballero, Mónica; Moreno Bilbao, M. Asuncion; Nogueiras Rodriguez, Albino
    ISCA Workshop on Multilingual Speech and Language Processing
    Presentation's date: 2006-04
    Presentation of work at congresses

    In this paper, multidialectal acoustic modeling based on sharing data across dialects is addressed. A comparative study of different methods of combining data based on decision-tree clustering algorithms is presented. The approaches differ in the way the similarity of sounds between dialects is evaluated and in the decision-tree structure applied. The proposed systems are tested with Spanish dialects from Spain and Latin America. All the proposed multidialectal systems improve on monodialectal performance by using data from another dialect, but it is shown that the way data is shared is critical. The best combination of similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems.
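
    One way to picture the "similarity of sounds between dialects" criterion is as a distance between the Gaussian models that each dialect trains for the same phone. The sketch below uses a symmetric Kullback-Leibler divergence between diagonal-covariance Gaussians as such a measure; the measure, the toy parameters and the threshold are illustrative and are not the paper's exact procedure.

        import numpy as np

        def sym_kl_diag_gauss(mu1, var1, mu2, var2):
            """Symmetric KL divergence between two diagonal-covariance Gaussians."""
            mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
            kl12 = 0.5 * np.sum(np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)
            kl21 = 0.5 * np.sum(np.log(var1 / var2) + (var2 + (mu2 - mu1) ** 2) / var1 - 1.0)
            return kl12 + kl21

        # Hypothetical per-dialect models of the same phone (means and variances of 3 features).
        spain = (np.array([1.0, -0.2, 0.3]), np.array([0.5, 0.4, 0.6]))
        latam = (np.array([0.9, -0.1, 0.4]), np.array([0.6, 0.4, 0.5]))

        d = sym_kl_diag_gauss(*spain, *latam)
        SHARE_THRESHOLD = 2.0  # arbitrary threshold for the illustration
        print("share data across dialects" if d < SHARE_THRESHOLD else "keep dialect-specific models")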

  • Voice Conversion of Non-Aligned Data Using Unit Selection

     Duxans, H; Erro, D; Pérez, J; Diego, F; Bonafonte Cavez, Antonio Jesus; Moreno Bilbao, M. Asuncion
    TC-STAR Workshop on Speech-to-Speech Translation
    Presentation of work at congresses

  • Ogmios: The UPC Text-to-Speech Synthesis System for Spoken Translation

     Adell Mercado, Jordi; Bonafonte Cavez, Antonio Jesus; Agüero, P D; Pérez, J; Moreno Bilbao, M. Asuncion
    TC-STAR Workshop on Speech-to-Speech Translation
    Presentation of work at congresses

  • Spanish Synthesis Corpora

     Umbert, M; Moreno Bilbao, M. Asuncion; Agüero, P; Bonafonte Cavez, Antonio Jesus
    International Conference on Language Resources and Evaluation
    Presentation of work at congresses

  • Extended use of Local Codebook Features for Continuous Density Hidden Markov Models

     Diehl, Frank; Moreno Bilbao, M. Asuncion; Andrei, Zgank; Zdravko, Kacic
    Advances in Speech Technology
    Presentation of work at congresses

  • Continuous Local Codebook Features for multi- and cross-lingual Acoustic Phonetic Modelling

     Diehl, Frank; Moreno Bilbao, M. Asuncion; Enric, Monte; Monte Moreno, Enrique
    9th European Conference on Speech Communication and Technology
    Presentation of work at congresses

  • Continuous Local Codebook Features for multi- and cross-lingual Acoustic Phonetic Modelling

     Moreno Bilbao, M. Asuncion
    9th European Conference on Speech Communication and Technology
    Presentation's date: 2005-11-07
    Presentation of work at congresses

  • Crosslingual adaptation of semi-continuous HMMs using acoustic regression classes and sub-simplex projection

     Diehl, Frank; Moreno Bilbao, M. Asuncion; Enric, Monte; Monte Moreno, Enrique
    COST278 and ISCA Workshop on Applied Spoken Language Interaction in Distributed Environments
    Presentation of work at congresses

  • Orientel - Telephony Databases Across Northern Africa and the Middle East

     Moreno Bilbao, M. Asuncion
    International Conference on Language Resources and Evaluation
    Presentation's date: 2005-05-26
    Presentation of work at congresses

  • A Pitch-Asynchronous Simple Method for Speech Synthesis by Diphone Concatenation using the Deterministic plus Stochastic Model

     Erro Eslava, Daniel; Moreno Bilbao, M. Asuncion
    10th International Conference on Speech and Computer
    Presentation of work at congresses

  • Medalla Narcís Monturiol 2004

     Moreno Bilbao, M. Asuncion
    Award or recognition

  • International journal of speech technology

     Moreno Bilbao, M. Asuncion
    Collaboration in journals

  • Pattern recognition letters

     Moreno Bilbao, M. Asuncion
    Collaboration in journals

  • Quasi-Continuous Local Codebook Features for multilingual Acoustic Phonetic Modelling

     Moreno Bilbao, M. Asuncion
    IEEE International Conference on Acoustics, Speech, and Signal Processing
    Presentation's date: 2005-03-19
    Presentation of work at congresses
