Belanche Muñoz, Luis Antonio
Total activity: 137
Research group
SOCO - Soft Computing
Department
Department of Software
School
Barcelona School of Informatics (FIB)
E-mail
belanchelsi.upc.edu
Contact details
UPC directory

Scientific and technological production
1 to 50 of 137 results
  • Averaging of kernel functions

     Belanche Muñoz, Luis Antonio; Tosi, Alessandra
    Neurocomputing
    Date of publication: 2013-07-18
    Journal article

    In kernel-based machines, the integration of a number of different kernels to build more flexible learning methods is a promising avenue for research. In multiple kernel learning, a compound kernel is built by learning a kernel that is a positively weighted arithmetic mean of several sources. We show in this paper that the only feasible average for kernel learning is precisely the arithmetic average. We investigate general families of averaging processes and how they relate to the development of kernels. Specifically, a number of multivariate and univariate kernels are developed based on the notion of generalized means. These results can be used in more general kernel optimization procedures.
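
    The positively weighted arithmetic mean of kernels mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the two base kernels (linear and RBF), the `gamma` value and every function name are assumptions chosen for the example. It relies only on the standard fact that a positively weighted sum of positive semidefinite kernels is again positive semidefinite.

```python
import math


def linear_kernel(x, y):
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def weighted_mean_kernel(kernels, weights):
    """Return k(x, y) = sum_i w_i * k_i(x, y) / sum_i w_i, with every w_i > 0."""
    if not kernels or len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    total = sum(weights)

    def combined(x, y):
        return sum(w * k(x, y) for k, w in zip(kernels, weights)) / total

    return combined


def gram_matrix(kernel, points):
    """Matrix of all pairwise kernel evaluations."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(matrix, z):
    """z^T K z; non-negative for every z when K is positive semidefinite."""
    n = len(z)
    return sum(z[i] * matrix[i][j] * z[j] for i in range(n) for j in range(n))
```

    Evaluating `quadratic_form` on a handful of vectors is only a spot check of positive semidefiniteness, not a proof; a full check would inspect the eigenvalues of the Gram matrix.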

  • Learning in networks of similarity processing neurons

     Belanche Muñoz, Luis Antonio
    Workshop New Challenges in Neural Computation
    Presentation's date: 2013-09
    Presentation of work at congresses

    Similarity functions are a very flexible container under which to express knowledge about a problem as well as to capture the meaningful relations in input space. In this paper we describe ongoing research using similarity functions to find more convenient representations for a problem (a crucial factor for successful learning) such that subsequent processing can be delivered to linear or non-linear modeling methods. The idea is tested in a set of challenging problems, characterized by a mixture of data types and different amounts of missing values. We report a series of experiments testing the idea against two more traditional approaches, one ignoring the knowledge about the dataset and another using this knowledge to pre-process it. The preliminary results demonstrate competitive or better generalization performance than that found in the literature. In addition, there is a considerable enhancement in the interpretability of the obtained models.

  • Developments in kernel design

     Belanche Muñoz, Luis Antonio
    European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    Presentation's date: 2013-04
    Presentation of work at congresses

    The aim of this paper is to give a concise overview of kernels, with special attention to non-standard or heterogeneous data sources (e.g. non-numerical or structured data). A second goal is to discuss the world of possibilities that kernel design opens for the principled analysis of special or new application domains. The reader is referred to some of the excellent survey publications (such as [1, 2, 3]) for in-depth coverage.

  • Handling missing values in kernel methods with application to microbiology data

     Kobayashi, Vladimer; Aluja Banet, Tomas; Belanche Muñoz, Luis Antonio
    European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    Presentation's date: 2013-04-24
    Presentation of work at congresses

    We discuss several approaches that make it possible for kernel methods to deal with missing values. The first two are extended kernels able to handle missing values without data preprocessing methods. Another two methods are derived from a sophisticated multiple imputation technique involving logistic regression as the local model learner. The performance of these approaches is compared using a binary data set that arises typically in microbiology (the microbial source tracking problem). Our results show that the kernel extensions demonstrate competitive performance in comparison with multiple imputation in terms of predictive accuracy. However, these results are achieved with a simpler and deterministic methodology and entail a much lower computational effort.

  • Kernel functions for categorical variables with application to problems in the life sciences

     Belanche Muñoz, Luis Antonio; Villegas, Marco
    Congrés Internacional de l'Associació Catalana d'Intel·ligència Artificial
    Presentation's date: 2013-10
    Presentation of work at congresses

    We introduce a family of positive definite kernels specifically designed for problems described by categorical information. The kernels are based on the comparison of the probability mass function of the variables and have a clear interpretation in terms of similarity computations between the modalities. We report experimental results on two different problems in the life sciences indicating that the proposed approach may markedly outperform standard kernels, so it can be used as a good alternative to other common kernel functions (at least for SVM classification) in order to obtain better accuracy.
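
    To make the idea of a kernel driven by the probability mass function concrete, here is a minimal sketch. The abstract does not give the kernel's formula, so the specific choice below (a match on value `v` scores `1 - P(v)`, so that agreeing on a rare modality counts more than agreeing on a common one) is an assumption for illustration only, as are all function names. Each univariate Gram matrix is block-constant with non-negative entries, so this particular choice is positive semidefinite.

```python
from collections import Counter


def estimate_pmf(column):
    """Empirical probability mass function of one categorical variable."""
    counts = Counter(column)
    n = len(column)
    return {value: count / n for value, count in counts.items()}


def univariate_kernel(x, y, pmf):
    """Assumed form: 0 on a mismatch, 1 - P(x) on a match (rarer matches score higher)."""
    if x != y:
        return 0.0
    return 1.0 - pmf.get(x, 0.0)


def categorical_kernel(row_a, row_b, pmfs):
    """Mean of the univariate kernels over all variables."""
    parts = [univariate_kernel(a, b, pmf) for a, b, pmf in zip(row_a, row_b, pmfs)]
    return sum(parts) / len(parts)
```

    A typical use would estimate one pmf per column of the training data and pass the resulting Gram matrix to any learner that accepts precomputed kernels.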

  • Effective classification and gene expression profiling for the facioscapulohumeral muscular dystrophy

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio; Silva Colón, Karen Andrea
    PLoS One
    Date of publication: 2013-12-13
    Journal article

    Facioscapulohumeral Muscular Dystrophy (FSHD) is an autosomal dominant neuromuscular disorder whose incidence is estimated at between one in 400,000 and one in 20,000. No effective therapeutic strategies are known to halt progression or reverse muscle weakness and atrophy. It is known that FSHD is caused by modifications located within a D4Z4 repeat array in chromosome 4q, while recent advances have linked these modifications to the DUX4 gene. Unfortunately, the complete mechanisms responsible for the molecular pathogenesis and progressive muscle weakness still remain unknown. Although there are many studies addressing cancer databases from a machine learning perspective, there is no such precedent in the analysis of FSHD. This study aims to fill this gap by analyzing two specific FSHD databases. A feature selection algorithm is used as the main engine to select genes promoting the highest possible classification capacity. The combination of feature selection and classification aims at obtaining simple models (in terms of very low numbers of genes) capable of good generalization, that may be associated with the disease. We show that the reported method is highly efficient in finding genes to discern between healthy cases (not affected by FSHD) and FSHD cases, allowing the discovery of very parsimonious models that yield negligible repeated cross-validation error. These models in turn give rise to very simple decision procedures in the form of a decision tree. Current biological evidence regarding these genes shows that they are linked to skeletal muscle processes concerning specific human conditions. © 2013 Gonzalez-Navarro et al.

  • On the Intelligent Management of Sepsis in the Intensive Care Unit  Open access

     Ribas Ripoll, Vicente Jorge
    Defense's date: 2013-01-29
    Department of Software, Universitat Politècnica de Catalunya
    Theses

    The management of the Intensive Care Unit (ICU) in a hospital has its own, very specific requirements that involve, amongst others, issues of risk-adjusted mortality and average length of stay; nurse turnover and communication with physicians; technical quality of care; the ability to meet the needs of patients' families; and the avoidance of medical error due to rapidly changing circumstances and work overload. In the end, good ICU management should lead to an improvement in patient outcomes. Decision making in the ICU environment is a real-time challenge that works according to very tight guidelines, which relate to often complex and sensitive research ethics issues. Clinicians in this context must act upon as much available information as possible, and could therefore, in general, benefit from at least partially automated computer-based decision support based on qualitative and quantitative information. Those taking executive decisions at ICUs will require methods that are not only reliable, but also, and this is a key issue, readily interpretable. Otherwise, any decision tool, regardless of its sophistication and accuracy, risks being rendered useless. This thesis addresses this need through the design and development of computer-based decision-making tools to assist clinicians at the ICU. It focuses on one of the main problems that they must face: the management of the Sepsis pathology. Sepsis is one of the main causes of death for non-coronary ICU patients. Its mortality rate can reach almost one out of two patients for septic shock, its most acute manifestation. It is a transversal condition affecting people of all ages. Surprisingly, its definition was standardized only two decades ago as a systemic inflammatory response syndrome with confirmed infection. The research reported in this document deals with the problem of Sepsis data analysis in general and, more specifically, with the problem of survival prediction for patients affected with Severe Sepsis.
    The tools at the core of the investigated data analysis procedures stem from the fields of multivariate and algebraic statistics, algebraic geometry, machine learning and computational intelligence. Beyond data analysis itself, the current thesis makes contributions from a clinical point of view, as it provides substantial evidence to the debate about the impact of the preadmission use of statin drugs on the ICU outcome. It also sheds light on the dependence between Septic Shock and Multi Organic Dysfunction Syndrome. Moreover, it defines a latent set of Sepsis descriptors to be used as prognostic factors for the prediction of mortality and achieves an improvement in predictive capability over indicators currently in use.

  • Feature selection for the prediction and visualization of brain tumor types using proton magnetic resonance spectroscopy data

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Date of publication: 2012
    Book chapter

    In cancer diagnosis, classification of the different tumor types is of great importance. An accurate prediction of basic tumor types provides better treatment and may minimize the negative impact of incorrectly targeted toxic or aggressive treatments. Moreover, the correct prediction of cancer types in the brain using non-invasive information (e.g. 1H-MRS data) could spare patients the collateral problems derived from exploration techniques that require surgery. We present a feature selection algorithm that is specially designed to be used in 1H-MRS (Proton Magnetic Resonance Spectroscopy) data of brain tumors. This algorithm takes advantage of the fact that some metabolite levels may consistently present marked differences between specific tumor types. We present detailed experimental results using an international dataset in which highly attractive models are obtained. The models are evaluated according to their accuracy, simplicity and medical interpretability. We also explore the influence of redundancy in the modelling process. Our results suggest that a moderate amount of redundant metabolites can actually enhance class-separability and therefore accuracy.

  • Similarity networks for heterogeneous data  Open access

     Belanche Muñoz, Luis Antonio; Hernández González, Jerónimo
    European Symposium on Artificial Neural Networks
    Presentation's date: 2012-04
    Presentation of work at congresses

    A two-layer neural network is developed in which the neuron model computes a user-defined similarity function between inputs and weights. The neuron model is formed by the composition of an adapted logistic function with the mean of the partial input-weight similarities. The model is capable of dealing directly with variables of potentially different nature (continuous, ordinal, categorical); there is also provision for missing values. The network is trained using a fast two-stage procedure and involves the setting of only one parameter. In our experiments, the network achieves slightly superior performance on a set of challenging problems with respect to both RBF nets and RBF-kernel SVMs.
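
    The neuron model described in this abstract (a logistic-type function applied to the mean of partial input-weight similarities, with provision for missing values) can be sketched as follows. This is an illustrative reading, not the authors' implementation: the partial similarity functions, the rescaled logistic used as the "adapted" logistic, the `steepness` value, and the use of `None` to mark a missing input are all assumptions.

```python
import math


def continuous_similarity(x, w, value_range):
    """Similarity in [0, 1] for numeric values, decreasing linearly with distance."""
    return max(0.0, 1.0 - abs(x - w) / value_range)


def categorical_similarity(x, w):
    """Overlap similarity: 1 on an exact match, 0 otherwise."""
    return 1.0 if x == w else 0.0


def squash(s, steepness=6.0):
    """Logistic curve rescaled so that 0 maps to 0 and 1 maps to 1 (assumed form)."""
    def logistic(t):
        return 1.0 / (1.0 + math.exp(-t))

    low = logistic(-steepness / 2)
    high = logistic(steepness / 2)
    return (logistic(steepness * (s - 0.5)) - low) / (high - low)


def similarity_neuron(inputs, weights, partial_sims):
    """Squashed mean of the partial similarities, skipping missing (None) inputs."""
    sims = [
        sim(x, w)
        for x, w, sim in zip(inputs, weights, partial_sims)
        if x is not None
    ]
    if not sims:
        raise ValueError("all inputs are missing")
    return squash(sum(sims) / len(sims))
```

    Because each weight lives in the same space as its input variable, a trained weight vector can be read as a prototype, and the neuron's output as the similarity of the input to that prototype.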

  • Averaging of kernel functions  Open access

     Belanche Muñoz, Luis Antonio; Tosi, Alessandra
    European Symposium on Artificial Neural Networks
    Presentation's date: 2012-04
    Presentation of work at congresses

    In kernel-based machines, the integration of several kernels to build more flexible learning methods is a promising avenue for research. In particular, in Multiple Kernel Learning a compound kernel is built by learning a kernel that is the weighted mean of several sources. We show in this paper that the only feasible average for kernel learning is precisely the arithmetic average. We also show that three familiar means (the geometric, inverse root mean square and harmonic means) for positive real values actually generate valid kernels.

  • Desarrollo integral de las competencias genéricas mediante mapas competenciales  Open access

     Sanchez Carracedo, Fermin; Ageno Pulido, Alicia; Belanche Muñoz, Luis Antonio; Cabre Garcia, Jose Maria; Cobo Valeri, Erik; Farre Cirera, Rafael; Garcia Almiñana, Jordi; Lopez Alvarez, David; Mares Marti, Pere; Martin Escofet, Carme; Soler Cervera, Antonia
    Jornadas de Enseñanza Universitaria de la Informática
    Presentation's date: 2012-07
    Presentation of work at congresses

    Curricula in the European Higher Education Area (EHEA) must be designed around the competences of the degree, both specific and generic. Spanish universities have extensive experience in working on and assessing specific competences, but generic competences pose a new challenge that must be addressed. This paper proposes a way to work on and assess generic competences globally across a Bachelor's degree. The proposal is being implemented in the Bachelor's Degree in Informatics Engineering at the Facultat d'Informàtica de Barcelona. Instead of establishing several competence levels and assigning each level to different courses, as is usually done for specific competences using Bloom's taxonomy, we propose defining each generic competence in terms of dimensions. Each dimension (an aspect of the competence) is defined in terms of objectives at three levels, and it is the objectives of a given level of each dimension that are assigned to courses. In this way, a single course can work on different dimensions of a generic competence, each at a different level. Different competences may share a subset of dimensions. Avoiding repeated work on these dimensions in different courses when it is not strictly necessary optimizes the work done and helps students acquire the generic competences defined by the degree.

  • Classifying malignant brain tumours from 1H-MRS data using Breadth Ensemble Learning

     Vilamala Muñoz, Albert; Belanche Muñoz, Luis Antonio; Vellido Alcacena, Alfredo
    International Conference on Neural Networks
    Presentation's date: 2012
    Presentation of work at congresses

  • SIGNAL PROCESSING TECHNIQUES FOR BRAIN TUMOUR DIAGNOSIS FROM MAGNETIC RESONANCE SPECTROSCOPY DATA

     Arizmendi Pereira, Carlos Julio
    Defense's date: 2012-02-10
    Department of Software, Universitat Politècnica de Catalunya
    Theses

  • MINERIA EN DATOS BIOLOGICOS Y SOCIALES: ALGORITMOS, TEORIA E IMPLEMENTACION

     Morrill, Glyn Verden; Quattoni, Ariadna Julieta; Arratia Quesada, Argimiro Alejandro; De Balle Pigem, Borja; Arias Vicente, Marta; Casas Fernandez, Bernardino; Bifet Figuerol, Albert Carles; Berral Garcia, Josep Lluis; Lopez Herrera, Josefina; Baixeries i Juvillà, Jaume; Delgado Pin, Jordi; Belanche Muñoz, Luis Antonio; Castro Rabal, Jorge; Lozano Bojados, Antoni; Ferrer Cancho, Ramon; Sierra Santibañez, Maria Josefina; Gavaldà Mestre, Ricard
    Participation in a competitive project

  • Statistical approaches for modeling in microbial source tracking

     Belanche Muñoz, Luis Antonio; Blanch, Anicet R.
    Date of publication: 2011
    Book chapter

    Microbial source tracking (MST) concerns the definition of new indicators and appropriate detection methods, the identification of host-specific indicators of fecal pollution, and ultimately the development of useful and reliable predictive models for practical deployment. Optimal predictive models should be designed using proper statistical and computational tools for the analysis of the available data samples. A further requirement is found in the determination of appropriate sets of predictors (indicators, tracers) for developing accurate and low-cost MST solutions. This chapter briefly reviews some of these modeling tools, and their use and feasibility in providing more accurate MST-based results. It also evaluates the potential of established and new algorithmic methods to the identification of fecal pollution sources.

  • Learning with heterogeneous neural networks

     Belanche Muñoz, Luis Antonio
    Date of publication: 2011
    Book chapter

    This chapter studies a class of neuron models that computes a user-defined similarity function between inputs and weights. The neuron transfer function is formed by composition of an adapted logistic function with the quasi-linear mean of the partial input-weight similarities. The neuron model is capable of dealing directly with mixtures of continuous as well as discrete quantities, among other data types, and there is provision for missing values. An artificial neural network using these neuron models is trained using a breeder genetic algorithm until convergence. A number of experiments are carried out on several real-world problems in very different application domains, described by mixtures of variables of distinct types and possibly containing missing values. This heterogeneous network is compared to a standard radial basis function network and to a multi-layer perceptron network, and is shown to learn with superior generalization ability at a comparable computational cost. A further important advantage of the resulting neural solutions is the great interpretability of the learned weights, which can be read in terms of weighted similarities to prototypes.

  • Parsimonious selection of useful genes in microarray gene expression data  Open access

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Date of publication: 2011
    Book chapter

    Machine learning methods have recently made significant efforts to solve multidisciplinary problems in the field of cancer classification in microarray gene expression data. These tasks are characterized by a large number of features and few observations, making the modeling a nontrivial undertaking. In this study, we apply entropic filter methods for gene selection, in combination with several off-the-shelf classifiers. The introduction of bootstrap resampling techniques permits the achievement of more stable performance estimates. Our findings show that the proposed methodology permits a drastic reduction in dimension, offering attractive solutions in terms of both prediction accuracy and number of explanatory genes; a dimensionality reduction technique preserving discrimination capabilities is used for visualization of the selected genes.
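
    As a rough illustration of what an entropic filter can look like, the sketch below ranks discretized genes by their mutual information with the class label. This is an assumption about the general family of methods, not the specific filter used in the chapter: the discretization of expression levels, the scoring criterion and the function names are all illustrative, and the bootstrap resampling step mentioned in the abstract is omitted.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(feature, labels):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two aligned discrete sequences."""
    joint = list(zip(feature, labels))
    return entropy(feature) + entropy(labels) - entropy(joint)


def rank_features(columns, labels):
    """Feature names sorted from most to least informative about the labels."""
    scores = {name: mutual_information(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

    A filter of this kind scores each gene independently of the classifier, which keeps it cheap when the number of genes is far larger than the number of samples; the price is that interactions between genes are ignored.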

    Postprint (author’s final draft)

  • Things to know about a (dis)similarity measure

     Belanche Muñoz, Luis Antonio; Orozco, Jorge
    International Conference on Knowledge-Based and Intelligent Information and Engineering Systems
    Presentation of work at congresses

  • Predicting software anomalies using machine learning techniques  Open access

     Alonso, Javier; Belanche Muñoz, Luis Antonio; Avresky, Dimiter
    IEEE International Symposium on Network Computing and Applications
    Presentation's date: 2011
    Presentation of work at congresses

    In this paper, we present a detailed evaluation of a set of well-known Machine Learning classifiers against dynamic and non-deterministic software anomalies. The system state prediction is based on monitoring system metrics. This allows proactive software rejuvenation to be triggered automatically. The Random Forest approach achieves validation errors of less than 1% in comparison with the other well-known ML algorithms under evaluation. In order to automatically reduce the number of monitored parameters needed to predict software anomalies, we analyze the Lasso regularization technique jointly with the Machine Learning classifiers to evaluate how the prediction accuracy could be guaranteed within an acceptable threshold. This allows a drastic reduction (around 60% in the best case) in the number of monitored parameters. The framework, based on ML and Lasso regularization techniques, has been validated using an e-commerce environment with an Apache Tomcat server and a MySQL database server.

    Postprint (author’s final draft)

  • Feature selection in proton magnetic resonance spectroscopy data of brain tumors  Open access

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics
    Presentation's date: 2011-07
    Presentation of work at congresses

    In cancer diagnosis, classification of the different tumor types is of great importance. An accurate prediction of different tumor types provides better treatment and may minimize the negative impact of incorrectly targeted toxic or aggressive treatments. Moreover, the correct prediction of cancer types using non-invasive information (e.g. 1H-MRS data) could spare patients the collateral problems derived from exploration techniques that require surgery. A feature selection algorithm specially designed to be used in 1H-MRS (Proton Magnetic Resonance Spectroscopy) data of brain tumors is presented. It takes advantage of a highly distinctive aspect of this data: some metabolite levels are markedly different between types of tumors. Experimental readings on an international dataset show highly competitive models in terms of accuracy, complexity and medical interpretability.

    Postprint (author’s final draft)

  • A software system for the microbial source tracking problem  Open access

     Sànchez Mendoza, David; Belanche Muñoz, Luis Antonio; Blanch, Anicet R.
    Workshop on Applications of Pattern Analysis
    Presentation's date: 2011
    Presentation of work at congresses

    The aim of this paper is to report the achievement of Ichnaea, a fully computer-based prediction system that is able to make fairly accurate predictions for Microbial Source Tracking studies. The system accepts examples showing different concentration levels, uses indicators (variables) with different environmental persistence, and can be applied in different geographical or climatic areas. We describe the inner workings of the system and report on the specific problems and challenges that arise from the machine learning point of view and how they have been addressed.

  • Feature Selection in cancer research: Microarray Gene Expression and in vivo 1H-MRS domains

     González Navarro, Félix Fernando
    Defense's date: 2011-06-03
    Department of Software, Universitat Politècnica de Catalunya
    Theses

  • Proactive Software Rejuvenation solution for web environments on virtualized platforms

     Alonso López, Javier
    Defense's date: 2011-02-21
    Department of Computer Architecture, Universitat Politècnica de Catalunya
    Theses

  • Parsimonious selection of useful genes in microarray gene expression data

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Advances in experimental medicine and biology
    Date of publication: 2011
    Journal article

  • Things to know about a (dis)similarity measure

     Belanche Muñoz, Luis Antonio; Orozco, Jorge
    Lecture notes in artificial intelligence
    Date of publication: 2011
    Journal article

  • Exploiting the accumulated evidence for gene selection in microarray gene expression data  Open access

     Prat Masramon, Gabriel; Belanche Muñoz, Luis Antonio
    European Conference on Artificial Intelligence
    Presentation's date: 2010
    Presentation of work at congresses

    Feature subset selection (FSS) methods play an important role in cancer classification using microarray gene expression data. In this scenario, it is extremely important to select genes by taking into account the possible interactions with other gene subsets. This paper shows that, by accumulating the evidence in favour of (or against) each gene along a search process, the obtained gene subsets may constitute better solutions, either in terms of size or in predictive accuracy, or in both, at a negligible overhead in computational cost.

    Postprint (author’s final draft)

  • Differentiation of glioblastomas and metastases using 1H-MRS spectral data  Open access

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    International Conference on Bioinformatics & Computational Biology
    Presentation's date: 2010
    Presentation of work at congresses

    Hydrogen-1 magnetic resonance spectroscopy (1H-MRS) allows noninvasive in vivo quantification of metabolite concentrations in brain tissue. In this work two of the most aggressive brain tumors are studied with the purpose of differentiating them. The challenge in this task is that their radiological appearance is often similar, even though the treatment of patients suffering from these conditions is quite different. Efforts to differentiate between these two profiles are receiving increasing attention, mainly because of the consequences of an incorrect diagnosis. Due to the high dimensionality of the data, initiatives oriented to reducing the description complexity become important. In this work we present a feature selection algorithm that generates relevant subsets of spectral frequencies. Experimental results deliver models that are simple in terms of the number of frequencies and show good generalization capabilities.

    Postprint (author’s final draft)

  • Molecular Indicators Used in the Development of Predictive Models for Microbial Source Tracking

     Balleste, Elisenda; Bonjoch, Xavier; Belanche Muñoz, Luis Antonio; Blanch, Anicet R.
    Applied and environmental microbiology
    Date of publication: 2010-03
    Journal article

  • Feature and model selection with discriminatory visualization for diagnostic classification of brain tumors

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio; Romero Merino, Enrique; Vellido Alcacena, Alfredo; Julià Sapé, Margarida; Arús, Carles
    Neurocomputing
    Date of publication: 2010-10
    Journal article

  • AIDTUMOUR: HERRAMIENTAS BASADAS EN METODOS DE INTELIGENCIA ARTIFICIAL PARA EL APOYO A LA DECISION EN

     Nebot Castells, Maria Angela; Mugica Alvarez, Francisco José; Belanche Muñoz, Luis Antonio; Vellido Alcacena, Alfredo
    Participation in a competitive project

  • NEURAL NETWORKS FOR VARIATIONAL PROBLEMS IN ENGINEERING.

     López González, Roberto
    Defense's date: 2009-01-16
    Department of Software, Universitat Politècnica de Catalunya
    Theses

  • A kernel extension to handle missing data  Open access

     Nebot Troyano, Guillermo; Belanche Muñoz, Luis Antonio
    SGAI International Conference on Artificial Intelligence
    Presentation's date: 2009
    Presentation of work at congresses

    An extension for univariate kernels that deals with missing values is proposed. These extended kernels are shown to be valid Mercer kernels and can adapt to many types of variables, such as categorical or continuous. The proposed kernels are tested against standard RBF kernels in a variety of benchmark problems showing different amounts of missing values and variable types. Our experimental results are very satisfactory: they usually show slight to substantial improvements over those achieved with standard methods.

    Postprint (author’s final draft)

  • Remainder subset awareness for feature subset selection  Open access

     Prat Masramon, Gabriel; Belanche Muñoz, Luis Antonio
    SGAI International Conference on Artificial Intelligence
    Presentation's date: 2009
    Presentation of work at congresses

    Feature subset selection has become an increasingly common topic of research. This popularity is partly due to the growth in the number of features and application domains. It is of the greatest importance to make the most of every evaluation of the inducer, which is normally the most costly part. In this paper, a technique is proposed that takes into account the inducer evaluation both in the current subset and in the remainder subset (its complementary set) and is applicable to any sequential subset selection algorithm at a reasonable overhead in cost. Its feasibility is demonstrated on a series of benchmark data sets.

    Postprint (author’s final draft)

  • Using machine learning techniques to explore 1H-MRS data of brain tumors  Open access

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Mexican International Conference on Artificial Intelligence
    Presentation's date: 2009
    Presentation of work at congresses

    Machine learning is a powerful paradigm for analyzing Proton Magnetic Resonance Spectroscopy (1H-MRS) spectral data for the classification of brain tumor pathologies. An important characteristic of this task is the high dimensionality of the data sets involved. In this work we apply filter feature selection methods to three types of 1H-MRS spectral data: long echo time, short echo time and an ad hoc combination of both. The experimental findings show that feature selection makes it possible to drastically reduce the dimensionality, while offering very attractive solutions both in terms of prediction accuracy and the ability to interpret the spectral frequencies involved. A linear dimensionality reduction technique that preserves the class discrimination capabilities is additionally used for visualization of the selected frequencies.

    Postprint (author’s final draft)

  • Machine learning methods for classifying normal vs. tumorous tissue with spectral data

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Congreso Internacional de Informática y Computación
    Presentation's date: 2009
    Presentation of work at congresses

  • Using machine learning techniques to explore H-1-MRS data of brain tumors

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Mexican International Conference on Artificial Intelligence
    Presentation's date: 2009-11
    Presentation of work at congresses

  • New multiplatform computer program for numerical identification of microorganisms

     Belanche Muñoz, Luis Antonio; Blanch, Anicet R.; Flores Baquero, Oscar
    Journal of clinical microbiology
    Date of publication: 2009-12
    Journal article

    The classification of bacteria using genomic methods or expensive biochemical-based commercial kits is sometimes beyond the reach of laboratories that need to perform numerous classifications of unknown bacterial strains in a fast, cheap, and reliable way. A new computer program, Identax, is presented for the computer-assisted identification of microorganisms using only results obtained from conventional biochemical tests. Identax improves on current microbial identification software and provides a multiplatform and user-friendly program. It can be executed from any operating system and can be downloaded at no cost from the Identax website (www.identax.org).

  • Outlier exploration and diagnostic classification of a multi-centre 1H-MRS brain tumour database

     Vellido Alcacena, Alfredo; Romero Merino, Enrique; González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio; Julià Sapé, Margarida; Arús, Carles
    Neurocomputing
    Date of publication: 2009
    Journal article

  • Gene subset selection in microarray data using entropic filtering for cancer classification

     González, F; Belanche Muñoz, Luis Antonio
    Expert systems
    Date of publication: 2009-02
    Journal article

  • Feature and model selection in 1H-MRS single voxel spectra for cancer classification

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio
    Date of publication: 2009-01-31
    Book chapter

    Machine learning is a powerful paradigm within which to analyze 1H-MRS spectral data for the classification of tumour pathologies. An important characteristic of this task is the high dimensionality of the data sets involved. In this work we apply specific feature selection methods in order to reduce the complexity of the problem on two types of 1H-MRS spectral data: long-echo and short-echo time, which present considerable differences in the spectrum for the same cases. The experimental findings show that the feature selection methods enhance the classification performance of the models induced by several off-the-shelf classifiers and are able to offer very attractive solutions both in terms of prediction accuracy and number of spectral frequencies involved.

  • Similarity-based heterogeneous neuron models

     Belanche Muñoz, Luis Antonio
    Date of publication: 2009-09-30
    Book chapter

  • Distance-based kernels for real-valued data

     Belanche Muñoz, Luis Antonio; Vázquez, J L; Vázquez, M
    Date of publication: 2009-09-30
    Book chapter

  • Generative manifold learning for the exploration of partially labeled data  Open access

     Cruz Barbosa, Raul
    Defense's date: 2009-10-01
    Department of Software, Universitat Politècnica de Catalunya
    Theses

    In many real-world application problems, the availability of data labels for supervised learning is rather limited. Incompletely labeled datasets are common in many of the databases generated in some of the currently most active areas of research. It is often the case that a limited number of labeled cases is accompanied by a larger number of unlabeled ones. This is the setting for semi-supervised learning, in which unsupervised approaches assist the supervised problem and vice versa. A manifold learning model, namely Generative Topographic Mapping (GTM), is the basis of the methods developed in this thesis. The non-linearity of the mapping that GTM generates makes it prone to trustworthiness and continuity errors that would reduce the faithfulness of the data representation, especially for datasets of convoluted geometry. In this thesis, a variant of GTM that uses a graph approximation to the geodesic metric is first defined. This model is capable of representing data of convoluted geometries. The standard GTM is here modified to prioritize neighbourhood relationships along the generated manifold. This is accomplished by penalizing the possible divergences between the Euclidean distances from the data points to the model prototypes and the corresponding geodesic distances along the manifold. The resulting Geodesic GTM (Geo-GTM) model is shown to improve the continuity and trustworthiness of the representation generated by the model, as well as to behave robustly in the presence of noise. The thesis then leads towards the definition and development of semi-supervised versions of GTM for partially-labeled data exploration. As a first step in this direction, a two-stage clustering procedure that uses class information is presented. A class information-enriched variant of GTM, namely class-GTM, yields a first cluster description of the data. 
    The number of clusters defined by GTM is usually large for visualization purposes and does not necessarily correspond to the overall class structure. Consequently, in a second stage, clusters are agglomerated using the K-means algorithm with different novel initialization strategies that benefit from the probabilistic definition of GTM. We evaluate whether the use of class information influences cluster-wise class separability. A robust variant of GTM that detects outliers while effectively minimizing their negative impact on the clustering process is also assessed in this context. We then proceed to the definition of a novel semi-supervised model, SS-Geo-GTM, that extends Geo-GTM to deal with semi-supervised problems. In SS-Geo-GTM, the model prototypes are linked by the nearest neighbour to the data manifold constructed by Geo-GTM. The resulting proximity graph is used as the basis for a class label propagation algorithm. The performance of SS-Geo-GTM is experimentally assessed, comparing favourably with that of a Euclidean distance-based counterpart and that of the alternative Laplacian Eigenmaps method. Finally, the developed models (the two-stage clustering procedure and the semi-supervised models) are applied to the analysis of a human brain tumour dataset (obtained by Nuclear Magnetic Resonance Spectroscopy), where the tasks are, in turn, data clustering and survival prognostic modeling.

  • An experimental study on methods for the selection of basis functions in regression

     Barrio Moliner, Ignacio; Romero Merino, Enrique; Belanche Muñoz, Luis Antonio
    Neurocomputing
    Date of publication: 2009-08
    Journal article

  • Feature selection in proton magnetic resonance spectroscopy for brain tumor classification

     Belanche Muñoz, Luis Antonio
    European Symposium on Artificial Neural Networks
    Presentation's date: 2008-04-23
    Presentation of work at congresses

  • Machine learning methods for microbial source tracking

     Belanche Muñoz, Luis Antonio; Blanch, A R
    Environmental modelling and software
    Date of publication: 2008-03
    Journal article

  • Effective learning with heterogeneous neural networks

     Belanche Muñoz, Luis Antonio
    Lecture notes in computer science
    Date of publication: 2008-01
    Journal article

  • On the design of metric relations

     Belanche Muñoz, Luis Antonio; Orozco Luquero, Jorge
    Journal of convergence information technology
    Date of publication: 2008-01
    Journal article

  • Feature selection in in vivo 1H-MRS single voxel spectra

     González, F; Belanche Muñoz, Luis Antonio
    Lecture notes in computer science
    Date of publication: 2008-01
    Journal article

  • Modeling heterogeneous data sets with neural networks

     Belanche Muñoz, Luis Antonio
    Date of publication: 2008-01-31
    Book chapter
