Scientific and technological production

1 to 50 of 171 results
  • Exploration of Customer Churn Routes Using Machine Learning Probabilistic Models  Open access

     García Gómez, David
    Defense's date: 2014-04-10
    Department of Computer Science, Universitat Politècnica de Catalunya
    Theses

    The ongoing processes of globalization and deregulation are changing the competitive framework in the majority of economic sectors. The appearance of new competitors and technologies entails a sharp increase in competition and a growing preoccupation among service providing companies with creating stronger bonds with customers. Many of these companies are shifting resources away from the goal of capturing new customers and are instead focusing on retaining existing ones. In this context, anticipating the customer's intention to abandon, a phenomenon also known as churn, and facilitating the launch of retention-focused actions represent clear elements of competitive advantage. Data mining, as applied to market survey information, can provide assistance to churn management processes. In this thesis, we mine real market data for churn analysis, placing a strong emphasis on the applicability and interpretability of the results. Statistical Machine Learning models for simultaneous data clustering and visualization lay the foundations for the analyses, which yield an interpretable segmentation of the surveyed markets. To achieve interpretability, much attention is paid to the intuitive visualization of the experimental results. Given that the modelling techniques under consideration are nonlinear in nature, this represents a non-trivial challenge. Newly developed techniques for data visualization in nonlinear latent models are presented. They are inspired by geographical representation methods and suited to both static and dynamic data representation.

  • Automated classification of brain tumours from short echo time in vivo MRS data using Gaussian decomposition and Bayesian neural networks

     Arizmendi Pereira, Carlos Julio; Sierra Bueno, Daniel Alfonso; Vellido Alcacena, Alfredo; Romero Merino, Enrique
    Expert systems with applications
    Date of publication: 2014-09-01
    Journal article

    Neuro-oncologists must ultimately rely on their acquired knowledge and accumulated experience to undertake the sensitive task of brain tumour diagnosis. This task strongly depends on indirect, non-invasive measurements, which are the source of valuable data in the form of signals and images. Expert radiologists should benefit from their use as part of an at least partially automated computer-based medical decision support system. This paper focuses on Magnetic Resonance Spectroscopy signal analysis and illustrates a method that combines Gaussian Decomposition, dimensionality reduction by Moving Window with Variance Analysis and classification using adaptively regularized Artificial Neural Networks. The method yields encouraging results in the task of binary classification of human brain tumours, even for tumour types that have seldom been analyzed from this viewpoint. © 2014 Elsevier Ltd. All rights reserved.

  • Sepsis mortality prediction with the Quotient Basis Kernel

     Ribas Ripoll, Vicent; Vellido Alcacena, Alfredo; Romero Merino, Enrique; Ruiz Rodriguez, Juan Carlos
    Artificial intelligence in medicine
    Date of publication: 2014-05-01
    Journal article

    Objective: This paper presents an algorithm to assess the risk of death in patients with sepsis. Sepsis is a common clinical syndrome in the intensive care unit (ICU) that can lead to severe sepsis, a severe state of septic shock or multi-organ failure. The proposed algorithm may be implemented as part of a clinical decision support system that can be used in combination with the scores deployed in the ICU to improve the accuracy, sensitivity and specificity of mortality prediction for patients with sepsis. Methodology: In this paper, we used the Simplified Acute Physiology Score (SAPS) for ICU patients and the Sequential Organ Failure Assessment (SOFA) to build our kernels and algorithms. In the proposed method, we embed the available data in a suitable feature space and use algorithms based on linear algebra, geometry and statistics for inference. We present a simplified version of the Fisher kernel (practical Fisher kernel for multinomial distributions), as well as a novel kernel that we named the Quotient Basis Kernel (QBK). These kernels are used as the basis for mortality prediction using soft-margin support vector machines. The two new kernels presented are compared against other generative kernels based on the Jensen-Shannon metric (centred, exponential and inverse) and other widely used kernels (linear, polynomial and Gaussian). Clinical relevance is also evaluated by comparing these results with logistic regression and the standard clinical prediction method based on the initial SAPS score. Results: As described in this paper, we tested the new methods via cross-validation with a cohort of 400 test patients. The results obtained using our methods compare favourably with those obtained using alternative kernels (80.18% accuracy for the QBK) and the standard clinical prediction method, which are based on the basal SAPS score or logistic regression (71.32% and 71.55%, respectively). 
The QBK presented a sensitivity and specificity of 79.34% and 83.24%, which outperformed the other kernels analysed, logistic regression and the standard clinical prediction method based on the basal SAPS score. Conclusion: Several scoring systems for patients with sepsis have been introduced and developed over the last 30 years. They allow for the assessment of the severity of disease and provide an estimate of in-hospital mortality. Physiology-based scoring systems are applied to critically ill patients and have a number of advantages over diagnosis-based systems. Severity score systems are often used to stratify critically ill patients for possible inclusion in clinical trials. In this paper, we present an effective algorithm that combines both scoring methodologies for the assessment of death in patients with sepsis that can be used to improve the sensitivity and specificity of the currently available methods.
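    The accuracy, sensitivity and specificity figures quoted in the abstract above are standard confusion-matrix ratios. As a minimal sketch of how such figures are computed, assuming binary labels where 1 marks death and 0 marks survival (the function name and the toy labels are illustrative and are not taken from the paper):

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Toy labels, purely hypothetical (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

    In a cross-validation setting such as the one the abstract describes, these ratios would typically be computed per fold on held-out patients and then averaged; that averaging step is an assumption here, not a detail stated in the abstract.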

  • Cartogram visualization for nonlinear manifold learning models

     Vellido Alcacena, Alfredo; Garcia Cortes, David; Nebot Castells, Maria Angela
    Data mining and knowledge discovery
    Date of publication: 2013-07-01
    Journal article

    Real-world applications of multivariate data analysis often stumble upon the barrier of interpretability. Simple data analysis methods are usually easy to interpret, but they risk providing poor data models. More involved methods may instead yield faithful data models, but limited interpretability. This is the case of linear and nonlinear methods for multivariate data visualization through dimensionality reduction. Even though the latter have provided some of the most exciting visualization developments, their practicality is hindered by the difficulty of explaining them in an intuitive manner. The interpretability, and therefore the practical applicability, of data visualization through nonlinear dimensionality reduction (NLDR) methods would improve if, first, we could accurately calculate the distortion introduced by these methods in the visual representation and, second, if we could faithfully reintroduce this distortion into such representation. In this paper, we describe a technique for the reintroduction of the distortion into the visualization space of NLDR models. It is based on the concept of density-equalizing maps, or cartograms, recently developed for the representation of geographic information. We illustrate it using Generative Topographic Mapping (GTM), a nonlinear manifold learning method that can provide both multivariate data visualization and a measure of the local distortion that the model generates. Although illustrated here with GTM, it could easily be extended to other NLDR visualization methods, provided a local distortion measure could be calculated. It could also serve as a guiding tool for interactive data visualization.

  • On the Intelligent Management of Sepsis in the Intensive Care Unit  Open access

     Ribas Ripoll, Vicente Jorge
    Defense's date: 2013-01-29
    Department of Computer Science, Universitat Politècnica de Catalunya
    Theses

    The management of the Intensive Care Unit (ICU) in a hospital has its own, very specific requirements that involve, amongst others, issues of risk-adjusted mortality and average length of stay; nurse turnover and communication with physicians; technical quality of care; the ability to meet the needs of patients' families; and the avoidance of medical error due to rapidly changing circumstances and work overload. In the end, good ICU management should lead to an improvement in patient outcomes. Decision making in the ICU environment is a real-time challenge that works according to very tight guidelines, which relate to often complex and sensitive research ethics issues. Clinicians in this context must act upon as much available information as possible, and could therefore, in general, benefit from at least partially automated computer-based decision support based on qualitative and quantitative information. Those taking executive decisions at ICUs will require methods that are not only reliable, but also, and this is a key issue, readily interpretable. Otherwise, any decision tool, regardless of its sophistication and accuracy, risks being rendered useless. This thesis addresses this through the design and development of computer-based decision making tools to assist clinicians at the ICU. It focuses on one of the main problems that they must face: the management of the Sepsis pathology. Sepsis is one of the main causes of death for non-coronary ICU patients. Its mortality rate can reach almost one in two patients for septic shock, its most acute manifestation. It is a transversal condition affecting people of all ages. Surprisingly, its definition was only standardized two decades ago as a systemic inflammatory response syndrome with confirmed infection. The research reported in this document deals with the problem of Sepsis data analysis in general and, more specifically, with the problem of survival prediction for patients affected with Severe Sepsis. The tools at the core of the investigated data analysis procedures stem from the fields of multivariate and algebraic statistics, algebraic geometry, machine learning and computational intelligence. Beyond data analysis itself, the current thesis makes contributions from a clinical point of view, as it provides substantial evidence to the debate about the impact of the preadmission use of statin drugs on ICU outcome. It also sheds light on the dependence between Septic Shock and Multi Organic Dysfunction Syndrome. Moreover, it defines a latent set of Sepsis descriptors to be used as prognostic factors for the prediction of mortality and achieves an improvement in predictive capability over indicators currently in use.

  • A novel semi-supervised methodology for extracting tumor type-specific MRS sources in human brain data

     Ortega Martorell, Sandra; Ruiz Ruiz, Hector Efrain; Vellido Alcacena, Alfredo; Olier, Ivan; Romero Merino, Enrique; Julia Sape, Margarida; Martin, Jose D.; Jarman, Ian H.; Arus, Carles; Lisboa, Paulo J G
    PLoS one
    Date of publication: 2013-12-23
    Journal article

    Background: The clinical investigation of human brain tumors often starts with a non-invasive imaging study, providing information about the tumor extent and location, but little insight into the biochemistry of the analyzed tissue. Magnetic Resonance Spectroscopy can complement imaging by supplying a metabolic fingerprint of the tissue. This study analyzes single-voxel magnetic resonance spectra, which represent signal information in the frequency domain. Given that a single voxel may contain a heterogeneous mix of tissues, signal source identification is a relevant challenge for the problem of tumor type classification from the spectroscopic signal. Methodology/Principal Findings: Non-negative matrix factorization techniques have recently shown their potential for the identification of meaningful sources from brain tissue spectroscopy data. In this study, we use a convex variant of these methods that is capable of handling negatively-valued data and generating sources that can be interpreted as tumor class prototypes. A novel approach to convex non-negative matrix factorization is proposed, in which prior knowledge about class information is utilized in model optimization. Class-specific information is integrated into this semi-supervised process by setting the metric of a latent variable space where the matrix factorization is carried out. The reported experimental study comprises 196 cases from different tumor types drawn from two international, multi-center databases. The results indicate that the proposed approach outperforms a purely unsupervised process by achieving near perfect correlation of the extracted sources with the mean spectra of the tumor types. It also improves tissue type classification...

  • Discriminant convex non-negative matrix factorization for the classification of human brain tumours

     Vilamala Muñoz, Albert; Lisboa, Paulo J G; Ortega Martorell, Sandra; Vellido Alcacena, Alfredo
    Pattern recognition letters
    Date of publication: 2013-10-15
    Journal article

    The medical analysis of human brain tumours commonly relies on indirect measurements. Among these, magnetic resonance imaging (MRI) and spectroscopy (MRS) predominate in clinical settings as tools for diagnostic assistance. Pattern recognition (PR) methods have successfully been used in this task, usually interpreting diagnosis as a supervised classification problem. In MRS, the acquired spectral signal can be analyzed in an unsupervised manner to extract its constituent sources. Recently, this has been successfully accomplished using Non-negative Matrix Factorization (NMF) methods. In this paper, we present a method to introduce the available class information into the unsupervised source extraction process of a convex variant of NMF. Novel techniques to generate diagnostic predictions for new, unseen spectra using the proposed Discriminant Convex-NMF are also described and experimentally assessed. © 2013 Elsevier B.V. All rights reserved.

  • Cartogram-based data visualization using the growing hierarchical SOM

     Martin Prat, Angela; Vellido Alcacena, Alfredo
    Congrés Internacional de l'Associació Catalana d'Intel·ligència Artificial
    Presentation's date: 2013-10
    Presentation of work at congresses

    Model interpretability is a problem of knowledge extraction from the patterns found in data to which data visualization can contribute. Nonlinear dimensionality reduction techniques provide flexible visual insight, but their locally varying representation distortion makes interpretation far from intuitive. In this paper, we apply a cartogram method, based on techniques of geographic representation, to data visualization. It allows reintroducing this distortion, measured as a U-matrix, in the visual maps of the Growing Hierarchical Self Organising Map (GHSOM).

  • Telecommunications customers churn monitoring using flow maps and cartogram visualization

     Garcia Cortes, David; Nebot Castells, Maria Angela; Vellido Alcacena, Alfredo
    International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications
    Presentation's date: 2013-02
    Presentation of work at congresses

    Telecommunication companies compete in increasingly aggressive markets. Avoiding customer defection, or churn, should be at the core of successful management in such a context. These companies store and manage abundant customer usage data. Their analysis using advanced techniques can be a source of valuable insight into customers' behavior over time. Exploratory data visualization can help in this task. Many important contributions to multivariate data visualization using nonlinear techniques have recently been made. In this paper, we analyze a database of customer landline telephone usage in Brazil. These data are first visualized using a nonlinear manifold learning model, Generative Topographic Mapping (GTM). This visualization is enhanced using a cartogram technique, inspired by geographical representation methods, that reintroduces the local nonlinear distortion into the representation space. Yet another geographical information visualization technique, namely flow maps, is then used to visualize customer migrations over time periods in the GTM data representation space. The experimental results shown in this paper provide evidence that these methods can assist experts in the process of useful knowledge extraction, with an impact on customer retention management strategies.

  • Robust cartogram visualization of outliers in manifold learning  Open access

     Tosi, Alessandra; Vellido Alcacena, Alfredo
    European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    Presentation's date: 2013-04-24
    Presentation of work at congresses

    Most real data sets contain atypical observations, often referred to as outliers. Their presence may have a negative impact on data modeling using machine learning. This is particularly the case in data density estimation approaches. Manifold learning techniques provide low-dimensional data representations, often oriented towards visualization. The visualization provided by density estimation manifold learning methods can be compromised by the presence of outliers. Recently, a cartogram-based representation of model-generated distortion was presented for nonlinear dimensionality reduction. Here, we investigate the impact of outliers on this visualization when using manifold learning techniques that behave robustly in their presence.

  • Advances in semi-supervised alignment-free classification of G protein-coupled receptors  Open access

     Vellido Alcacena, Alfredo; Cruz Barbosa, Raúl; Giraldo, Jesús
    International Work-Conference on Bioinformatics and Biomedical Engineering
    Presentation's date: 2013-03
    Presentation of work at congresses

    G Protein-coupled receptors (GPCRs) are integral cell membrane proteins of great relevance for pharmacology due to their role in transducing extracellular signals. The 3-D structure is unknown for most of them, and the investigation of their structure-function relationships usually relies on the construction of 3-D receptor models from amino acid sequence alignment onto those receptors of known structure. Sequence alignment risks the loss of relevant information. Different approaches have attempted the analysis of alignment-free sequences on the basis of amino acid physicochemical properties. In this paper, we use the Auto-Cross Covariance method and compare it to an amino acid composition representation. Novel semi-supervised manifold learning methods are then used to classify the several members of class C GPCRs on the basis of the transformed data. This approach is relevant because protein sequences are not always labeled and methods that provide robust classification for a limited amount of labels are required.
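    The amino acid composition representation mentioned in this abstract is the simpler of the two alignment-free transformations: each sequence becomes a fixed-length vector of residue frequencies, regardless of sequence length. A minimal sketch follows, assuming the 20 standard one-letter residue codes and ignoring any other symbol; the function name and the example sequence are hypothetical, and the paper's own preprocessing is not described here.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes, in a fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Map a protein sequence to a 20-dimensional vector of residue frequencies."""
    counts = Counter(sequence.upper())
    # Only standard residues contribute to the normalising total.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


# Hypothetical 10-residue fragment, used only to illustrate the output shape.
vector = amino_acid_composition("MKTAYIAKQR")
```

    Because every sequence maps to the same 20 dimensions, the resulting vectors can be fed directly to a classifier without any alignment step, at the cost of discarding residue order.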

  • Visualizing pay-per-view television customers churn using cartograms and flow maps

     García Gómez, David; Nebot Castells, Maria Angela; Vellido Alcacena, Alfredo
    European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    Presentation of work at congresses

    Media companies aggressively compete for their share of the pay-per-view television market. Such a share can only be kept or improved by avoiding customer defection, or churn. The analysis of customers' data should provide insight into customers' behavior over time and help prevent churn. Data visualization can be part of this analysis. Here, a database of pay-per-view television customers is visualized using a nonlinear manifold learning model. This visualization is enhanced through, first, the reintroduction of the local nonlinear distortion using a cartogram technique and, second, the visualization of customer migrations using flow maps. Both techniques are inspired by geographical representation.

  • SVM-based classification of class C GPCRs from alignment-free physicochemical transformations of their sequences

     König, Caroline; Cruz Barbosa, Raúl; Alquezar Mancho, Renato; Vellido Alcacena, Alfredo
    International Conference on Image Analysis and Processing
    Presentation's date: 2013-09-09
    Presentation of work at congresses

    G protein-coupled receptors (GPCRs) play a key role in regulating cell function due to their ability to transmit extracellular signals. Given that the 3D structure and the functionality of most GPCRs are unknown, there is a need to construct robust classification models based on the analysis of their amino acid sequences for protein homology detection. In this paper, we describe the supervised classification of the different subtypes of class C GPCRs using support vector machines (SVMs). These models are built on different transformations of the amino acid sequences based on their physicochemical properties. Previous research using semi-supervised methods on the same data has shown the usefulness of such transformations. The obtained classification models show robust performance, with a Matthews correlation coefficient close to 0.91 and a prediction accuracy close to 0.93. © 2013 Springer-Verlag.
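
    As a reading aid for the two figures of merit quoted in this abstract, the following is a minimal sketch of how the Matthews correlation coefficient and accuracy are computed from confusion-matrix counts. It assumes the binary case for simplicity (the paper addresses several receptor subtypes), and the function names and counts are hypothetical illustrations, not values or code from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, for illustration only (not taken from the paper).
tp, tn, fp, fn = 45, 48, 2, 5
print("MCC:", round(matthews_corrcoef(tp, tn, fp, fn), 3))
print("Accuracy:", round(accuracy(tp, tn, fp, fn), 3))
```

    Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when class sizes differ.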

    Postprint (author’s final draft)

  • A Quotient Basis Kernel for the prediction of mortality in severe sepsis patients  Open access

     Ribas Ripoll, Vicent; Romero Merino, Enrique; Ruiz Rodriguez, Juan Carlos; Vellido Alcacena, Alfredo
    European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    Presentation's date: 2013-04-24
    Presentation of work at congresses

    In this paper, we describe a novel kernel for multinomial distributions, namely the Quotient Basis Kernel (QBK), which is based on a suitable reparametrization of the input space through algebraic geometry and statistics. The QBK is used here for data transformation prior to classification in a medical problem concerning the prediction of mortality in patients suffering severe sepsis. This is a common clinical syndrome, often treated at the Intensive Care Unit (ICU) in a time-critical context. Mortality prediction results with Support Vector Machines using QBK compare favorably with those obtained using alternative kernels and standard clinical procedures.

  • Convex non-negative matrix factorization for brain tumor delimitation from MRSI data

     Ortega Martorell, Sandra; Lisboa, Paulo J.G.; Vellido Alcacena, Alfredo; Simoes, Rui V.; Pumarola, Martí; Julià Sapé, Margarida; Arús, Carles
    PLoS one
    Date of publication: 2012-10-23
    Journal article

  • Robust discrimination of glioblastomas from metastatic brain tumors on the basis of single-voxel 1H MRS

     Vellido Alcacena, Alfredo; Romero Merino, Enrique; Julià Sapé, Margarida; Majós, C.; Moreno Torres, À.; Pujol, Jesus; Arús, Carles
    NMR in biomedicine
    Date of publication: 2012-06
    Journal article

  • Classification of human brain tumours from MRS data using Discrete Wavelet Transform and Bayesian Neural Networks

     Arizmendi Pereira, Carlos Julio; Vellido Alcacena, Alfredo; Romero Merino, Enrique
    Expert systems with applications
    Date of publication: 2012-04
    Journal article

  • Computational intelligence methods for bioinformatics and biostatistics

    Date of publication: 2012
    Book

    In cancer diagnosis, classification of the different tumor types is of great importance. An accurate prediction of basic tumor types enables better treatment and may minimize the negative impact of incorrectly targeted toxic or aggressive treatments. Moreover, the correct prediction of cancer types in the brain using non-invasive information (e.g. 1H-MRS data) could spare patients the collateral problems derived from exploration techniques that require surgery. We present a feature selection algorithm that is specifically designed for 1H-MRS (Proton Magnetic Resonance Spectroscopy) data of brain tumors. This algorithm takes advantage of the fact that some metabolic levels may consistently present notable differences between specific tumor types. We present detailed experimental results using an international dataset in which highly attractive models are obtained. The models are evaluated according to their accuracy, simplicity and medical interpretability. We also explore the influence of redundancy in the modelling process. Our results suggest that a moderate amount of redundant metabolites can actually enhance class-separability and therefore accuracy.

  • SIGNAL PROCESSING TECHNIQUES FOR BRAIN TUMOUR DIAGNOSIS FROM MAGNETIC RESONANCE SPECTROSCOPY DATA

     Arizmendi Pereira, Carlos Julio
    Defense's date: 2012-02-10
    Department of Computer Science, Universitat Politècnica de Catalunya
    Theses

  • Discovering hidden pathways in bioinformatics

     Lisboa, Paulo J.G.; Jarman, Ian H.; Etchells, Terence A.; Chambers, Simon J.; Bacciu, Davide; Whittaker, Joe; Garibaldi, Jon M.; Ortega Martorell, Sandra; Vellido Alcacena, Alfredo; Ellis, Ian O.
    Date of publication: 2012
    Book chapter

  • Intelligent management of sepsis in the intensive care unit

     Ribas Ripoll, Vicent; Ruiz Rodriguez, Juan Carlos; Vellido Alcacena, Alfredo
    Date of publication: 2012-06
    Book chapter

  • Non-negative matrix factorisation methods for the spectral decomposition of MRS data from human brain tumours

     Ortega Martorell, Sandra; Lisboa, Paulo J.G.; Vellido Alcacena, Alfredo; Julià Sapé, Margarida; Arús, Carles
    BMC bioinformatics
    Date of publication: 2012-03-08
    Journal article

  • Complementing kernel-based visualization of protein sequences with their phylogenetic tree

     Cárdenas, Martha Ivón; Vellido Alcacena, Alfredo; Olier, Iván; Rovira, Xavier; Giraldo, Jesús
    Date of publication: 2012
    Book chapter

  • Kernel generative topographic mapping of protein sequences

     Cárdenas, Martha Ivón; Vellido Alcacena, Alfredo; Olier Caparroso, Ivan Alberto; Rovira, Xavier; Giraldo, Jesús
    Date of publication: 2012-06
    Book chapter

  • On the use of graphical models to study ICU outcome prediction in septic patients treated with statins

     Ribas Ripoll, Vicent; Caballero López, Jesús; Sáez de Tejada, Anna; Ruiz Rodriguez, Juan Carlos; Ruiz Sanmartin, Adolfo; Rello, Jordi; Vellido Alcacena, Alfredo
    Date of publication: 2012
    Book chapter

  • Preprocessing MRS information for classification of human brain tumours

     Arizmendi Pereira, Carlos Julio; Vellido Alcacena, Alfredo; Romero Merino, Enrique
    Date of publication: 2012-06
    Book chapter

  • Towards interpretable classifiers with blind signal separation

     Ruiz, Hector; Ortega Martorell, Sandra; Jarman, Ian H.; Vellido Alcacena, Alfredo; Martin Guerrero, Jose D.; Romero Merino, Enrique; Lisboa, Paulo J.G.
    International Conference on Neural Networks
    Presentation's date: 2012
    Presentation of work at congresses

  • Making machine learning models interpretable

     Vellido Alcacena, Alfredo; Martin Guerrero, Jose D.; Lisboa, Paulo J.G.
    European Symposium on Artificial Neural Networks
    Presentation's date: 2012-04
    Presentation of work at congresses

  • Unsupervised tumour area delimitation in glioblastoma multiforme using non-negative matrix factorisation of MRSI grids

     Ortega Martorell, Sandra; Lisboa, Paulo J.G.; Vellido Alcacena, Alfredo; Simoes, Rui V.; Pumarola, Martí; Julià Sapé, Margarida; Arús, C.
    Annual Scientific Meeting of the European Society for Magnetic Resonance in Medicine and Biology
    Presentation's date: 2012-10-06
    Presentation of work at congresses

  • Cartogram representation of the batch-SOM magnification factor

     Tosi, Alessandra; Vellido Alcacena, Alfredo
    European Symposium on Artificial Neural Networks
    Presentation's date: 2012-04
    Presentation of work at congresses

    Model interpretability is a problem of knowledge extraction from the patterns found in raw data. One key source of knowledge is information visualization, which can help us to gain insights into a problem through graphical representations and metaphors. Nonlinear dimensionality reduction techniques can provide flexible visual insight, but the locally varying representation distortion they produce makes interpretation far from intuitive. In this paper, we define a cartogram method, based on techniques of geographic representation, that allows this distortion, measured as a magnification factor, to be reintroduced in the visual maps of the batch-SOM model. It does so while preserving the topological continuity of the representation.

  • Classifying malignant brain tumours from 1H-MRS data using Breadth Ensemble Learning

     Vilamala Muñoz, Albert; Belanche Muñoz, Luis Antonio; Vellido Alcacena, Alfredo
    International Conference on Neural Networks
    Presentation's date: 2012
    Presentation of work at congresses

  • Severe sepsis mortality prediction with logistic regression over latent factors

     Ribas Ripoll, Vicent; Vellido Alcacena, Alfredo; Ruiz Rodriguez, Juan Carlos; Rello Condomines, Jordi
    Expert systems with applications
    Date of publication: 2011-02-01
    Journal article

  • Semi-supervised analysis of human brain tumours from partially labeled MRS information, using manifold learning models

     Cruz Barbosa, Raul; Vellido Alcacena, Alfredo
    International journal of neural systems
    Date of publication: 2011-02
    Journal article

  • A variational Bayesian approach for the robust analysis of the cortical silent period from EMG recordings of brain stroke patients

     Olier, Iván; Amengual, Julià; Vellido Alcacena, Alfredo
    Neurocomputing
    Date of publication: 2011-04
    Journal article

  • Binary classification of brain tumours using a discrete wavelet transform and energy criteria

     Arizmendi Pereira, Carlos Julio; Vellido Alcacena, Alfredo; Romero Merino, Enrique
    Latin American Symposium on Circuits and Systems
    Presentation's date: 2011-02
    Presentation of work at congresses

  • Brain tumour classification using Gaussian decomposition and neural networks

     Arizmendi Pereira, Carlos Julio; Sierra Bueno, Daniel Alfonso; Vellido Alcacena, Alfredo; Romero Merino, Enrique
    IEEE Engineering in Medicine and Biology Society
    Presentation's date: 2011-08-30
    Presentation of work at congresses

  • A probabilistic approach to the visual exploration of G protein-coupled receptor sequences  Open access

     Vellido Alcacena, Alfredo; Cárdenas, Martha Ivón; Olier Caparroso, Ivan Alberto; Rovira, Xavier; Giraldo, Jesús
    European Symposium on Artificial Neural Networks
    Presentation's date: 2011
    Presentation of work at congresses

    The study of G protein-coupled receptors (GPCRs) is of great interest in pharmaceutical research, but only a few of their 3D structures are known at present. In contrast, their amino acid sequences are known and accessible. Sequence analysis can provide new insight into GPCR function. Here, we use a kernel-based statistical machine learning model for the visual exploration of GPCR functional groups from their sequences. This is based on the rich information provided by the model regarding the probability of each sequence belonging to a certain receptor group.

  • Spectral decomposition methods for the analysis of MRS information from human brain tumors

     Ortega Martorell, Sandra; Vellido Alcacena, Alfredo; Lisboa, Paulo J.G.; Julià Sapé, Margarida; Arús, Carles
    International Conference on Neural Networks
    Presentation's date: 2011
    Presentation of work at congresses

  • Seeing is believing: the importance of visualization in real-world machine learning applications  Open access

     Vellido Alcacena, Alfredo; Martín, José David; Rossi, Fabrice; Lisboa, Paulo J.G.
    European Symposium on Artificial Neural Networks
    Presentation's date: 2011
    Presentation of work at congresses

    The increasing availability of data sets with a huge amount of information, coded in many different features, justifies the research on new methods of knowledge extraction: the great challenge is the translation of the raw data into useful information that can be used to improve decision-making processes, detect relevant profiles, find relationships among features, etc. It is undoubtedly true that a picture is worth a thousand words, which makes visualization methods likely the most appealing, and among the most relevant, kinds of knowledge extraction methods. At ESANN 2011, the special session "Seeing is believing: The importance of visualization in real-world machine learning applications" reflects some of the main emerging topics in the field. This tutorial prefaces the session, summarizing some of its contributions, while also providing some clues to the current state and the near future of visualization methods within the framework of Machine Learning.

  • Comparative diagnostic accuracy of linear and nonlinear feature extraction methods in a neuro-oncology problem

     Cruz Barbosa, Raul; Bautista Villavicencio, David; Vellido Alcacena, Alfredo
    Mexican Conference on Pattern Recognition
    Presentation's date: 2011
    Presentation of work at congresses

    The diagnostic classification of human brain tumours on the basis of magnetic resonance spectra is a non-trivial problem in which dimensionality reduction is almost mandatory. This may take the form of feature selection or feature extraction. In feature extraction using manifold learning models, multivariate data are described through a low-dimensional manifold embedded in data space. Similarities between points along this manifold are best expressed as geodesic distances or their approximations. These approximations can be computationally intensive, and several alternative software implementations have been recently compared in terms of computation times. The current brief paper extends this research to investigate the comparative ability of dimensionality-reduced data descriptions to accurately classify several types of human brain tumours. The results suggest that the way in which the underlying data manifold is constructed in nonlinear dimensionality reduction methods strongly influences the classification results.

  • On the use of decision trees for ICU outcome prediction in sepsis patients treated with statins

     Ribas Ripoll, Vicent; Caballero López, Jesús; Ruiz Rodriguez, Juan Carlos; Ruiz Sanmartin, Adolfo; Rello, Jordi; Vellido Alcacena, Alfredo
    IEEE Symposium on Computational Intelligence and Data Mining
    Presentation's date: 2011
    Presentation of work at congresses

  • On the computation of the geodesic distance with an application to dimensionality reduction in a neuro-oncology problem

     Cruz Barbosa, Raul; Bautista Villavicencio, David; Vellido Alcacena, Alfredo
    Iberoamerican Conference on Pattern Recognition
    Presentation's date: 2011
    Presentation of work at congresses

    Manifold learning models attempt to parsimoniously describe multivariate data through a low-dimensional manifold embedded in data space. Similarities between points along this manifold are often expressed as Euclidean distances. Previous research has shown that these similarities are better expressed as geodesic distances. Some problems concerning the computation of geodesic distances along the manifold have to do with time and storage restrictions related to the graph representation of the manifold. This paper provides different approaches to the computation of the geodesic distance and the implementation of Dijkstra’s shortest path algorithm, comparing their performances. The optimized procedures are bundled into a software module that is embedded in a dimensionality reduction method, which is applied to MRS data from human brain tumours. The experimental results show that the proposed implementation explains a high proportion of the data variance with a very small number of extracted features, which should ease the medical interpretation of subsequent results obtained from the reduced datasets.
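
    The idea of approximating geodesic distances by shortest paths on a graph can be sketched as follows. This is a minimal illustration, assuming a k-nearest-neighbour graph with Euclidean edge weights and a binary-heap Dijkstra; the function names, the choice of k and the toy data are hypothetical, and this is not the optimized implementation described in the paper.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def dijkstra(graph, source):
    """Shortest-path length from source to every node (inf if unreachable)."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy data: 21 points sampled along a unit semicircle (a 1-D manifold in 2-D).
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
graph = knn_graph(points, k=2)
geodesic = dijkstra(graph, 0)[20]
euclidean = math.dist(points[0], points[20])
print("Euclidean:", euclidean, "graph geodesic:", geodesic)
```

    On this toy example the Euclidean distance between the two endpoints cuts straight across the semicircle, while the graph-based estimate follows the curve and approaches its arc length. Storing the graph as an adjacency dictionary rather than a dense matrix is one simple way to limit the storage cost mentioned in the abstract.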

  • Brain tumor pathological area delimitation through non-negative matrix factorization

     Ortega Martorell, Sandra; Lisboa, Paulo J.G.; Vellido Alcacena, Alfredo; Simoes, Rui V.; Julià Sapé, Margarida; Arús, Carles
    IEEE International Conference On Data Mining
    Presentation's date: 2011
    Presentation of work at congresses

  • Severe sepsis mortality prediction with relevance vector machines

     Ribas Ripoll, Vicent; Ruiz Rodriguez, Juan Carlos; Wojdel, Anna; Caballero Lopez, Jesus; Rello Condomines, Jordi; Vellido Alcacena, Alfredo
    IEEE Engineering in Medicine and Biology Society
    Presentation's date: 2011
    Presentation of work at congresses

  • Feature and model selection with discriminatory visualization for diagnostic classification of brain tumors

     González Navarro, Félix Fernando; Belanche Muñoz, Luis Antonio; Romero Merino, Enrique; Vellido Alcacena, Alfredo; Julià Sapé, Margarida; Arús, Carles
    Neurocomputing
    Date of publication: 2010-10
    Journal article

  • Data mining in cancer research  Open access

     Lisboa, Paulo J.G.; Vellido Alcacena, Alfredo; Tagliaferri, Roberto; Napolitano, Francesco; Ceccarelli, Michelle; Martin Guerrero, Jose D.; Biganzoli, Elia
    IEEE computational intelligence magazine
    Date of publication: 2010-02
    Journal article

    This article is not intended as a comprehensive survey of data mining applications in cancer. Rather, it provides starting points for further, more targeted, literature searches, by embarking on a guided tour of computational intelligence applications in cancer medicine, structured in increasing order of the physical scales of biological processes.

  • Semi-supervised geodesic Generative Topographic Mapping

     Cruz Barbosa, Raul; Vellido Alcacena, Alfredo
    Pattern recognition letters
    Date of publication: 2010-02
    Journal article

  • AIDTUMOUR: HERRAMIENTAS BASADAS EN METODOS DE INTELIGENCIA ARTIFICIAL PARA EL APOYO A LA DECISION EN

     Nebot Castells, Maria Angela; Mugica Alvarez, Francisco José; Belanche Muñoz, Luis Antonio; Vellido Alcacena, Alfredo
    Participation in a competitive project

  • Clustering educational data

     Vellido Alcacena, Alfredo; Castro, Félix; Nebot Castells, Maria Angela
    Date of publication: 2010
    Book chapter
