In computer music fields such as algorithmic composition and live coding, the aural exploration of parameter combinations is the process through which a system's capabilities are learned and the material for different musical tasks is selected and classified. Despite its importance, few models of this process have been proposed. Here, a rule extraction algorithm is presented. It works with data obtained during a user's auditory exploration of parameters, in which specific perceptual categories are sought. The extracted rules express complex but general relationships among parameter values and categories. Their formation is controlled by functions that govern the data grouping, which are supplied by the user on heuristic grounds. The rules are used to build two more general models: a set of "extended or Inference Rules" and a fuzzy classifier, which allow the user to infer unheard combinations of parameters consistent with the preselected categories, from the extended rules and within the limits of the explored parameter space, respectively. To evaluate the models, user tests were performed. The constructed models reduce the complexity of operating the systems by providing a set of "presets" for different categories, and they extend compositional capacities through the inferred combinations, alongside a structured representation of the information.
Twitter has become one of the most popular Location-Based Social Networks (LBSNs) bridging the physical and virtual worlds. Tweets, messages of up to 140 characters, are meant to answer the question "What's happening?". Occurrences and events in real life (such as political protests, music concerts, natural disasters or terrorist acts) are usually reported through geo-located tweets by users on site. Separating event-related tweets from the rest is a challenging problem that necessarily requires exploiting different tweet features. With that in mind, we propose Tweet-SCAN, a novel event discovery technique based on the popular density-based clustering algorithm DBSCAN. Tweet-SCAN takes into account four main features of a tweet, namely content, time, location and user, to group together event-related tweets. The proposed technique models textual content through a probabilistic topic model called the Hierarchical Dirichlet Process and introduces the Jensen–Shannon distance for the task of neighborhood identification in the textual dimension. We demonstrate Tweet-SCAN's performance on two real data sets of geo-located tweets posted during Barcelona local festivities in 2014 and 2015, for which some of the events were identified by domain experts beforehand. Through these tagged data sets, we are able to assess Tweet-SCAN's capability to discover events, justify the use of a textual component, and highlight the effects of several parameters.
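The textual neighborhood test described above hinges on the Jensen–Shannon distance between the topic distributions that the Hierarchical Dirichlet Process assigns to each tweet. A minimal sketch of that distance in Python, together with a hypothetical combined neighborhood predicate; the `eps_*` thresholds and the feature layout are illustrative assumptions, not the paper's values:

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs, so the value lies in [0, 1])."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt((kl(p, m) + kl(q, m)) / 2)

def are_neighbors(t1, t2, eps_text=0.5, eps_space=100.0, eps_time=3600.0):
    """Hypothetical Tweet-SCAN-style predicate: two tweets are neighbors only
    if they are close in every dimension (topics, location, time)."""
    topic_ok = js_distance(t1["topics"], t2["topics"]) <= eps_text
    space_ok = math.dist(t1["xy"], t2["xy"]) <= eps_space
    time_ok = abs(t1["t"] - t2["t"]) <= eps_time
    return topic_ok and space_ok and time_ok
```

The per-dimension thresholds are what lets a density-based algorithm like DBSCAN treat heterogeneous features without collapsing them into a single scaled distance.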
Freezing of gait (FoG) is one of the most disturbing and incapacitating symptoms in Parkinson's disease. It is defined as a sudden block in effective stepping, provoking anxiety, stress and falls. FoG is usually evaluated by means of different questionnaires; however, this method has proven unreliable, since it is subjective, depending on the judgment of patients and caregivers. Several authors have analyzed the use of MEMS inertial systems to detect FoG with the aim of evaluating it objectively. So far, specific methods based on the accelerometer's frequency response have been employed in many works; nonetheless, since they have been developed and tested in laboratory conditions, their performance is commonly poor when used at patients' homes. Therefore, this work proposes a new set of features that aims to detect FoG in real environments by using accelerometers. This set of features is compared with three previously reported approaches to FoG detection. The different feature sets are trained by means of several machine learning classifiers, and different window sizes are also evaluated. In addition, a greedy subset selection process is performed to reduce the computational load of the method and to enable a real-time implementation. Results show that the proposed method detects FoG at patients' homes with 91.7% sensitivity and 87.4% specificity, improving on former methods by between 5% and 11% and providing a more balanced rate of true positives and true negatives.
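A widely used frequency-domain feature in the accelerometer-based FoG literature, of the kind the compared approaches build on, is the freeze index of Moore et al.: the ratio of power in the "freeze" band (3–8 Hz) to power in the locomotor band (0.5–3 Hz), computed over a sliding window. A self-contained sketch using a naive DFT; the 40 Hz sampling rate and 128-sample window are illustrative assumptions:

```python
import math

def band_power(window, fs, f_lo, f_hi):
    """Spectral power of a signal window in [f_lo, f_hi) Hz, via a naive DFT."""
    n = len(window)
    total = 0.0
    for k in range(1, n // 2):
        f = k * fs / n
        if f_lo <= f < f_hi:
            re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(window))
            im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(window))
            total += re * re + im * im
    return total

def freeze_index(window, fs=40.0):
    """Freeze index: power in the freeze band over power in the locomotor band."""
    loco = band_power(window, fs, 0.5, 3.0)
    return band_power(window, fs, 3.0, 8.0) / max(loco, 1e-12)
```

A window dominated by trembling-like oscillation around 5 Hz yields a high index, while normal gait around 1–2 Hz yields a low one; a threshold on this value is the classic laboratory detector that the proposed feature set aims to improve on in real environments.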
The main goal of this work is to develop a methodology for finding nutritional patterns from a variety of individual characteristics, patterns that can contribute to a better understanding of the interactions between nutrition and health, since the complexity of the phenomenon leads to poor performance with classical approaches. An innovative methodology based on a combination of advanced clustering techniques and consistent conceptual interpretation of clusters is proposed to find more understandable patterns or clusters. The Interpreted Integrative Multiview Clustering (I2MC) combines the previously proposed Integrative Multiview Clustering (IMC) with a new interpretation methodology, NCIMS. IMC uses crossing operations over the several partitions obtained with the different views. A comparison with other classical clustering techniques is provided to assess the performance of this approach. IMC helps to reduce the high dimensionality of the data based on a multiview division of variables. Two innovative cluster interpretation methodologies are proposed to support the understanding of the clusters: automatic methods to detect the significant variables that describe the clusters, and a mechanism to deal with the consistency between the interpretations of the clusters within a single partition (CI-IMS) or between pairs of nested partitions (NCIMS). Some formal concepts are specifically introduced to be used in the NCIMS. I2MC is used to validate the interpretability of the participants' profiles from a nutritional intervention study. The method has advantages for dealing with complex datasets that include heterogeneous variables corresponding to different topics, and it is able to provide meaningful partitions.
Assigning papers to reviewers is a large, long and difficult task for conference chairs and scientific committees. The reviewer assignment problem is a multi-agent problem that requires understanding reviewer expertise and paper topics for the matching process. This paper elaborates on some features used to compute reviewer expertise and aggregates multiple factors to find the fittest combination of reviewers for each paper. Expertise information is gathered implicitly from publicly available information, and a reviewer profile is generated automatically. An Ordered Weighted Averaging (OWA) aggregation function is used to summarize information coming from different sources and rank candidate reviewers for each paper. General constraints of the Reviewer Assignment Problem (RAP) have been incorporated into a real case example: (i) conflicts of interest between a reviewer and authors should be avoided, (ii) each paper must have a minimum number of reviewers, and (iii) each reviewer's load cannot exceed a certain number of papers.
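OWA aggregation, the key step above, sorts the input scores and applies the weights to rank positions rather than to fixed sources, so a single weight vector can interpolate between max, min, and mean behavior. A minimal sketch; the expertise scores below are made up for illustration:

```python
def owa(scores, weights):
    """Ordered Weighted Averaging (Yager): weights attach to sorted positions."""
    assert len(scores) == len(weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * s for w, s in zip(weights, sorted(scores, reverse=True)))

# Expertise evidence for one candidate reviewer from three sources
# (e.g. publications, keywords, past reviews) -- illustrative numbers.
scores = [0.9, 0.4, 0.7]

max_like = owa(scores, [1.0, 0.0, 0.0])    # behaves like max
mean_like = owa(scores, [1/3, 1/3, 1/3])   # behaves like the arithmetic mean
optimistic = owa(scores, [0.6, 0.3, 0.1])  # leans toward the strongest evidence
```

Because the weights attach to positions, the aggregate does not depend on which source produced the strongest evidence, only on how strong the ordered evidence is.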
Martin, M.; Bejar, J.; Esposito, G.; Català Roig, N.; Cortes, U.; Viñas, F.; Tarragó, J.; Rojo, E.; Nowak, R. Pattern Recognition Letters, p. 172-181. DOI: 10.1016/j.patrec.2016.07.018. Publication date: 2016-08-02. Journal article.
The current wide access to data from different neuroimaging techniques has made it possible to explore whether objective criteria can be found for diagnostic purposes. In order to decide which features of the data are relevant for the diagnostic task, we present in this paper a simple method for feature selection based on kernel alignment with the ideal kernel in support vector machines (SVMs). The method presented shows state-of-the-art performance while being more efficient than other methods for feature selection in SVMs. It is also less prone to overfitting, due to the properties of the alignment measure. All these abilities are essential in neuroimaging studies, where the number of features representing recordings is usually very large compared with the number of recordings. The method has been applied to a dataset in order to determine objective criteria for the diagnosis of schizophrenia. The dataset analyzed was obtained from multichannel magnetoencephalogram (MEG) recordings made during the performance of a mismatch negativity (MMN) auditory task by a set of schizophrenia patients and a control group. All signal frequency bands are analyzed (from δ (1–4 Hz) to high-frequency γ (60–200 Hz)), and the signal correlations among the different sensors for these frequencies are used as features.
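Kernel-target alignment, as introduced by Cristianini et al., scores a kernel matrix K by its normalized Frobenius inner product with the ideal kernel yyᵀ built from the labels; ranking features by the alignment of their single-feature kernels gives a simple selection criterion of the kind used here. A sketch in plain Python; the tiny label vector and feature columns are illustrative:

```python
import math

def frob(a, b):
    """Frobenius inner product of two matrices given as lists of rows."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def alignment(K, y):
    """Alignment of K with the ideal kernel y y^T (Cristianini et al.)."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frob(K, Y) / math.sqrt(frob(K, K) * frob(Y, Y))

def feature_alignment(column, y):
    """Alignment of the linear kernel induced by a single feature column."""
    K = [[xi * xj for xj in column] for xi in column]
    return alignment(K, y)
```

A feature whose values mirror the class labels induces a kernel identical to the ideal one (alignment 1), while an uninformative feature scores near 0; keeping the top-aligned features is the selection step, and the normalization is what makes the measure resistant to overfitting.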
This paper presents a methodology to address lexical disambiguation in a standard phrase-based statistical machine translation system. Similarity among source contexts is used to select appropriate translation units. The information is introduced as a novel feature of the phrase-based model and is used to select the translation units extracted from the training sentences that are most similar to the sentence to be translated. The similarity is computed through a deep autoencoder representation, which makes it possible to obtain an effective low-dimensional embedding of the data and yields statistically significant BLEU score improvements on two different tasks (English-to-Spanish and English-to-Hindi).
Huerta, I.; Fernandez, C.; Segura, C.; Hernando, J.; Prati, A. Pattern Recognition Letters, Vol. 68, num. 2, p. 239-249. DOI: 10.1016/j.patrec.2015.06.006. Publication date: 2015-12-15. Journal article.
The automatic estimation of age from face images is gaining increasing attention, as it facilitates applications including advanced video surveillance, demographic statistics collection, customer profiling, and search optimization in large databases. Nevertheless, estimating age becomes challenging in uncontrolled environments, with insufficient and incomplete training data, strong person-specificity and high within-range variance. These difficulties have recently been addressed with complex, strongly hand-crafted descriptors that are difficult to replicate and compare. This paper presents two novel approaches: first, a simple yet effective fusion of descriptors based on texture and local appearance; and second, a deep learning scheme for accurate age estimation. These methods have been evaluated under a diversity of settings, and the extensive experiments carried out on two large databases (MORPH and FRGC) demonstrate state-of-the-art results over previous work.
A new formulation of the central ideas of Boden's well-established theory on combinational, exploratory and transformational creativity is presented. This new formulation, based on the idea of conceptual space, redefines some terms and includes several types of concept properties (appropriateness and relevance), whose relationship facilitates the computational implementation of the transformational creativity mechanism. The presented formulation is applied to a real case of chocolate design in which a novel and flavorful combination of chocolate and fruit is generated. The experimentation was conducted jointly with a Spanish chocolate chef. Experimental results confirm the relationship between appropriateness and relevance in different frameworks and show that the presented formulation is not only useful for understanding how the creative mechanisms of design work but also facilitates their implementation in real cases to support creativity processes.
A system for learning and executing gestures on a humanoid robot has been developed and implemented in this work. Gestures are represented via dynamic movement primitives on the robotic platform REEM. Since platform-agnostic knowledge is used when designing trajectories, our approach can be easily extended to other robots. The implemented work involves recording gestures using three different procedures: from the robot itself, with the help of a user, and from external devices. Next, the dynamic movement primitives representing the motions are generated to describe trajectories that are finally executed on the real humanoid robot. Several experiments illustrate how knowledge is acquired by the robot, represented in the form of dynamical systems, generalized, and reproduced from different starting conditions.
Fornell, A.; Rodrigo, Z.; Rovira, X.; Sanchez, M.; Santoma, R.; Teixidor, F.; Golobardes, E. Pattern Recognition Letters, Vol. 67, num. Part 1, p. 39-48. DOI: 10.1016/j.patrec.2015.05.013. Publication date: 2015-12-01. Journal article.
The concept mapping methodology aims to respond to the non-trivial task of conceptualising abstract thoughts by means of a focus group composed of experts from the studied domain. The approach defines a set of general steps that allow experts to lead the generation of ideas, group the ideas in a conceptual map of interrelated concepts using multidimensional scaling and clustering techniques, analyse the quality of the conceptual maps, and decide on a final interpretation. This final decision is not trivial, because clustering techniques provide a set of potential conceptual maps, so experts must select the one that fits best according to their opinion. For this reason, we present the global index of consensus as an indicator for filtering the most suitable clustering solutions using qualitative reasoning. It promotes the consensus of experts' opinions, one of the key aspects of the concept mapping methodology, and ensures objectivity in the final interpretation. The index outperforms three of the most well-known clustering validation indexes in a case study focused on the meaning of excellence in the hospitality industry.
What is the role that colour plays in customers' perception of a brand? How can we explore the cognitive role that colour plays in determining brand perception? To answer these questions we propose a preference disaggregation method based on multi-criteria decision aid. We identify the criteria aggregation model that underlies the global preference for a brand with respect to each brand image attribute. The proposed method is inspired by the well-known UTASTAR algorithm but, unlike the original formulation, it represents preferences by means of non-monotonic value functions. The method is applied to a database of brands ranked on each brand image attribute. For each brand image attribute, non-monotonic marginal value functions are obtained separately for each component of the brand colour. These functions capture, in an understandable manner, the fit between each colour component and each brand image attribute.
In this work, an image representation based on the Binary Partition Tree (BPT) is proposed for object detection in hyperspectral images. This hierarchical region-based representation can be interpreted as a set of hierarchical regions stored in a tree structure, which captures: (i) the decomposition of the image in terms of coherent regions and (ii) the inclusion relations of the regions in the scene. Hence, the BPT representation defines a search space for constructing a robust object identification scheme. Spatial and spectral information is integrated in order to analyze hyperspectral images from a region-based perspective. For each region represented in the BPT, spatial and spectral descriptors are computed and the likelihood that they correspond to an instantiation of the object of interest is evaluated. Experimental results demonstrate the good performance of this BPT-based approach.
In this paper we present a new algorithm for filtering a grey-level image using, as attribute, the number of holes of its connected components. Our approach is based on the max-tree data structure, which makes it possible to implement attribute filtering of the image with linear computational cost. To determine the number of holes, we present a set of diverse pixel patterns. These patterns are designed so that the number of holes can be computed recursively: the calculations done for the components of the image can be inherited by their parent nodes in the max-tree. Since we do not need to recalculate the attribute data for all connected components of the image, the computation time devoted to the attribute computation remains linear.
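For intuition, the attribute itself (the number of holes of a component) equals the number of background regions fully enclosed by foreground. The sketch below computes it by the brute-force definition, flood-filling 4-connected background regions and counting those that never touch the image border; the paper's contribution is to obtain the same quantity incrementally from pixel patterns while the max-tree is built, which this sketch does not attempt:

```python
from collections import deque

def count_holes(img):
    """Holes of a binary image: 4-connected background regions not on the border."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]

    def flood(sr, sc):
        """Fill one background region; report whether it touches the border."""
        q = deque([(sr, sc)])
        seen[sr][sc] = True
        touches_border = False
        while q:
            r, c = q.popleft()
            if r in (0, h - 1) or c in (0, w - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and not seen[nr][nc] and img[nr][nc] == 0:
                    seen[nr][nc] = True
                    q.append((nr, nc))
        return touches_border

    holes = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] == 0 and not seen[r][c] and not flood(r, c):
                holes += 1
    return holes
```

This definition-level version costs a flood fill per region; the recursive pattern-based computation is what keeps the max-tree filter linear overall.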
Hernández-Vela, A.; Bautista, M.; Pérez, X.; Ponce, V.; Escalera, S.; Xavier, B.; Pujol, O.; Angulo, C. Pattern Recognition Letters, Vol. 50, p. 112-121. DOI: 10.1016/j.patrec.2013.09.009. Publication date: 2014-12-01. Journal article.
We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion form. The method is integrated into a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm that is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a probabilistic model of that gesture class driven by a Gaussian Mixture Model. Results of the whole Human Gesture Recognition pipeline on a public data set show better performance than both the standard BoVW model and the DTW approach.
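The PDTW variant replaces the classic point-wise cost with a probability derived from a per-class Gaussian Mixture Model; the underlying alignment recursion is the standard dynamic time warping one, sketched here (the probabilistic cost is the paper's addition and is not reproduced):

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping distance between two sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j]: minimal cumulative cost aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1][j],      # insertion
                D[i][j - 1],      # deletion
                D[i - 1][j - 1],  # match
            )
    return D[n][m]
```

The recursion is invariant to local time stretching, which is why a gesture performed slowly still aligns at zero cost with a faster template; swapping `dist` for a negative log-likelihood under a class GMM gives the probability-based flavor described above.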
Human motion prediction in indoor and outdoor scenarios is a key issue for human-robot interaction and intelligent robot navigation in general. In the present work, we propose a new human motion intentionality indicator, called the Bayesian Human Motion Intentionality Prediction (BHMIP), which is a geometric-based long-term predictor. Two variants of the Bayesian approach are proposed, the Sliding Window BHMIP and the Time Decay BHMIP. The main advantages of the proposed methods are a simple formulation, easy scalability, portability to unknown environments with little learning effort, and low computational complexity; they also outperform other state-of-the-art approaches. The system only requires training to obtain the set of destinations, the salient positions people normally walk to, that configure a scene. The BHMIP is compared with other well-known methods for long-term prediction using the Edinburgh Informatics Forum pedestrian database and the Freiburg People Tracker database.
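A geometric long-term predictor in the spirit of the BHMIP can be sketched as a Bayesian update over a fixed set of learned destinations, where each observed step votes for destinations whose bearing agrees with the walker's heading. Everything below (the Gaussian angular likelihood, the `sigma` value) is an illustrative simplification, not the paper's exact model:

```python
import math

def destination_posterior(track, destinations, sigma=0.5):
    """Posterior over destinations given an observed 2D track.

    Each step's likelihood for a destination is a Gaussian in the angle
    between the observed heading and the bearing to that destination
    (a hypothetical simplification of the sliding-window BHMIP idea).
    """
    logp = [0.0] * len(destinations)
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        heading = math.atan2(y1 - y0, x1 - x0)
        for k, (dx, dy) in enumerate(destinations):
            bearing = math.atan2(dy - y0, dx - x0)
            # wrapped angular difference in (-pi, pi]
            diff = math.atan2(math.sin(heading - bearing), math.cos(heading - bearing))
            logp[k] += -(diff ** 2) / (2 * sigma ** 2)
    mx = max(logp)
    w = [math.exp(v - mx) for v in logp]
    s = sum(w)
    return [v / s for v in w]
```

Because the model only needs the destination set and simple geometry, it ports to a new environment as soon as the salient destinations have been learned, which matches the low-training-effort claim above.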
The medical analysis of human brain tumours commonly relies on indirect measurements. Among these, magnetic resonance imaging (MRI) and spectroscopy (MRS) predominate in clinical settings as tools for diagnostic assistance. Pattern recognition (PR) methods have been used successfully in this task, usually interpreting diagnosis as a supervised classification problem. In MRS, the acquired spectral signal can be analyzed in an unsupervised manner to extract its constituent sources. Recently, this has been accomplished successfully using Non-negative Matrix Factorization (NMF) methods. In this paper, we present a method to introduce the available class information into the unsupervised source extraction process of a convex variant of NMF. Novel techniques to generate diagnostic predictions for new, unseen spectra using the proposed Discriminant Convex-NMF are also described and experimentally assessed.
The use of binary support vector machines (SVMs) in multi-classification is addressed in this paper. Margins associated with the bi-classifiers are, in general, of various magnitudes, since they depend on the geometrical disposition of the classes being separated. To overcome this scaling problem, a normalization process should be applied to the SVMs' outputs. Thus, a new normalization approach is presented, based on the convex hulls that contain the classes to be separated. Furthermore, a theoretical study is developed that justifies the proposed approach, and an interpretation is provided. An empirical study is also carried out to compare this normalization with others found in the literature.
In structural pattern recognition, the median string has been established as a useful tool to represent a set of strings. However, its exact computation is complex and computationally demanding. In this paper we propose a new approach for computing the median string based on string embedding. Strings are embedded into a vector space and the median is computed in the vector domain. We apply three different inverse transformations to go from the vector domain back to the string domain in order to obtain a final approximation of the median string. All of them are based on the weighted mean of a pair of strings. Experiments show that we succeed in computing good approximations of the median string.
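The embedding route above can be illustrated end-to-end in a simplified form: embed every string as its vector of edit distances to all strings in the set, take the coordinate-wise median in that vector space, and return the set string whose embedding lies closest to it. Note that this collapses the paper's inverse transformations (which use the weighted mean of a pair of strings to step outside the set) into a set-median lookup:

```python
def levenshtein(a, b):
    """Standard edit distance via the row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def median_string(strings):
    """Set-median approximation of the median string via edit-distance embedding."""
    n = len(strings)
    emb = [[levenshtein(s, t) for t in strings] for s in strings]
    med = [sorted(row[i] for row in emb)[n // 2] for i in range(n)]
    best = min(range(n), key=lambda k: sum((emb[k][i] - med[i]) ** 2 for i in range(n)))
    return strings[best]
```

Working in the vector domain avoids the combinatorial search over the string space, which is the source of the computational burden the paper sets out to sidestep.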
Median graphs have been presented as a useful tool for capturing the essential information of a set of graphs. Nevertheless, computing optimal solutions is a very hard problem. In this work we present a new and more efficient optimal algorithm for median graph computation. By using a particular cost function that permits the definition of the graph edit distance in terms of the maximum common subgraph, and a prediction function in the backtracking algorithm, we reduce the size of the search space, avoiding the evaluation of a great number of states while still obtaining the exact median. We present a set of experiments comparing our new algorithm against the previously existing exact algorithm using synthetic data. In addition, we present the first application of exact median graph computation to real data and compare the results against an approximate algorithm based on genetic search. These experimental results show that our algorithm outperforms the previously existing exact algorithm and also show the potential applicability of the exact solutions to real problems.
The relationship between two important problems in pattern recognition using attributed relational graphs, the maximum common subgraph and the minimum common supergraph of two graphs, is established by means of simple constructions, which make it possible to obtain the maximum common subgraph from the minimum common supergraph, and vice versa. On this basis, a new graph distance metric is proposed for measuring similarities between objects represented by attributed relational graphs. The proposed metric can be computed by a straightforward extension of any algorithm that implements error-correcting graph matching, when run under an appropriate cost function, and the extension only takes time linear in the size of the graphs.
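The duality behind such constructions is |mcs(g1, g2)| + |MCS(g1, g2)| = |g1| + |g2|, with mcs the maximum common subgraph and MCS the minimum common supergraph, so a distance of this family can be written d(g1, g2) = |g1| + |g2| - 2|mcs(g1, g2)|. A brute-force sketch for tiny node-labeled graphs; the representation (a label list plus a set of undirected edge index pairs) is an illustrative choice, not the paper's formalism:

```python
from itertools import combinations, permutations

def mcs_size(g1, g2):
    """Node count of a maximum common induced subgraph, by exhaustive search.

    A graph is (labels, edges), where edges is a set of frozenset index pairs.
    Only practical for very small graphs; real systems use error-correcting
    graph matching instead.
    """
    n1, e1 = g1
    n2, e2 = g2
    for k in range(min(len(n1), len(n2)), 0, -1):
        for sub in combinations(range(len(n1)), k):
            for img in permutations(range(len(n2)), k):
                m = dict(zip(sub, img))
                labels_ok = all(n1[a] == n2[m[a]] for a in sub)
                edges_ok = all(
                    (frozenset((a, b)) in e1) == (frozenset((m[a], m[b])) in e2)
                    for a, b in combinations(sub, 2)
                )
                if labels_ok and edges_ok:
                    return k  # first hit while descending k is maximal
    return 0

def graph_distance(g1, g2):
    """Distance via the subgraph/supergraph duality: |g1| + |g2| - 2|mcs|."""
    return len(g1[0]) + len(g2[0]) - 2 * mcs_size(g1, g2)
```

The duality means a matcher that finds either the common subgraph or the common supergraph immediately yields the distance, which is the practical payoff of the constructions described above.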
Many thinning algorithms for 2D binary images, or modifications of existing ones, have been proposed in recent years. The one given herein is surprisingly simple compared to most of them, yet it has theoretically favorable properties: it provides a connected, well-centered homotopic skeleton, of single-pixel width wherever possible, that allows shapes to be nearly reconstructed. In addition to these properties, it is also very attractive because of its generalization to higher dimensions. The presented algorithm is based on the application of directional erosions while retaining those pixels that introduce disconnections. It is shown that this strategy is especially well suited to run-length encoded images, leading to a very fast and simple thinning algorithm.