Graphic summary
  • Show / hide key
  • Information


Scientific and technological production
  •  

1 to 26 of 26 results
  • Characterizing functional dependencies in formal concept analysis with pattern structures

     Baixeries i Juvillà, Jaume; Kaytoue, Mehdi; Napoli, Amedeo
    Annals of mathematics and artificial intelligence
    Date of publication: 2014-01-29
    Journal article

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query opti- mization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. How- ever, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative avoiding such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies can be characterized with that framework, namely, degenerated multivalued dependencies. Finally, we discuss and compare the performances of our new approach in a series of experiments on classical benchmark datasets

  • The challenges of statistical patterns of language: the case of Menzerath's law in genomes

     Ferrer Cancho, Ramon; Forns, Núria; Hernandez Fernandez, Antonio; Bel Enguix, Gemma; Baixeries i Juvillà, Jaume
    Complexity
    Date of publication: 2013-01
    Journal article

    View View Open in new window  Share Reference managers Reference managers Open in new window

  • The parameters of Menzerath-Altmann law in genomes

     Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio; Forns, Núria; Ferrer Cancho, Ramon
    Journal of quantitative linguistics
    Date of publication: 2013
    Journal article

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    The relationship between the size of the whole and the size of the parts in language and music is known to follow the Menzerath-Altmann law at many levels of description (morphemes, words, sentences, …). Qualitatively, the law states that the larger the whole, the smaller its parts, e.g. the longer a word (in syllables) the shorter its syllables (in letters or phonemes). This patterning has also been found in genomes: the longer a genome (in chromosomes), the shorter its chromosomes (in base pairs). However, it has been argued recently that mean chromosome length is trivially a pure power function of chromosome number with an exponent of -1. The functional dependency between mean chromosome size and chromosome number in groups of organisms from three different kingdoms is studied. The fit of a pure power function yields exponents between -1.6 and 0.1. It is shown that an exponent of -1 is unlikely for fungi, gymnosperm plants, insects, reptiles, ray-finned fishes and amphibians. Even when the exponent is very close to -1, adding an exponential component is able to yield a better fit with regard to a pure power-law in plants, mammals, ray-finned fishes and amphibians. The parameters of the Menzerath-Altmann law in genomes deviate significantly from a power law with a -1 exponent with the exception of birds and cartilaginous fishes.

  • Erratum to "Random models of Menzerath-Altmann law in genomes" (BioSystems 107(3) (2012) 167¿173)

     Ferrer Cancho, Ramon; Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio
    Biosystems
    Date of publication: 2013-03
    Journal article

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    Here we improve the mathematical arguments of Baixeries et al (BioSystems 107(3) (2012) 167¿173). The corrections do not alter the conclusion that the random breakage model yields an insufficient fit to the scaling of mean chromosome length as a function of chromosome number in real genomes.

  • Access to the full text
    The Evolution of the exponent of Zipf's law in language ontogeny  Open access

     Baixeries i Juvillà, Jaume; Elvevag, Brita; Ferrer Cancho, Ramon
    PLoS One
    Date of publication: 2013-03-13
    Journal article

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    It is well-known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication systems increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research.

    It is well-known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication systems increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time while this tendency is weaker in adults, thus suggesting this is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1 which is the typical value of the exponent assumed in both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and a simple indicator of syntactic complexity (MLU) supports the hypothesis that the exponent of Zipf's law and linguistic complexity are inter-related. The assumption that Zipf's law for word ranks is a power-law with a constant exponent of one in both adults and children needs to be revised.

  • The exponent of Zipf¿s law in language ontogeny.

     Baixeries i Juvillà, Jaume; Ferrer Cancho, Ramon
    The Evolution of Language
    Presentation's date: 2012-03-13
    Presentation of work at congresses

    View View Open in new window  Share Reference managers Reference managers Open in new window

  • Access to the full text
    Computing Functional Dependencies with Pattern Structures  Open access

     Baixeries i Juvillà, Jaume; Kaytoue, Mehdi; Napoli, Amedeo
    Concept Lattices and Their Applications
    Presentation's date: 2012-10-12
    Presentation of work at congresses

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    The treatment of many-valued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts depends usually on the number of di erent values that are present in a table, which can be very large. Pattern structures have been proved to deal with many-valued data, offering a viable and sound alternative to scaling in order to represent and analyze sets of many-valued data with FCA. Functional dependencies have already been dealt with FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback, which is the fact that the resulting context is quadratic in number of objects w.r.t. the original set of data. In this paper, we examine how we can extract the functional dependencies that hold in a set of data using pattern structures. This allows to build an equivalent concept lattice avoiding the step of binarization, and thus comes with better concept representation and computation.

  • MINERIA EN DATOS BIOLOGICOS Y SOCIALES: ALGORITMOS, TEORIA E IMPLEMENTACION

     Morrill, Glyn Verden; Quattoni, Ariadna Julieta; Arratia Quesada, Argimiro Alejandro; De Balle Pigem, Borja; Arias Vicente, Marta; Casas Fernandez, Bernardino; Bifet Figuerol, Albert Carles; Berral Garcia, Josep Lluis; Lopez Herrera, Josefina; Baixeries i Juvillà, Jaume; Delgado Pin, Jordi; Belanche Muñoz, Luis Antonio; Castro Rabal, Jorge; Lozano Bojados, Antoni; Ferrer Cancho, Ramon; Sierra Santibañez, Maria Josefina; Gavaldà Mestre, Ricard
    Participation in a competitive project

     Share

  • Access to the full text
    A New Formal Context for Symmetric Dependencies  Open access

     Baixeries i Juvillà, Jaume
    Concept Lattices and Their Applications
    Presentation's date: 2011-10-18
    Presentation of work at congresses

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    In this paper we present a new formal context for symmetric dependencies. We study its properties and compare it with previous approaches. We also discuss how this new context may open the door to solve some open problems for symmetric dependencies.

  • Access to the full text
    Size of the whole versus number of parts in genomes  Open access

     Hernandez Fernandez, Antonio; Baixeries i Juvillà, Jaume; Forns, Núria; Ferrer Cancho, Ramon
    Entropy: international and interdisciplinary journal of entropy and information studies
    Date of publication: 2011-08
    Journal article

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    It is known that chromosome number tends to decrease as genome size increases in angiosperm plants. Here the relationship between number of parts (the chromosomes) and size of the whole (the genome) is studied for other groups of organisms from different kingdoms. Two major results are obtained. First, the finding of relationships of the kind "the more parts the smaller the whole" as in angiosperms, but also relationships of the kind "the more parts the larger the whole". Second, these dependencies are not linear in general. The implications of the dependencies between genome size and chromosome number are two-fold. First, they indicate that arguments against the relevance of the finding of negative correlations consistent with Menzerath-Altmann law (a linguistic law that relates the size of the parts with the size of the whole) in genomes are seriously flawed. Second, they unravel the weakness of a recent model of chromosome lengths based upon random breakage that assumes that chromosome number and genome size are independent.

  • Access to the full text
    Random models of Menzerath-Altmann law in genomes  Open access

     Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio; Ferrer Cancho, Ramon
    Biosystems
    Date of publication: 2011-12-16
    Journal article

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    Recently, a random breakage model has been proposed to explain the negative correlation between mean chromosome length and chromosome number that is found in many groups of species and is consistent with Menzerath–Altmann law, a statistical law that defines the dependency between the mean size of the whole and the number of parts in quantitative linguistics. Here, the central assumption of the model, namely that genome size is independent from chromosome number is reviewed. This assumption is shown to be unrealistic from the perspective of chromosome structure and the statistical analysis of real genomes. A general class of random models, including that random breakage model, is analyzed. For any model within this class, a power law with an exponent of −1 is predicted for the expectation of the mean chromosome size as a function of chromosome length, a functional dependency that is not supported by real genomes. The random breakage and variants keeping genome size and chromosome number independent raise no serious objection to the relevance of correlations consistent with Menzerath–Altmann law across taxonomic groups and the possibility of a connection between human language and genomes through that law.

    Postprint (author’s final draft)

  • Access to the full text
    Yet a Faster Algorithm for Building the Hasse Diagram of a Galois Lattice  Open access

     Baixeries i Juvillà, Jaume; Szathmary, Laszlo; Valtchev, Petko; Godin, Robert
    The Seventh International Conference in Formal Concept Analysis 2009
    Presentation's date: 2009-05-22
    Presentation of work at congresses

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    Formal concept analysis (FCA) is increasingly applied to data mining problems, essentially as a formal framework for mining reduced representations (bases) of target pattern families. Yet most of the FCA-based miners, closed pattern miners, would only extract the patterns themselves out of a dataset, whereas the generality order among patterns would be required for many bases. As a contribution to the topic of the (precedence) order computation on top of the set of closed patterns, we present a novel method that borrows its overall incremental approach from two algorithms in the literature. The claimed innovation consists of splitting the update of the precedence links into a large number of lower-cover list computations (as opposed to a single uppercover list computation) that unfold simultaneously. The resulting method shows a good improvement with respect to its counterpart both on its theoretical complexity and on its practical performance. It is therefore a good starting point for the design of efficient and scalable precedence miners.

  • LARCA

     Morrill, Glyn Verden; Baixeries i Juvillà, Jaume; Castro Rabal, Jorge; Delgado Pin, Jordi; Arias Vicente, Marta; Lopez Herrera, Josefina; Bifet Figuerol, Albert Carles; Sierra Santibañez, Maria Josefina; Balcazar Navarro, Jose Luis; Berral Garcia, Josep Lluis; Quattoni, Ariadna Julieta; Arratia Quesada, Argimiro Alejandro; De Balle Pigem, Borja; Gavaldà Mestre, Ricard
    Participation in a competitive project

     Share

  • SECUENCIAS SIMBOLICAS:ANALISIS,APRENDIZAJE,MINERIA Y EVOLUCION - BARCELONA

     Bifet Figuerol, Albert Carles; Sierra Santibañez, Maria Josefina; Baixeries i Juvillà, Jaume; Lozano Bojados, Antoni; Morrill, Glyn Verden; Berral Garcia, Josep Lluis; Arratia Quesada, Argimiro Alejandro; Delgado Pin, Jordi; Arias Vicente, Marta; Lopez Herrera, Josefina; Ferrer Cancho, Ramon; Quattoni, Ariadna Julieta; Gavaldà Mestre, Ricard
    Participation in a competitive project

     Share

  • Formal Context for Symmetric Dependencies

     Baixeries i Juvillà, Jaume
    International Conference on Formal Concept Analysis
    Presentation's date: 2008-02
    Presentation of work at congresses

     Share Reference managers Reference managers Open in new window

  • Lattice Characterization of Armstrong and Symmetric Dependencies  awarded activity

     Baixeries i Juvillà, Jaume
    Defense's date: 2007-07-09
    Department of Software, Universitat Politècnica de Catalunya
    Theses

     Share Reference managers Reference managers Open in new window

  • Unified characterization of symmetric dependencies with lattices

     Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
    International Conference on Formal Concept Analysis
    Presentation's date: 2006-02-14
    Presentation of work at congresses

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    Symmetric dependencies, or MultiValued Dependencies-like, are those dependencies that follow the deduction rules of MultiValued Dependencies (MVD’s). These are MVD-clauses, Degenerate MultiValued Dependencies (DMVD’s), and MVD’s, being the latter ones relevant to the 4th normal form in the relational database model. Previous results have explained how to characterize these dependencies with lattices. However, these characterizations were ad-hoc, and no unified characterization has been provided yet. The purpose of this paper is to present such a common characterization for all these kind of dependencies providing the same framework for all of them, and to extend this generalization to the construction of Armstrong relations.

  • A Lattice Representation of Relations, Multivalued Dependencies and Armstrong Relations

     Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
    International Conference on Conceptual Structures
    Presentation's date: 2005-07
    Presentation of work at congresses

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    We present a lattice-base formalism to relate, in a novel way, different representation methods for relational data. Specifically, relations given extensionally by tuples, multivalued dependencies, and Armstrong relations for multivalued dependencies. The semantics of this formalism is based on a closure operator used to calculated the lattice. We prove that this representation yields the set of multivalued dependencies that hold in a set of tuples as well as Armstrong relation.

  • Access to the full text
    New Closure Operators and Lattice Representations for Multivalued Dependencies and Related Expressions  Open access

     Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
    Concept Lattices and Their Applications
    Presentation's date: 2005-09
    Presentation of work at congresses

    Read the abstract Read the abstract Access to the full text Access to the full text Open in new window  Share Reference managers Reference managers Open in new window

    In Database Theory, Multivalued Dependencies are the main tool to define the Fourth Normal Form and, as such, their inference problem has been deeply studied; two related notions appearing in that study are a syntactical analog in propositional logic and a restriction that maintains to this logic the same relationship as Functional Dependencies do to Horn logic. We present semantic, lattice-theoretic characterizations of such multivalued dependencies that hold in a given relation, as well as similar results for the related notions just mentioned. Our characterizations explain better some previously known facts by providing a unifying framework that is also consistent with the studies of Functional Dependencies.

  • Characterization and Armstrong relations for Degenerated Multivalued Dependencies using Formal Concept Analysis

     Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
    International Conference on Formal Concept Analysis
    Presentation's date: 2005-02-15
    Presentation of work at congresses

    Read the abstract Read the abstract View View Open in new window  Share Reference managers Reference managers Open in new window

    Functional dependencies, a notion originated in Relational Database Theory, are known to admit interesting characterizations in terms of Formal Concept Analysis. In database terms, two successive, natural extensions of the notion of functional dependency are the socalled degenerate multivalued dependencies, and multivalued dependencies proper. We propose here a new Galois connection, based on any given relation, which gives rise to a formal concept lattice corresponding precisely to the degenerate multivalued dependencies that hold in the relation given. The general form of the construction departs significantly from the most usual way of handling functional dependencies. Then, we extend our approach so as to extract Armstrong relations for the degenerate multivalued dependencies from the concept lattice obtained; the proof of the correctness of this construction is nontrivial.

  • Characterizations of multivalued dependencies and related expressions

     Balcazar Navarro, Jose Luis; Baixeries i Juvillà, Jaume
    International Conference on Discovery Science
    Presentation's date: 2004-10
    Presentation of work at congresses

    View View Open in new window  Share Reference managers Reference managers Open in new window

  • Sampling Strategies for Finding Frequent Sets

     Baixeries i Juvillà, Jaume; Casas Garriga, Gemma
    Journées francophones d?Extraction et de Gestion des Connaissances (EGC'03)
    Presentation's date: 2003
    Presentation of work at congresses

     Share Reference managers Reference managers Open in new window

  • Discrete Deterministic Data Mining as Knowledge Compilation

     Balcazar Navarro, Jose Luis; Baixeries i Juvillà, Jaume
    SIAM/IFIP: Workshop on Discrete Mathematics & Data Mining: DM&DM
    Presentation of work at congresses

     Share Reference managers Reference managers Open in new window

  • Using concept lattices to mine functional dependencies

     Baixeries i Juvillà, Jaume
    Date: 2003-05
    Report

     Share Reference managers Reference managers Open in new window

  • A Best-First Strategy for Finding Frequent Sets

     Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis; Casas Garriga, Gemma
    Journées francophones d?Extraction et de Gestion des Connaissances
    Presentation of work at congresses

     Share Reference managers Reference managers Open in new window

  • A Faster Algorithm for Finding Frequent Sets

     Baixeries i Juvillà, Jaume; Casas Garriga, Gemma; Balcazar Navarro, Jose Luis
    Date: 2001-04
    Report

     Share Reference managers Reference managers Open in new window