 Research group
 LARCA  Laboratory of Relational Algorithmics, Complexity and Learnability
 Department
 Department of Software
 School
 Vilanova i la Geltrú School of Engineering (EPSEVG)
 jbaixerlsi.upc.edu
 Contact details
 UPC directory
Scientific and technological production


Characterizing functional dependencies in formal concept analysis with pattern structures
Baixeries i Juvillà, Jaume; Kaytoue, Mehdi; Napoli, Amedeo
Annals of mathematics and artificial intelligence
Date of publication: 20140129
Journal article
Read the abstract View Share Reference managersComputing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query opti mization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. How ever, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative avoiding such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies can be characterized with that framework, namely, degenerated multivalued dependencies. Finally, we discuss and compare the performances of our new approach in a series of experiments on classical benchmark datasets 
The challenges of statistical patterns of language: the case of Menzerath's law in genomes
Ferrer Cancho, Ramon; Forns, Núria; Hernandez Fernandez, Antonio; Bel Enguix, Gemma; Baixeries i Juvillà, Jaume
Complexity
Date of publication: 201301
Journal article
View Share Reference managers 
The parameters of MenzerathAltmann law in genomes
Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio; Forns, Núria; Ferrer Cancho, Ramon
Journal of quantitative linguistics
Date of publication: 2013
Journal article
Read the abstract View Share Reference managersThe relationship between the size of the whole and the size of the parts in language and music is known to follow the MenzerathAltmann law at many levels of description (morphemes, words, sentences, …). Qualitatively, the law states that the larger the whole, the smaller its parts, e.g. the longer a word (in syllables) the shorter its syllables (in letters or phonemes). This patterning has also been found in genomes: the longer a genome (in chromosomes), the shorter its chromosomes (in base pairs). However, it has been argued recently that mean chromosome length is trivially a pure power function of chromosome number with an exponent of 1. The functional dependency between mean chromosome size and chromosome number in groups of organisms from three different kingdoms is studied. The fit of a pure power function yields exponents between 1.6 and 0.1. It is shown that an exponent of 1 is unlikely for fungi, gymnosperm plants, insects, reptiles, rayfinned fishes and amphibians. Even when the exponent is very close to 1, adding an exponential component is able to yield a better fit with regard to a pure powerlaw in plants, mammals, rayfinned fishes and amphibians. The parameters of the MenzerathAltmann law in genomes deviate significantly from a power law with a 1 exponent with the exception of birds and cartilaginous fishes. 
Erratum to "Random models of MenzerathAltmann law in genomes" (BioSystems 107(3) (2012) 167¿173)
Ferrer Cancho, Ramon; Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio
Biosystems
Date of publication: 201303
Journal article
Read the abstract View Share Reference managersHere we improve the mathematical arguments of Baixeries et al (BioSystems 107(3) (2012) 167¿173). The corrections do not alter the conclusion that the random breakage model yields an insufficient fit to the scaling of mean chromosome length as a function of chromosome number in real genomes. 
The Evolution of the exponent of Zipf's law in language ontogeny
Baixeries i Juvillà, Jaume; Elvevag, Brita; Ferrer Cancho, Ramon
PLoS One
Date of publication: 20130313
Journal article
Read the abstract Access to the full text Share Reference managersIt is wellknown that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication systems increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research.
It is wellknown that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication systems increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time while this tendency is weaker in adults, thus suggesting this is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1 which is the typical value of the exponent assumed in both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and a simple indicator of syntactic complexity (MLU) supports the hypothesis that the exponent of Zipf's law and linguistic complexity are interrelated. The assumption that Zipf's law for word ranks is a powerlaw with a constant exponent of one in both adults and children needs to be revised. 
The exponent of Zipf¿s law in language ontogeny.
Baixeries i Juvillà, Jaume; Ferrer Cancho, Ramon
The Evolution of Language
Presentation's date: 20120313
Presentation of work at congresses
View Share Reference managers 
Computing Functional Dependencies with Pattern Structures
Baixeries i Juvillà, Jaume; Kaytoue, Mehdi; Napoli, Amedeo
Concept Lattices and Their Applications
Presentation's date: 20121012
Presentation of work at congresses
Read the abstract Access to the full text Share Reference managersThe treatment of manyvalued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts depends usually on the number of di erent values that are present in a table, which can be very large. Pattern structures have been proved to deal with manyvalued data, offering a viable and sound alternative to scaling in order to represent and analyze sets of manyvalued data with FCA. Functional dependencies have already been dealt with FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback, which is the fact that the resulting context is quadratic in number of objects w.r.t. the original set of data. In this paper, we examine how we can extract the functional dependencies that hold in a set of data using pattern structures. This allows to build an equivalent concept lattice avoiding the step of binarization, and thus comes with better concept representation and computation. 
MINERIA EN DATOS BIOLOGICOS Y SOCIALES: ALGORITMOS, TEORIA E IMPLEMENTACION
Morrill, Glyn Verden; Quattoni, Ariadna Julieta; Arratia Quesada, Argimiro Alejandro; De Balle Pigem, Borja; Arias Vicente, Marta; Casas Fernandez, Bernardino; Bifet Figuerol, Albert Carles; Berral Garcia, Josep Lluis; Lopez Herrera, Josefina; Baixeries i Juvillà, Jaume; Delgado Pin, Jordi; Belanche Muñoz, Luis Antonio; Castro Rabal, Jorge; Lozano Bojados, Antoni; Ferrer Cancho, Ramon; Sierra Santibañez, Maria Josefina; Gavaldà Mestre, Ricard
Participation in a competitive project
Share 
A New Formal Context for Symmetric Dependencies
Baixeries i Juvillà, Jaume
Concept Lattices and Their Applications
Presentation's date: 20111018
Presentation of work at congresses
Read the abstract Access to the full text Share Reference managersIn this paper we present a new formal context for symmetric dependencies. We study its properties and compare it with previous approaches. We also discuss how this new context may open the door to solve some open problems for symmetric dependencies. 
Size of the whole versus number of parts in genomes
Hernandez Fernandez, Antonio; Baixeries i Juvillà, Jaume; Forns, Núria; Ferrer Cancho, Ramon
Entropy: international and interdisciplinary journal of entropy and information studies
Date of publication: 201108
Journal article
Read the abstract Access to the full text Share Reference managersIt is known that chromosome number tends to decrease as genome size increases in angiosperm plants. Here the relationship between number of parts (the chromosomes) and size of the whole (the genome) is studied for other groups of organisms from different kingdoms. Two major results are obtained. First, the finding of relationships of the kind "the more parts the smaller the whole" as in angiosperms, but also relationships of the kind "the more parts the larger the whole". Second, these dependencies are not linear in general. The implications of the dependencies between genome size and chromosome number are twofold. First, they indicate that arguments against the relevance of the finding of negative correlations consistent with MenzerathAltmann law (a linguistic law that relates the size of the parts with the size of the whole) in genomes are seriously flawed. Second, they unravel the weakness of a recent model of chromosome lengths based upon random breakage that assumes that chromosome number and genome size are independent. 
Random models of MenzerathAltmann law in genomes
Baixeries i Juvillà, Jaume; Hernandez Fernandez, Antonio; Ferrer Cancho, Ramon
Biosystems
Date of publication: 20111216
Journal article
Read the abstract Access to the full text Share Reference managersRecently, a random breakage model has been proposed to explain the negative correlation between mean chromosome length and chromosome number that is found in many groups of species and is consistent with Menzerath–Altmann law, a statistical law that defines the dependency between the mean size of the whole and the number of parts in quantitative linguistics. Here, the central assumption of the model, namely that genome size is independent from chromosome number is reviewed. This assumption is shown to be unrealistic from the perspective of chromosome structure and the statistical analysis of real genomes. A general class of random models, including that random breakage model, is analyzed. For any model within this class, a power law with an exponent of −1 is predicted for the expectation of the mean chromosome size as a function of chromosome length, a functional dependency that is not supported by real genomes. The random breakage and variants keeping genome size and chromosome number independent raise no serious objection to the relevance of correlations consistent with Menzerath–Altmann law across taxonomic groups and the possibility of a connection between human language and genomes through that law.
Postprint (author’s final draft) 
Yet a Faster Algorithm for Building the Hasse Diagram of a Galois Lattice
Baixeries i Juvillà, Jaume; Szathmary, Laszlo; Valtchev, Petko; Godin, Robert
The Seventh International Conference in Formal Concept Analysis 2009
Presentation's date: 20090522
Presentation of work at congresses
Read the abstract Access to the full text Share Reference managersFormal concept analysis (FCA) is increasingly applied to data mining problems, essentially as a formal framework for mining reduced representations (bases) of target pattern families. Yet most of the FCAbased miners, closed pattern miners, would only extract the patterns themselves out of a dataset, whereas the generality order among patterns would be required for many bases. As a contribution to the topic of the (precedence) order computation on top of the set of closed patterns, we present a novel method that borrows its overall incremental approach from two algorithms in the literature. The claimed innovation consists of splitting the update of the precedence links into a large number of lowercover list computations (as opposed to a single uppercover list computation) that unfold simultaneously. The resulting method shows a good improvement with respect to its counterpart both on its theoretical complexity and on its practical performance. It is therefore a good starting point for the design of efficient and scalable precedence miners. 
LARCA
Morrill, Glyn Verden; Baixeries i Juvillà, Jaume; Castro Rabal, Jorge; Delgado Pin, Jordi; Arias Vicente, Marta; Lopez Herrera, Josefina; Bifet Figuerol, Albert Carles; Sierra Santibañez, Maria Josefina; Balcazar Navarro, Jose Luis; Berral Garcia, Josep Lluis; Quattoni, Ariadna Julieta; Arratia Quesada, Argimiro Alejandro; De Balle Pigem, Borja; Gavaldà Mestre, Ricard
Participation in a competitive project
Share 
SECUENCIAS SIMBOLICAS:ANALISIS,APRENDIZAJE,MINERIA Y EVOLUCION  BARCELONA
Bifet Figuerol, Albert Carles; Sierra Santibañez, Maria Josefina; Baixeries i Juvillà, Jaume; Lozano Bojados, Antoni; Morrill, Glyn Verden; Berral Garcia, Josep Lluis; Arratia Quesada, Argimiro Alejandro; Delgado Pin, Jordi; Arias Vicente, Marta; Lopez Herrera, Josefina; Ferrer Cancho, Ramon; Quattoni, Ariadna Julieta; Gavaldà Mestre, Ricard
Participation in a competitive project
Share 
Formal Context for Symmetric Dependencies
Baixeries i Juvillà, Jaume
International Conference on Formal Concept Analysis
Presentation's date: 200802
Presentation of work at congresses
Share Reference managers 
Lattice Characterization of Armstrong and Symmetric Dependencies
Baixeries i Juvillà, Jaume
Defense's date: 20070709
Department of Software, Universitat Politècnica de Catalunya
Theses
Share Reference managers 
Unified characterization of symmetric dependencies with lattices
Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
International Conference on Formal Concept Analysis
Presentation's date: 20060214
Presentation of work at congresses
Read the abstract View Share Reference managersSymmetric dependencies, or MultiValued Dependencieslike, are those dependencies that follow the deduction rules of MultiValued Dependencies (MVD’s). These are MVDclauses, Degenerate MultiValued Dependencies (DMVD’s), and MVD’s, being the latter ones relevant to the 4th normal form in the relational database model. Previous results have explained how to characterize these dependencies with lattices. However, these characterizations were adhoc, and no unified characterization has been provided yet. The purpose of this paper is to present such a common characterization for all these kind of dependencies providing the same framework for all of them, and to extend this generalization to the construction of Armstrong relations. 
A Lattice Representation of Relations, Multivalued Dependencies and Armstrong Relations
Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
International Conference on Conceptual Structures
Presentation's date: 200507
Presentation of work at congresses
Read the abstract View Share Reference managersWe present a latticebase formalism to relate, in a novel way, different representation methods for relational data. Specifically, relations given extensionally by tuples, multivalued dependencies, and Armstrong relations for multivalued dependencies. The semantics of this formalism is based on a closure operator used to calculated the lattice. We prove that this representation yields the set of multivalued dependencies that hold in a set of tuples as well as Armstrong relation. 
New Closure Operators and Lattice Representations for Multivalued Dependencies and Related Expressions
Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
Concept Lattices and Their Applications
Presentation's date: 200509
Presentation of work at congresses
Read the abstract Access to the full text Share Reference managersIn Database Theory, Multivalued Dependencies are the main tool to define the Fourth Normal Form and, as such, their inference problem has been deeply studied; two related notions appearing in that study are a syntactical analog in propositional logic and a restriction that maintains to this logic the same relationship as Functional Dependencies do to Horn logic. We present semantic, latticetheoretic characterizations of such multivalued dependencies that hold in a given relation, as well as similar results for the related notions just mentioned. Our characterizations explain better some previously known facts by providing a unifying framework that is also consistent with the studies of Functional Dependencies. 
Characterization and Armstrong relations for Degenerated Multivalued Dependencies using Formal Concept Analysis
Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis
International Conference on Formal Concept Analysis
Presentation's date: 20050215
Presentation of work at congresses
Read the abstract View Share Reference managersFunctional dependencies, a notion originated in Relational Database Theory, are known to admit interesting characterizations in terms of Formal Concept Analysis. In database terms, two successive, natural extensions of the notion of functional dependency are the socalled degenerate multivalued dependencies, and multivalued dependencies proper. We propose here a new Galois connection, based on any given relation, which gives rise to a formal concept lattice corresponding precisely to the degenerate multivalued dependencies that hold in the relation given. The general form of the construction departs significantly from the most usual way of handling functional dependencies. Then, we extend our approach so as to extract Armstrong relations for the degenerate multivalued dependencies from the concept lattice obtained; the proof of the correctness of this construction is nontrivial. 
Characterizations of multivalued dependencies and related expressions
Balcazar Navarro, Jose Luis; Baixeries i Juvillà, Jaume
International Conference on Discovery Science
Presentation's date: 200410
Presentation of work at congresses
View Share Reference managers 
Sampling Strategies for Finding Frequent Sets
Baixeries i Juvillà, Jaume; Casas Garriga, Gemma
Journées francophones d?Extraction et de Gestion des Connaissances (EGC'03)
Presentation's date: 2003
Presentation of work at congresses
Share Reference managers 
Discrete Deterministic Data Mining as Knowledge Compilation
Balcazar Navarro, Jose Luis; Baixeries i Juvillà, Jaume
SIAM/IFIP: Workshop on Discrete Mathematics & Data Mining: DM&DM
Presentation of work at congresses
Share Reference managers 
Using concept lattices to mine functional dependencies
Baixeries i Juvillà, Jaume
Date: 200305
Report
Share Reference managers 
A BestFirst Strategy for Finding Frequent Sets
Baixeries i Juvillà, Jaume; Balcazar Navarro, Jose Luis; Casas Garriga, Gemma
Journées francophones d?Extraction et de Gestion des Connaissances
Presentation of work at congresses
Share Reference managers 
A Faster Algorithm for Finding Frequent Sets
Baixeries i Juvillà, Jaume; Casas Garriga, Gemma; Balcazar Navarro, Jose Luis
Date: 200104
Report
Share Reference managers
Filter results
UPC network collaboration
Reference managers
Continue