Transferring compositional data methods into applied science and technology

The TRANS-CODA project aims to extend and apply methodology from the field of compositional data analysis. Within the context of
the project we plan to carry out theoretical work on the distribution of log-ratio coordinates and the representation of
dependence (contingency tables, copulas) within the compositional context, and to develop a compositional test for Hardy-Weinberg
equilibrium. However, most of our efforts will be directed towards the application of compositional statistical methodology in other fields of
science. The project has three main research lines: a) Biomarkers and genetic markers, b) Pollution, natural hazards and climate
change, c) Comparison and characterization of food systems. The first research line, led by Professor Graffelman, concerns the
analysis of large databases of biomarkers (the so-called "omics" data) and of genetic markers, namely single nucleotide
polymorphisms (SNPs) and short tandem repeats (STRs), using methods from multivariate analysis, statistical genetics and
compositional data analysis. Owing to rapidly developing technology, modern laboratory equipment generates ever-growing databases
whose statistical analysis is a real challenge at the intersection of informatics, statistics, compositional data analysis and
biological knowledge. Given the molecular mechanisms and genetic factors that play a role in many diseases, this research line has
a direct impact on Challenge 1 (human health) of the present call for research proposals. The second research line, led by Professor
Ortego, studies natural processes such as waves, rainfall and air pollution, among others, and is geared towards the analysis of extreme
values. This line deals with the application of compositional data analysis techniques and with the multivariate treatment of these
phenomena, taking into account the interdependence of the observations (spatial or non-spatial), and applies Bayesian methods to
estimate probabilities of occurrence and to detect change-points in time series. Because it deals with topics such as pollution
and climate change, this research line is directly related to Challenge 5 of this call. The third research line, led by Professor Jarauta, concerns
the application of models for compositional data to obtain a comparative characterization of food systems from a global point
of view. Food plays an important role in human development because of its biological basis and its social and economic implications.
The research line elaborates a conceptual model with four dimensions (Availability, Economy, Politics and Knowledge) and tries to
develop this model analytically. This research line relates to Challenge 2 of the current call.
