Scientific and technological production
1 to 50 of 54 results
  • The mid p-value in exact tests for Hardy-Weinberg equilibrium

     Graffelman, Jan; Moreno Aguado, Victor
    Statistical applications in genetics and molecular biology
    Date of publication: 2013-08-03
    Journal article

    Objective: Exact tests for Hardy-Weinberg equilibrium are widely used in genetic association studies. We evaluate the mid p-value, unknown in the genetics literature, as an alternative to the standard p-value in the exact test. Method: The type 1 error rate and the power of the exact test are calculated for different sample sizes, significance levels, minor allele counts and degrees of deviation from equilibrium. Three different p-values are considered: the standard two-sided p-value, the doubled one-sided p-value and the mid p-value. Practical implications of using the mid p-value are discussed with HapMap datasets and a data set on colon cancer. Results: The mid p-value is shown to have a type 1 error rate that is always closer to the nominal level, and to have better power. Differences between the standard p-value and the mid p-value can be large for insignificant results, and are smaller for significant results. The analysis of empirical databases shows that the mid p-value uncovers more significant markers, and that the equilibrium null distribution is not tenable for both databases. Conclusion: The standard exact p-value is overly conservative, in particular for small minor allele frequencies. The mid p-value ameliorates this problem by bringing the rejection rate closer to the nominal level, at the price of occasionally exceeding the nominal level.
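
    To make the difference between the standard two-sided p-value and the mid p-value concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual exact (Levene-Haldane) distribution of the heterozygote count given the allele counts, and it assumes the mid p-value is taken as half the probability of the observed sample plus the probabilities of all strictly less likely samples. The function names are hypothetical, and exact fractions are used only to avoid floating-point ties.

```python
from fractions import Fraction
from math import factorial


def het_distribution(n_a, n):
    """Exact distribution of the heterozygote count n_ab, given n individuals
    and n_a copies of allele A (so n_b = 2n - n_a copies of allele B)."""
    n_b = 2 * n - n_a
    probs = {}
    # n_ab must have the same parity as n_a and cannot exceed either allele count.
    for n_ab in range(n_a % 2, min(n_a, n_b) + 1, 2):
        n_aa = (n_a - n_ab) // 2
        n_bb = (n_b - n_ab) // 2
        num = factorial(n) * factorial(n_a) * factorial(n_b) * 2 ** n_ab
        den = (factorial(n_aa) * factorial(n_ab) * factorial(n_bb)
               * factorial(2 * n))
        probs[n_ab] = Fraction(num, den)
    return probs


def hwe_exact_pvalues(n_aa, n_ab, n_bb):
    """Return (standard two-sided p-value, mid p-value) for genotype counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    probs = het_distribution(n_a, n)
    p_obs = probs[n_ab]
    # Standard p-value: all outcomes no more likely than the observed one.
    standard = sum(p for p in probs.values() if p <= p_obs)
    # Mid p-value (assumed definition): strictly less likely outcomes
    # plus half the probability of the observed outcome.
    mid = sum(p for p in probs.values() if p < p_obs) + p_obs / 2
    return float(standard), float(mid)


if __name__ == "__main__":
    # Illustrative genotype counts (AA, AB, BB); not taken from the paper.
    print(hwe_exact_pvalues(1, 0, 1))
```

    Because the mid p-value counts only half of the observed outcome's probability, it can never exceed the standard p-value, which is consistent with the abstract's remark that the standard test is the more conservative of the two.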

  • Factor Analysis

     Graffelman, Jan
    Date of publication: 2013
    Book chapter

    Factor analysis is a multivariate statistical method for data reduction that originated in psychometrics and has found applications in many branches of science. The method aims to describe the correlation structure between a large set of observed variables in terms of a few underlying latent variables called factors. Factor analysis employs a specific model, where observed variables are modelled as linear combinations of common factors plus a specific error term. This model can be estimated by using principal components, by using the iterative principal factor method, or by maximum likelihood. After estimation, factors may be rotated in order to improve their interpretation. An example of the application of factor analysis to a set of pollutants in an environmental monitoring study is discussed.

  • Linear-Angle correlation plots: New graphs for revealing correlation structure

     Graffelman, Jan
    Journal of computational and graphical statistics
    Date of publication: 2013-03-27
    Journal article

    In multivariate graphics, correlations between variables are often approximated by the cosines of the angles between vectors. In practice, it is difficult to reliably estimate correlations from such displays by eye. In this article, we therefore develop new graphs, called linear-angle correlation plots, that have a linear relationship between correlation and angle, and from which correlation coefficients are read off more easily. Several multivariate datasets are used to illustrate the proposed graphs. The goodness-of-fit properties of the new graphs are compared with standard multivariate methods such as principal component analysis and principal factor analysis. Cosine-based plots typically gave the poorest approximation to the correlation matrix. A linear interpretation rule for the angle often improved the fit. The best fit was generally obtained by principal factor analysis using scalar products to approximate correlations.
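
    The contrast between a cosine rule and a linear rule for reading correlations from angles can be sketched as follows. This is an illustration only: the abstract states that the relationship is linear but does not give the mapping, so the rule used here (correlation 1 at 0 degrees, 0 at 90 degrees, -1 at 180 degrees) is an assumption, and the function names are hypothetical.

```python
import math


def angle_cosine_rule(r):
    """Angle (radians) whose cosine equals the correlation r."""
    return math.acos(r)


def angle_linear_rule(r):
    """Angle (radians) under an assumed linear rule: r = 1 - 2 * angle / pi."""
    return (math.pi / 2) * (1 - r)


if __name__ == "__main__":
    # Both rules agree at r = 1, 0 and -1; they differ in between.
    for r in (1.0, 0.9, 0.5, 0.0, -0.5, -0.9, -1.0):
        cos_deg = math.degrees(angle_cosine_rule(r))
        lin_deg = math.degrees(angle_linear_rule(r))
        print(f"r = {r:+.1f}  cosine rule: {cos_deg:6.1f} deg  "
              f"linear rule: {lin_deg:6.1f} deg")
```

    Under the linear rule, equal steps in correlation correspond to equal steps in angle, which is the property that makes correlations easier to read off by eye.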

  • Statistical inference for Hardy-Weinberg proportions in the presence of missing genotype information

     Graffelman, Jan; Sánchez, Milagros; Cook, Samantha; Moreno Aguado, Victor Raul
    PLoS One
    Date of publication: 2013-12-31
    Journal article

    In genetic association studies, tests for Hardy-Weinberg proportions are often employed as a quality control checking procedure. Missing genotypes are typically discarded prior to testing. In this paper we show that inference for Hardy-Weinberg proportions can be biased when missing values are discarded. We propose to use multiple imputation of missing values in order to improve inference for Hardy-Weinberg proportions. For imputation we employ a multinomial logit model that uses information from allele intensities and/or neighbouring markers. Analysis of an empirical data set of single nucleotide polymorphisms possibly related to colon cancer reveals that missing genotypes are not missing completely at random. Deviation from Hardy-Weinberg proportions is mostly due to a lack of heterozygotes. Inbreeding coefficients estimated by multiple imputation of the missings are typically lowered with respect to inbreeding coefficients estimated by discarding the missings. Accounting for missings by multiple imputation qualitatively changed the results of 10 to 17% of the statistical tests performed. Estimates of inbreeding coefficients obtained by multiple imputation showed high correlation with estimates obtained by single imputation using an external reference panel. Our conclusion is that imputation of missing data leads to improved statistical inference for Hardy-Weinberg proportions.

  • Testing Hardy-Weinberg equilibrium: a compositional approach

     Graffelman, Jan
    International Workshop on Compositional Data Analysis
    Presentation's date: 2011-05-13
    Presentation of work at congresses

  • Statistical inference for Hardy-Weinberg equilibrium using log-ratio coordinates

     Graffelman, Jan
    International Workshop on Compositional Data Analysis
    Presentation's date: 2011-05-11
    Presentation of work at congresses

  • Similarity in recombination rate estimates highly correlates with genetic differentiation in humans

     Laayouni, Hafid; Montanucci, Ludovica; Sikora, Martin; Melé, Marta; Dall'Olio, Giovanni Marco; Lorente-Galdos, Belén; McGee, Kate M.; Graffelman, Jan; Awadalla, Philip; Bosch, Elena; Comas, David; Navarro, Arcadi; Calafell, Francesc; Casals, Ferran; Bertranpetit, Jaume
    PLoS One
    Date of publication: 2011-03-28
    Journal article

  • A Universal Procedure for Biplot Calibration

     Graffelman, Jan
    Date of publication: 2011
    Book chapter

  • Hardy-Weinberg equilibrium: a non-parametric compositional approach

     Graffelman, Jan; Egozcue Rubi, Juan José
    Date of publication: 2011
    Book chapter

  • Patent value models: Partial least squares path modelling with mode C and few indicators  Open access

     Martínez Ruiz, Alba
    Defense's date: 2011-01-27
    Department of Statistics and Operations Research, Universitat Politècnica de Catalunya
    Theses

    Two general goals were raised in this thesis: first, to establish a PLS model for patent value and to investigate causality relationships among variables that determine the patent value; second, to investigate the performance of Partial Least Squares (PLS) Path Modelling with Mode C in the context of patent value models. This thesis is organized in 10 chapters. Chapter 1 presents an introduction to the thesis that includes the objectives, research scope and the document’s structure. Chapter 2 gives an overview of the different approaches for patent value from the perspective of technological change. Definitions related to patent documents and patent indicators are provided. Chapter 3 reports on patent sample descriptions. We present criteria to retrieve data, the procedure for calculating patent indicators, and a statistical data description. Chapter 4 provides an introduction to structural equation models (SEMs) including origins, basic background and recent developments. In addition, it provides guidelines for model specification and modelling process for SEMs. Special emphasis is placed on determining the reflective or formative nature of measurement models. Chapter 5 puts forward the main PLS algorithms: NIPALS, PLS Regression and PLS Path Modelling. We present two path modelling implementations: Lohmöller and Wold’s procedures. Additionally, insights are given on procedure sensitivity to starting weight values and weighting schemes; algorithm properties, such as consistency and consistency at large; and convergence. We briefly review some PLS Path Modelling extensions and relationships with other procedures. The chapter ends by describing validation techniques. Chapter 6 provides evidence about the accuracy and precision of PLS Path Modelling with Mode C to recover true values in SEMs with few indicators per construct. Monte Carlo simulations and computational experiments are carried out to study the performance of the algorithm. 
Chapter 7 addresses the formulation and estimation of patent value models. This entails the identification and definition of observable and unobservable variables, the determination of blocks of manifest variables and structural relationships, the specification of first- and second-order models of patent value, and the models’ estimation by PLS Path Modelling. In Chapter 8, the evolution of patent value over time using longitudinal SEMs is investigated. Two set-ups are explored. The first longitudinal model includes time-dependent manifest variables and the second includes time-dependent unobservable variables. The SEMs are estimated using PLS Path Modelling. In Chapter 9, there is a description of a Two-Step PLS Path Modelling with Mode C (TsPLS) procedure to study nonlinear and interaction effects among formative constructs. Monte Carlo simulations are performed to generate data and to determine the accuracy and precision of this approach to recover true values. This chapter includes an application of the TsPLS algorithm to patent value models. Finally, in Chapter 10, we provide a summary of conclusions and future research. The main contribution of this thesis is to set up a PLS model for patent value, and around this issue, we have also contributed in two main areas: Contributions to the field of Technological Change are comprised of: (1) Evidence on the role of the knowledge stock, technological scope and international scope as determinants of patent value and technological usefulness. A stable pattern of path coefficients was found across samples in different time periods. (2) To conceptualize the patent value as a potential and a recognized value for intangible assets. It was also shown that the potential value of a patent is small compared to the value that is given later. 
(3) Evidence for the importance of considering the longitudinal nature of the indicators in the patent value problem, especially for forward citations, which are the most widely used indicator of patent value. (4) To introduce a multidimensional perspective of the patent valuation problem. This novel approach may offer a robust understanding of the different variables that determine patent value. Contributions to the field of PLS Path Modelling are comprised of: (5) Empirical evidence on the performance of PLS Path Modelling with Mode C. If properly implemented, the procedure can adequately capture some of the complex dynamic relationships involved in models. Our research shows that PLS Path Modelling with Mode C performs according to the theoretical framework established for PLS procedures and PLS-models (Wold, 1982; Krämer, 2006; Hanafi, 2007; Dijkstra, 2010). That is, (a) PLS estimates are always biased, (b) inner relationships are underestimated, (c) outer relationships are overestimated, (d) Mode A lacks the monotone convergence property, and (e) Mode B has the monotone convergence property. (6) Empirical evidence for the consistency at large of the PLS Path Modelling with Mode A. (7) Empirical evidence for formative outer models with few manifest variables. (8) Empirical evidence on the performance of a Two-Step PLS Path Modelling with Mode C procedure to estimate nonlinear and interaction effects among formative constructs.

  • The number of markers in the HapMap project: some notes on Chi-square and Exact tests for Hardy-Weinberg equilibrium

     Graffelman, Jan
    American journal of human genetics
    Date of publication: 2010-05-14
    Journal article

  • Multivariate variance components linkage analysis applications to the search of genes related to complex diseases

     Buil Demur, Alfonso Alberto
    Defense's date: 2010-10-21
    Department of Statistics and Operations Research, Universitat Politècnica de Catalunya
    Theses

  • Biplots in practice

     Graffelman, Jan
    Date: 2010-12
    Report

  • Decay of linkage disequilibrium within genes across HGDP-CEPH human samples: most population isolates do not show increased LD

     Bosch, E; Laayouni, H.; Morcillo-Suarez, C.; Casals, F.; Moreno-Estrada, A.; Ferrer-Admetlla, A.; Gardner, M.; Rosa, A; Navarro, A; Comas, D.; Graffelman, Jan; Calafell, F.; Bertranpetit, J.
    BMC genomics
    Date of publication: 2009-07
    Journal article

  • Statistical tests for Hardy-Weinberg Equilibrium and Linkage Disequilibrium: graphical methods in the presence of multiple markers.

     Morales Camarena, Jair Gabriel
    Defense's date: 2009-02-04
    Department of Statistics and Operations Research, Universitat Politècnica de Catalunya
    Theses

  • Análisis de valoraciones atípicas en los estudios de ingeniería Kansei: Consideraciones estadística y prácticas

     Alvarez Laverde, Hector Rene
    Defense's date: 2009-11-27
    Department of Statistics and Operations Research, Universitat Politècnica de Catalunya
    Theses

  • Environmental and ecological statistics

     Graffelman, Jan
    Collaboration in journals

  • A universal procedure for biplot calibration

     Graffelman, Jan
    A universal procedure for biplot calibration
    Presentation's date: 2009-09-09
    Presentation of work at congresses

  • A global test for Hardy-Weinberg equilibrium

     Graffelman, Jan
    Congreso Nacional de Estadística e Investigación Operativa
    Presentation's date: 2009-02-10
    Presentation of work at congresses

  • Journal of computational and graphical statistics

     Graffelman, Jan
    Collaboration in journals

  • Journal of computational and graphical statistics

     Graffelman, Jan
    Collaboration in journals

  • American statistician

     Graffelman, Jan
    Collaboration in journals

  • Psychometrika

     Graffelman, Jan
    Collaboration in journals

  • Collegium antropologicum

     Graffelman, Jan
    Collaboration in journals

  • Scientia marina

     Graffelman, Jan
    Collaboration in journals

  • Hardy-Weinberg Equilibrium and the Ternary Plot

     Graffelman, Jan; Morales-Camarena, Jair
    Compositional Data Analysis Workshop (CodaWork '08)
    Presentation of work at congresses

  • Graphical tests for Hardy-Weinberg equilibrium based on the ternary plot

     Graffelman, Jan; Camarena, J M
    Human heredity
    Date of publication: 2007-11
    Journal article

  • Variation in Estimated Recombination Rates across Human Populations

     Graffelman, Jan; Balding, D.J.; Gonzalez-Neira, A.; Bertranpetit, J.
    Human genetics
    Date of publication: 2007-11
    Journal article

  • Computational statistics and data analysis

     Graffelman, Jan
    Collaboration in journals

  • Human reproduction

     Graffelman, Jan
    Collaboration in journals

  • Human reproduction

     Graffelman, Jan
    Collaboration in journals

  • SORT: statistics and operations research transactions

     Graffelman, Jan
    Collaboration in journals

  • Journal of ecology

     Graffelman, Jan
    Collaboration in journals

  • Enriched biplots for canonical correlation analysis

     Graffelman, Jan
    Journal of Applied Statistics
    Date of publication: 2005-04
    Journal article

  • Mathematical geology

     Graffelman, Jan
    Collaboration in journals

  • Site scores and conditional biplots in canonical correspondence analysis

     Graffelman, Jan
    Environmetrics
    Date of publication: 2004-02
    Journal article

  • Optimal representation of supplementary variables in biplots from principal component analysis and correspondence analysis

     Graffelman, Jan; Aluja Banet, Tomas
    Biometrical journal
    Date of publication: 2003-05
    Journal article

  • Biplots with supplementary data

     Graffelman, Jan
    Conferencia Española de Biometría
    Presentation of work at congresses

  • Quality statistics in canonical correspondence analysis

     Graffelman, Jan
    Environmetrics
    Date of publication: 2001-07
    Journal article

  • Factor Analysis

     Graffelman, Jan
    Date of publication: 2001-10
    Book chapter

  • Describing the Distribution of Species Counts by Poisson Mixtures

     Graffelman, Jan
    Conferencia Española de Biometría
    Presentation of work at congresses

  • A Statistical Analysis of the effect of Warfare on the Human Secondary Sex Ratio

     Graffelman, Jan
    Human biology
    Date of publication: 2000-06
    Journal article

  • Use of the Moore-Penrose Inverse in Canonical Correspondence Analysis

     Graffelman, Jan
    Econometric theory
    Date of publication: 2000-10
    Journal article

  • Contributions to the multivariate Analysis of Marine Environmental Monitoring Data: Methodological Aspects and Applications.

     Graffelman, Jan
    Defense's date: 2000-09-12
    Department of Statistics and Operations Research, Universitat Politècnica de Catalunya
    Theses

  • Human reproduction

     Graffelman, Jan
    Collaboration in journals

  • Use of the Zero-inflated Poisson for Describing the Distribution of Species Abundance

     Graffelman, Jan
    The Second Spanish STATA users meeting
    Presentation's date: 2000-05-01
    Presentation of work at congresses

  • The Justification of Multidimensional Scaling under Euclidean Conditions

     Graffelman, Jan
    Econometric theory
    Date of publication: 1999-01
    Journal article

  • Upper bounds for the eigenvalues of the product of a symmetric idempotent and a non-negative definite matrix

     Graffelman, Jan; Velden, Van De M
    Econometric theory
    Date of publication: 1999-02
    Journal article
