Farreras Esclusa, Montserrat
Total activity: 34
Research group
CAP - High Performance Computing Group
Department
Department of Computer Architecture
School
Barcelona School of Telecommunications Engineering (ETSETB)
E-mail
montserrat.farreras@estudiant.upc.edu
Contact details
UPC directory

Scientific and technological production
1 to 34 of 34 results
  • Parallel Continuous Flow: A Parallel Suffix Tree Construction Tool for Whole Genomes

     Comin, Matteo; Farreras Esclusa, Montserrat
    Journal of computational biology
    Vol. 21, num. 4, p. 330-344
    DOI: 10.1089/cmb.2012.0256
    Date of publication: 2014-04-01
    Journal article

The construction of suffix trees for very long sequences is essential for many applications, and it plays a central role in the bioinformatics domain. With the advent of modern sequencing technologies, biological sequence databases have grown dramatically, and the methodologies required to analyze these data have become more complex every day, requiring fast queries to multiple genomes. In this article, we present Parallel Continuous Flow (PCF), a parallel suffix tree construction method that is suitable for very long genomes. We tested our method on the suffix tree construction of the entire human genome, about 3 GB, and showed that PCF scales gracefully as the size of the input genome grows. Our method works with an efficiency of 90% with 36 processors and 55% with 172 processors, and can index the human genome in 7 minutes using 172 processes.

  • Models de Programacio i Entorns d'eXecució PARal.lels

     Becerra Fontal, Yolanda; Carrera Perez, David; Corbalan Gonzalez, Julita; Cortes Rossello, Antonio; Costa Prats, Juan Jose; Farreras Esclusa, Montserrat; Gil Gómez, Maria Luisa; Gonzalez Tallada, Marc; Guitart Fernández, Jordi; Herrero Zaragoza, José Ramón; Labarta Mancho, Jesus Jose; Martorell Bofill, Xavier; Navarro Mas, Nacho; Nin Guerrero, Jordi; Torres Viñals, Jordi; Tous Liesa, Ruben; Utrera Iglesias, Gladys Miriam; Ayguade Parra, Eduard
    Competitive project

  • AHAS: cognitive-emotional atoms that facilitate or disturb the process of learning

     Armengol Cebrian, Jesús; Bofill Soliguer, Pau; Farreras Esclusa, Montserrat
    Active Learning in Engineering Education Workshop
    Presentation's date: 2014-01-21
    Presentation of work at congresses

  • Optimization techniques for fine-grained communication in PGAS environments  Open access

     Alvanos, Michail
    Universitat Politècnica de Catalunya
    Theses

Partitioned Global Address Space (PGAS) languages promise to deliver improved programmer productivity and good performance in large-scale parallel machines. However, adequate performance for applications that rely on fine-grained communication is difficult to achieve without compromising their programmability. Manual or compiler-assisted code optimization is required to avoid fine-grained accesses. The downside of manually applying code transformations is the increased program complexity, which hinders programmer productivity. On the other hand, compiler optimizations of fine-grained accesses require knowledge of the physical data mapping and the use of parallel loop constructs. This thesis presents optimizations that address the three main challenges of fine-grained communication: (i) low network communication efficiency; (ii) a large number of runtime calls; and (iii) network hotspot creation due to the non-uniform distribution of network communication. To solve these problems, the dissertation presents three approaches. First, it presents an improved inspector-executor transformation that increases network efficiency through runtime aggregation. Second, it presents incremental optimizations to the inspector-executor loop transformation that automatically remove the runtime calls. Finally, it presents a loop scheduling transformation that avoids network hotspots and the oversubscription of nodes. In contrast to previous work that uses static coalescing, prefetching, limited privatization, and caching, the solutions presented in this thesis cover all aspects of fine-grained communication, including reducing the number of calls generated by the compiler and minimizing the overhead of the inspector-executor optimization.
A performance evaluation with various microbenchmarks and benchmarks, aimed at predicting scaling and absolute performance numbers on a Power 775 machine, indicates that applications with regular accesses can achieve up to 180% of the performance of hand-optimized versions, while in applications with irregular accesses the transformations are expected to yield from 1.12X up to 6.3X speedup. The loop scheduling shows performance gains of 3% to 25% for the NAS FT and bucket-sort benchmarks, and up to 3.4X speedup for the microbenchmarks.

  • Efficient parallel construction of suffix trees for genomes larger than main memory

     Comin, Matteo; Farreras Esclusa, Montserrat
    European MPI Users' Group Meeting
    p. 211-216
    DOI: 10.1145/2488551.2488579
    Presentation's date: 2013-09
    Presentation of work at congresses

The construction of suffix trees for very long sequences is essential for many applications, and it plays a central role in the bioinformatics domain. With the advent of modern sequencing technologies, biological sequence databases have grown dramatically, and the methodologies required to analyze these data have become more complex every day, requiring fast queries to multiple genomes. In this paper we present Parallel Continuous Flow (PCF), a parallel suffix tree construction method that is suitable for very long strings. We tested our method on the construction of the suffix tree of the entire human genome, about 3 GB, and showed that PCF scales gracefully as the size of the input string grows. Our method works with an efficiency of 90% with 36 processors and 55% with 172 processors, and can index the human genome in 7 minutes using 172 nodes.

  • Improving performance of all-to-all communication through loop scheduling in PGAS environments

     Alvanos, Michail; Tanase, Gabriel; Farreras Esclusa, Montserrat; Tiotto, Ettore; Amaral, José Nelson; Martorell Bofill, Xavier
    ACM/IEEE International Conference on Supercomputing
    p. 457
    DOI: 10.1145/2464996.2467277
    Presentation's date: 2013-06
    Presentation of work at congresses

  • Improving communication in PGAS environments: Static and dynamic coalescing in UPC

     Alvanos, Michail; Farreras Esclusa, Montserrat; Tiotto, Ettore; Amaral, José Nelson; Martorell Bofill, Xavier
    ACM/IEEE International Conference on Supercomputing
    p. 129-138
    DOI: 10.1145/2464996.2465006
    Presentation's date: 2013-06
    Presentation of work at congresses

The goal of Partitioned Global Address Space (PGAS) languages is to improve programmer productivity on large-scale parallel machines. However, PGAS programs may have many fine-grained shared accesses that lead to performance degradation. Manual code transformations or compiler optimizations are required to improve the performance of programs with fine-grained accesses. The downside of manual code transformations is the increased program complexity, which hinders programmer productivity. On the other hand, most compiler optimizations of fine-grained accesses require knowledge of the physical data mapping and the use of parallel loop constructs. This paper presents an optimization for the Unified Parallel C language that combines compile-time (static) and runtime (dynamic) coalescing of shared data, without knowledge of the physical data mapping. Larger messages increase the network efficiency, and static coalescing decreases the overhead of library calls. The performance evaluation uses two microbenchmarks and three benchmarks to obtain scaling and absolute performance numbers on up to 32768 cores of a Power 775 machine. Our results show that the compiler transformation yields speedups from 1.15X up to 21X compared with the baseline versions, achieving up to 63% of the performance of the MPI versions.

  • A high-productivity task-based programming model for clusters

     Tejedor, Enric; Farreras Esclusa, Montserrat; Grove, David; Badia Sala, Rosa Maria; Almási, George; Labarta Mancho, Jesus Jose
    Concurrency and computation. Practice and experience
    Vol. 24, num. 18, p. 2421-2448
    DOI: 10.1002/cpe.2831
    Date of publication: 2012-12-15
    Journal article

Programming for large-scale, multicore-based architectures requires adequate tools that offer ease of programming and do not hinder application performance. StarSs is a family of parallel programming models based on automatic function-level parallelism that targets productivity. StarSs deploys a data-flow model: it analyzes dependencies between tasks and manages their execution, exploiting their concurrency as much as possible. This paper introduces Cluster Superscalar (ClusterSs), a new StarSs member designed to execute on clusters of SMPs (Symmetric Multiprocessors). ClusterSs tasks are asynchronously created and assigned to the available resources with the support of the IBM APGAS runtime, which provides an efficient and portable communication layer based on one-sided communication. We present the design of ClusterSs on top of APGAS, as well as the programming model and execution runtime for Java applications. Finally, we evaluate the productivity of ClusterSs, both in terms of programmability and performance, and compare it to that of the IBM X10 language.

  • Learning the principles of parallel computing with games

     Alvarez Mesa, Mauricio; Bofill Soliguer, Pau; Sánchez Castaño, Friman; Farreras Esclusa, Montserrat
    Active Learning in Engineering Education Workshop
    Presentation of work at congresses

The trend towards parallel computers requires a fundamental change in the way software is developed in order to maintain performance scalability. Because of that, most software developers need a solid understanding of how to develop parallel programs. In this paper we present a methodology for learning parallel computing that gives priority to the general principles rather than to the technologies that use them. Parallel computing is presented as a specific case of the general coordination problem and, on that basis, the fundamental issues of coordination systems are introduced. Coordination is modelled as a cooperative game, in which learners (players) contribute to a common goal. Two games are presented as examples: the "orange game", and a game based on the "dining philosophers" problem. These games use only ordinary materials (not computers), such as cards, drawing paper, colours and oranges, and they illustrate problems of coordination systems such as mutual exclusion and deadlocks.

  • Productive cluster programming with OmpSs

     Bueno Hedo, Javier; Martinell, Lluis; Duran Gonzalez, Alejandro; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguade Parra, Eduard; Labarta Mancho, Jesus Jose
    International European Conference on Parallel and Distributed Computing
    p. 555-566
    DOI: 10.1007/978-3-642-23400-2_52
    Presentation's date: 2011-09-01
    Presentation of work at congresses

  • ClusterSs: a task-based programming model for clusters

     Tejedor Saavedra, Enric; Farreras Esclusa, Montserrat; Badia Sala, Rosa Maria; Grove, David; Almási, George; Labarta Mancho, Jesus Jose
    International Symposium on High Performance Distributed Computing
    p. 267-268
    DOI: 10.1145/1996130.1996168
    Presentation's date: 2011
    Presentation of work at congresses

  • Asynchronous PGAS runtime for Myrinet networks

     Farreras Esclusa, Montserrat; Almasi, George
    Conference on Partitioned Global Address Space Programming Model
    p. 1-10
    DOI: 10.1145/2020373.2020377
    Presentation's date: 2010-12
    Presentation of work at congresses

  • 9º ALE (ACTIVE LEARNING IN ENGINEERING EDUCATION)

     Bofill Soliguer, Pau; Farreras Esclusa, Montserrat; Otero Calviño, Beatriz; Armengol Cebrian, Jesús
    Competitive project

  • A Unified Parallel C compiler that implements automatic communication aggregation

     Barton, Christopher; Almási, George; Farreras Esclusa, Montserrat; Amaral, José Nelson
    Workshop on Compilers for Parallel Computing
    Presentation's date: 2009-01-07
    Presentation of work at congresses

Partitioned Global Address Space (PGAS) programming languages, such as Unified Parallel C (UPC), offer an attractive high-productivity programming model for programming large-scale parallel machines. PGAS languages partition the application's address space into private, shared-local and shared-remote memory. When running in a distributed-memory environment, accessing shared-remote memory leads to implicit communication. For fine-grained accesses, which are frequently found in UPC programs, this communication overhead can significantly impact program performance. One solution for reducing the number of fine-grained accesses is to coalesce several accesses into a single access. This paper presents an analysis to identify opportunities for coalescing and an algorithm that allows the compiler to automatically coalesce accesses to shared-remote memory in UPC. It also describes how opportunities for coalescing can be created by the compiler through loop unrolling. Results obtained from coalescing accesses in manually-unrolled parallel loops are presented to demonstrate the benefit of combining parallel loop unrolling and communication coalescing.

  • Scalable RDMA performance in PGAS languages

     Farreras Esclusa, Montserrat; Almási, George; Cortés, Toni
    IEEE International Parallel and Distributed Processing Symposium
    p. 1-12
    Presentation's date: 2009-05-25
    Presentation of work at congresses

  • Multidimensional blocking in UPC

     Barton, Christopher; Cascaval, Calin; Almási, George; Garg, Rahul; Amaral, José Nelson; Farreras Esclusa, Montserrat
    Lecture Notes in Computer Science
    Vol. 5234, p. 47-62
    DOI: 10.1007/978-3-540-85261-2_4
    Date of publication: 2008-02
    Journal article

    Partitioned Global Address Space (PGAS) languages offer an attractive, high-productivity programming model for programming large-scale parallel machines. PGAS languages, such as Unified Parallel C (UPC), combine the simplicity of shared-memory programming with the efficiency of the message-passing paradigm by allowing users control over the data layout. PGAS languages distinguish between private, shared-local, and shared-remote memory, with shared-remote accesses typically much more expensive than shared-local and private accesses, especially on distributed memory machines where shared-remote access implies communication over a network. In this paper we present a simple extension to the UPC language that allows the programmer to block shared arrays in multiple dimensions. We claim that this extension allows for better control of locality, and therefore performance, in the language. We describe an analysis that allows the compiler to distinguish between local shared array accesses and remote shared array accesses. Local shared array accesses are then transformed into direct memory accesses by the compiler, saving the overhead of a locality check at runtime. We present results to show that locality analysis is able to significantly reduce the number of shared accesses.

  • Optimizing programming models for massively parallel computers  Open access

     Farreras Esclusa, Montserrat
    Department of Computer Architecture, Universitat Politècnica de Catalunya
    Theses

    Since the invention of the transistor, clock frequency increase was the primary method of improving computing performance. As the reach of Moore's law came to an end, however, technology driven performance gains became increasingly harder to achieve, and the research community was forced to come up with innovative system architectures. Today increasing parallelism is the primary method of improving performance: single processors are being replaced by multiprocessor systems and multicore architectures. The challenge faced by computer architects is to increase performance while limited by cost and power consumption. The appearance of cheap and fast interconnection networks has promoted designs based on distributed memory computing. Most modern massively parallel computers, as reflected by the Top 500 list, are clusters of workstations using commodity processors connected by high speed interconnects. Today's massively parallel systems consist of hundreds of thousands of processors. Software technology to program these large systems is still in its infancy. Optimizing communication has become a key to overall system performance. To cope with the increasing burden of communication, the following methods have been explored: (i) Scalability in the messaging system: The messaging system itself needs to scale up to the 100K processor range. (ii) Scalable algorithms reducing communication: As the machine grows in size the amount of communication also increases, and the resulting overhead negatively impacts performance. New programming models and algorithms allow programmers to better exploit locality and reduce communication. (iii) Speed up communication: reducing and hiding communication latency, and improving bandwidth. 
Following the three items described above, this thesis contributes to the improvement of the communication system (i) by proposing a scalable memory management of the communication system that guarantees the correct reception of data and control-data, (ii) by proposing a language extension that allows programmers to better exploit data locality to reduce inter-node communication, and (iii) by presenting and evaluating a cache of remote addresses that aims to reduce control-data and exploit the native RDMA capabilities of the network, resulting in latency reduction and better overlap of communication and computation. Our contributions are analyzed in two different parallel programming models: Message Passing Interface (MPI) and Unified Parallel C (UPC). Many different programming models exist today, and the programmer usually needs to choose one or another depending on the problem and the machine architecture. MPI was chosen because it is the de facto standard for parallel programming in distributed-memory machines. UPC was considered because it constitutes a promising easy-to-use approach to parallelism. Since parallelism is everywhere, programmability is becoming important, and languages such as UPC are gaining attention as a potential future of high performance computing. Concerning the communication system, the languages chosen are relevant because, while MPI offers two-sided communication, UPC relies on a one-sided communication model. This difference potentially influences the communication system requirements of the language. These requirements, as well as our contributions, are analyzed and discussed for both programming models, and we state whether they apply to both.

  • Spare The Rod and Spoil the Child

     Farreras Esclusa, Montserrat; Alex, Audi
    Active Learning in Engineering Education Workshop
    p. 293-301
    Presentation of work at congresses

  • Multidimensional Blocking in UPC

 Barton, Christopher; Cascaval, Calin; Almasi, George; Garg, Rahul; Amaral, José Nelson; Farreras Esclusa, Montserrat
    The 20th International Workshop on Languages and Compilers for Parallel Computing
    p. 1-15
    Presentation of work at congresses

  • Ajut UPC-2006 per projectes específics dins la planificació estratègica

     Bofill Soliguer, Pau; Farreras Esclusa, Montserrat; Otero Calviño, Beatriz; Toribio Millan, Eliezer
    Award or recognition

  • Ajuts ICE de millora a la docència

     Bofill Soliguer, Pau; Farreras Esclusa, Montserrat; Otero Calviño, Beatriz; Toribio Millan, Eliezer
    Award or recognition

  • HPC Challenge Award: Best Productivity in Performance

 Farreras Esclusa, Montserrat; Cascaval, C; Almási, G; Dózsa, G; Luk, P; Spelce, T; Barton, C; Tiotto, Ettore
    Award or recognition

  • Scaling MPI to short-memory MPPs such as BG/L

     Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose; Almasi, G
20th ACM International Conference on Supercomputing (ICS'2006)
    p. 209-218
    Presentation of work at congresses

  • Shared Memory Programming for Large Scale Machines

     Farreras Esclusa, Montserrat
    2006 Conference on Programming Language Design and Implementation
    p. 108-117
    Presentation of work at congresses

  • HPC Challenge award Best Productivity in Performance

 Farreras Esclusa, Montserrat; Cascaval, Calin; Barton, C; Almasi, G; Zheng, Y; Luk, P; Mak, R
    Award or recognition

  • Las asignaturas de sistemas operativos en Ingenieria Electronica de la ETSETB-UPC: un mismo curso en modalidad semipresencial, presencial y por proyectos

 Bofill Soliguer, Pau; Farreras Esclusa, Montserrat; March Hermo, Maria Isabel; Morancho Llena, Enrique
    XVI Jornadas de Paralelismo. CEDI 2005 I Congreso Español de Informática.
    p. 717-723
    Presentation of work at congresses

  • TIN2004-07739-C02-01 Computación de Altas Prestaciones IV: Arquitecturas, Compiladores, Sistemas Operativos, Herramientas y Aplicaciones

     Valero Cortes, Mateo; Utrera Iglesias, Gladys Miriam; Martorell Bofill, Xavier; Muntés Mulero, Víctor; Gil Gómez, Maria Luisa; Ramirez Bellido, Alejandro; Alvarez Martinez, Carlos; Torres Viñals, Jordi; Farreras Esclusa, Montserrat; Gallardo Gomez, Antonia; Herrero Zaragoza, José Ramón; Guitart Fernández, Jordi; Parcerisa Bundó, Joan Manuel; Morancho Llena, Enrique; Salamí San Juan, Esther; Canal Corretger, Ramon; Moreto Planas, Miquel
    Competitive project

  • Predicting MPI Buffer Addresses

     Freitag, Felix; Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose
    Lecture notes in computer science
    Vol. 3036, p. 10-17
    Date of publication: 2004-06
    Journal article

  • Predicting MPI Buffer Addresses

     Freitag, Felix; Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose
    Computational Science - ICCS 2004
    p. 10-17
    Presentation of work at congresses

  • Exploring the predictability of MPI messages

     Freitag, Felix; Caubet Serrabou, Jordi; Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose
    IEEE International Parallel and Distributed Processing Symposium
    p. 69-76
    Presentation of work at congresses

  • Exploring the Predictability of MPI Messages

     Freitag, Felix; Caubet Serrabou, Jordi; Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose
    IEEE International Parallel and Distributed Processing Symposium
    p. 10-17
    Presentation of work at congresses

  • Exploring the Predictability of MPI Messages

     Freitag, Felix; Caubet Serrabou, Jordi; Farreras Esclusa, Montserrat; Cortes Rossello, Antonio; Labarta Mancho, Jesus Jose
    Date: 2002-10
    Report

  • Computación de Altas Prestaciones III: Arquitecturas, Compiladores, Sistemas Operativos, Herramientas y Algoritmos, ref. TIC2001-0995-C02-01

     Utrera Iglesias, Gladys Miriam; Valero Cortes, Mateo; Martorell Bofill, Xavier; Muntés Mulero, Víctor; Gil Gómez, Maria Luisa; Ramirez Bellido, Alejandro; Alvarez Martinez, Carlos; Torres Viñals, Jordi; Farreras Esclusa, Montserrat; Herrero Zaragoza, José Ramón; Guitart Fernández, Jordi; Parcerisa Bundó, Joan Manuel; Morancho Llena, Enrique; Salamí San Juan, Esther; Marín Tordera, Eva; Canal Corretger, Ramon
    Competitive project

  • Innovative OpenMP tool for Non-Experts (INTONE)

     Farreras Esclusa, Montserrat; Labarta Mancho, Jesus Jose
    Competitive project
