Towards resilient EU HPC systems: A blueprint

Author
Radojkovic, Petar; Marazakis, M.; Carpenter, P.; Jeyapaul, R.; Gizopoulos, D.; Schulz, M.; Armejach, A.; Ayguade, E.; Canal, R.; Moreto, M.; Salami, B.; Unsal, O.
Type of activity
Report
Date
2020-04
Project funding
Cross-layer early reliability evaluation for the computing continuum
Fast virtual SoC for advanced GPS algorithm evaluation
REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems
Repository
http://hdl.handle.net/2117/330695
URL
https://resilienthpc.eu/
Abstract
This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms and recommend the most effective approaches to employ in large-scale HPC systems. Our guidelines will be useful in the allocation of available resources, as well as guiding researchers and research funding towards the enhancement of resilience approaches with the highest pri...
Citation
Radojkovic, P. [et al.]. Towards resilient EU HPC systems: A blueprint. 2020.
Group of research
CAP - High Performance Computing Group
VIRTUOS - Virtualisation and Operating Systems

Participants