Codes & Software Packages
This page lists the codes and software packages provided by each of the CoEs.
BioExcel
 Applications for biomolecular modelling and simulations
 BioExcel works together with the core developers of widely used tools for biomolecular modeling and simulations:
GROMACS (http://www.gromacs.org) is a molecular dynamics package mainly designed for simulations of proteins, lipids, and nucleic acids. It was originally developed in the Biophysical Chemistry department of the University of Groningen and is now maintained by contributors in universities and research centers worldwide. GROMACS is one of the fastest and most popular software packages available, and can run on central processing units (CPUs) and graphics processing units (GPUs). It is free, open-source software, released under the GNU General Public License (GPL) in earlier versions and under the GNU Lesser General Public License (LGPL) starting with version 4.6. It is aimed at the simulation of large, biologically relevant systems, with a focus on being both efficient and flexible enough to support research on a range of different systems. The program has been used by research groups all around the globe, with several hundred publications based directly or indirectly on it published during the last few years.
HADDOCK is a versatile information-driven flexible docking approach for the modelling of biomolecular complexes. HADDOCK distinguishes itself from ab initio docking methods in that it can integrate information derived from biochemical, biophysical or bioinformatics methods to enhance sampling, scoring, or both. The information that can be integrated is quite diverse: interface restraints from NMR or MS, mutagenesis experiments, or bioinformatics predictions; various orientational restraints from NMR; and, recently, cryo-electron maps. 
Currently, HADDOCK allows the modelling of large assemblies consisting of up to 6 different molecules, which, together with its rich data support, provides a truly integrative modelling platform.
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. CP2K provides a general framework for different modeling methods such as DFT using the mixed Gaussian and plane waves approaches GPW and GAPW. Supported theory levels include DFTB, LDA, GGA, MP2, RPA, semi-empirical methods (AM1, PM3, PM6, RM1, MNDO, …), and classical force fields (AMBER, CHARMM, …). CP2K can perform molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimization, and transition state optimization using the NEB or dimer method. CP2K is written in Fortran 2008 and can be run efficiently in parallel using a combination of multi-threading, MPI, and CUDA. It is freely available under the GPL license, so it is easy to give the code a try and to make modifications as needed.
QM/MM with GROMACS & CP2K: Most biochemical systems, such as enzymes, are too large to be described at any level of ab initio or density functional theory. At the same time, the available molecular mechanics force fields are not sufficiently flexible to model processes in which chemical bonds are broken or formed. To overcome the limitations of a full quantum mechanical description on the one hand, and a full molecular mechanics treatment on the other hand, methods have been developed that treat a small part of the system at the level of quantum chemistry (QM), while retaining the computationally cheaper force field (MM) for the larger part. 
This hybrid QM/MM strategy was originally introduced by Warshel and Levitt more than four decades ago and is illustrated in the figure below.
PMX is a service for users who need to do free energy calculations. Free energy calculations are extremely common in life sciences research. In molecular dynamics studies, such as those investigating how mutations affect protein function, these calculations provide insight into stability and affinity changes. One important branch of free energy calculations involves alchemical transformations such as the mutation of amino acids, nucleic acids or ligand modifications. A challenging aspect of these calculations is the creation of the associated structures and molecular topologies. PMX provides an automated framework for the introduction of amino acid mutations in proteins. Several state-of-the-art force fields are supported that can be used in the GROMACS molecular dynamics package.
The CPMD code is a parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab initio molecular dynamics. CPMD is among the most HPC-efficient codes for quantum molecular dynamics simulations using the Car-Parrinello molecular dynamics scheme. CPMD simulations are usually restricted to systems of a few hundred atoms. In order to extend its domain of applicability to (much) larger biologically relevant systems, a hybrid quantum mechanical/molecular mechanics (QM/MM) interface, employing routines from the GROMOS96 molecular dynamics code, has been developed.
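The QM/MM partition described in the entries above is often written, in the so-called additive coupling scheme, as a sum of three energy terms. The notation below is an illustrative assumption and is not taken from the GROMACS, CP2K or CPMD documentation; the individual codes may use a different (for example subtractive) scheme or additional link-atom corrections.

```latex
E_{\mathrm{total}}
  = E_{\mathrm{QM}}(\text{QM region})
  + E_{\mathrm{MM}}(\text{MM region})
  + E_{\mathrm{QM/MM}}(\text{QM region},\ \text{MM region})
```

Here the first term is evaluated with the quantum chemistry method, the second with the force field, and the third collects the interactions (electrostatic, van der Waals and bonded) that cross the boundary between the two regions.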
 Integrated workflows for portable and flexible solutions including BioBB and CWL
ChEESE
 ChEESE operates in the following areas:
– Urgent seismic simulations
– Faster-Than-Real-Time (FTRT) tsunami simulations
– High-resolution volcanic plume simulation
– Physics-based tsunami-earthquake interaction
– Physics-based probabilistic seismic hazard assessment (PSHA)
– Probabilistic volcanic hazard assessment (PVHA)
– Probabilistic tsunami hazard assessment (PTHA)
– Probabilistic Tsunami Forecast (PTF) for early warning and rapid post-event assessment
– Seismic tomography
– Array-based statistical source detection and restoration, and machine learning from earthquake/volcano slow-earthquakes monitoring
– Geomagnetic forecasts
– High-resolution volcanic ash dispersal forecasts
 Computational Seismology:
The ExaHyPE engine supports simulation of systems of hyperbolic PDEs, such as those stemming from conservation laws. A concrete model for seismic wave propagation problems is being developed within the ExaHyPE project. The model is based on a high-order Discontinuous Galerkin (DG) discretization with local time-stepping and works on octree-structured Cartesian meshes. The activities in this CoE will focus on setting up concrete services based on the ExaHyPE engine and seismic models.
Salvus is a high-performance package for waveform modelling and inversion with applications ranging from laboratory ultrasound studies to planetary-scale seismology. It solves dynamic (visco-)acoustic and elastic wave propagation problems on fully unstructured hypercubic and simplicial meshes in 2 and 3 dimensions using a spectral-element approach.
SeisSol solves seismic wave propagation (elastic, viscoelastic) and dynamic rupture problems on heterogeneous 3D models. SeisSol uses a high-order DG discretization and local time-stepping on unstructured adaptive tetrahedral meshes. Scalable performance at petascale has been demonstrated on up to several thousand nodes (on several supercomputers, e.g., Cori, SuperMUC, Hazel Hen, Shaheen, etc.). Earlier work considered offload schemes that scaled to 8000 nodes on the Tianhe-2 supercomputer (Xeon Phi, Knights Corner).
SPECFEM3D solves linear seismic wave propagation (elastic, viscoelastic, poroelastic, fluid-solid) and dynamic rupture problems in heterogeneous 3D models. SPECFEM3D also implements imaging and FWI for such complex models based on an L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno) algorithm. It is based on a high-order spectral-element (CG) discretization for unstructured hexahedral meshes. It shows scalable performance at petascale (it runs on the largest machines worldwide: Titan and Summit at Oak Ridge, Piz Daint, CURIE, K computer, etc.).
 MHD:
PARODY_PDAF simulates incompressible MHD in a spherical cavity. In addition to the Navier-Stokes equations with an optional Coriolis force, it can also time-step the coupled induction equation for MHD (with an imposed magnetic field or in a dynamo regime), as well as the temperature (and concentration) equation in the Boussinesq framework. It offers the possibility to perform ensemble assimilation experiments, being connected with the parallel data assimilation framework (PDAF) library. It is based on a semi-spectral approach combining finite differences in radius and spherical harmonics, with a semi-implicit second-order time scheme.
XSHELLS simulates incompressible fluids in a spherical cavity. In addition to the Navier-Stokes equation with an optional Coriolis force, it can also time-step the coupled induction equation for MHD (with an imposed magnetic field or in a dynamo regime), as well as the temperature (and concentration) equation in the Boussinesq framework. It is also based on a semi-spectral approach combining finite differences in radius and spherical harmonics, with a semi-implicit second-order time scheme.
 Tsunami Modelling:
T-HySEA solves the 2D shallow water equations in hydrostatic and dispersive versions. T-HySEA is based on a high-order Finite Volume (FV) discretization (hydrostatic) with Finite Differences (FD) for the dispersive version on two-way structured nested meshes in spherical coordinates. Initial conditions can come from the Okada model or an initial deformation, with support for synchronous and asynchronous multi-Okada sources and for rectangular and triangular faults.
L-HySEA solves the 2D shallow water/Savage-Hutter coupled equations, in hydrostatic fully coupled and dispersive weakly coupled versions. L-HySEA is based on a high-order FV discretization (hydrostatic) with FD for the dispersive model, on structured meshes in Cartesian coordinates.
 Physical Volcanology:
ASHEE is a multiphase fluid dynamic model conceived for compressible mixtures composed of gaseous components and solid particle phases. All phases are treated using the Eulerian approach, identifying a solid phase as a class of particles with similar dynamical properties. The physical model is based on the equilibrium-Eulerian approach, while the gas-particle momentum non-equilibrium is approximated by a prognostic equation accurate to the first order. As a result, the model reduces to a single (vector) momentum equation and one energy equation for the mixture (corrected to account for non-equilibrium terms and mixture density fluctuations) and one continuity equation for each gaseous or solid component. In addition, Lagrangian particles are injected in the domain and “two/four-way” coupled with the Eulerian field. Discretization is based on the Finite-Volume method on an unstructured grid. The numerical solution adopts a segregated semi-implicit approach.
FALL3D is a Eulerian model for the atmospheric transport and ground deposition of volcanic tephra (ash). FALL3D solves a set of advection-diffusion-sedimentation (ADS) equations on a structured terrain-following grid using a second-order Finite Differences (FD) explicit scheme.
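As a toy illustration of the finite-volume treatment of the shallow water equations mentioned above for the HySEA codes, the following minimal sketch advances the 1D hydrostatic equations with a first-order Rusanov (local Lax-Friedrichs) flux on a periodic, flat-bottomed domain. It is an assumption-laden teaching example: the function names, grid, and initial hump are hypothetical, and it does not reproduce the high-order schemes, nested meshes, or Okada initial conditions of the actual codes.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)

def flux(h, hu):
    """Physical flux of the 1D shallow water equations for the state (h, hu)."""
    u = hu / h
    return np.array([hu, hu * u + 0.5 * G * h * h])

def rusanov_step(h, hu, dx, dt):
    """Advance one first-order finite-volume step with periodic boundaries."""
    q = np.array([h, hu])                     # conserved variables, shape (2, n)
    q_right = np.roll(q, -1, axis=1)          # state in the neighbouring cell i+1
    f_left = flux(q[0], q[1])
    f_right = flux(q_right[0], q_right[1])
    # Largest local wave speed |u| + sqrt(g h) on each interface i+1/2
    speed = np.maximum(np.abs(q[1] / q[0]) + np.sqrt(G * q[0]),
                       np.abs(q_right[1] / q_right[0]) + np.sqrt(G * q_right[0]))
    f_iface = 0.5 * (f_left + f_right) - 0.5 * speed * (q_right - q)
    # Conservative update: flux through i+1/2 minus flux through i-1/2
    q_new = q - dt / dx * (f_iface - np.roll(f_iface, 1, axis=1))
    return q_new[0], q_new[1]

def simulate(n=200, length=1000.0, t_end=10.0, cfl=0.4):
    """Propagate a small hump of water over a 10 m deep basin (hypothetical setup)."""
    dx = length / n
    x = (np.arange(n) + 0.5) * dx
    h = 10.0 + np.exp(-((x - 0.5 * length) / 50.0) ** 2)
    hu = np.zeros(n)
    t = 0.0
    while t < t_end:
        c = np.max(np.abs(hu / h) + np.sqrt(G * h))
        dt = min(cfl * dx / c, t_end - t)     # CFL-limited time step
        h, hu = rusanov_step(h, hu, dx, dt)
        t += dt
    return x, h, hu
```

Because the update is written in conservative (flux-difference) form, the total water volume is preserved up to round-off, which is the property that makes finite-volume schemes attractive for this class of equations.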
CompBioMed
 The CompBioMed Software Hub addresses the needs of the computational biomedicine research community, which can use the Hub to access the resources developed, aggregated and coordinated by CompBioMed:
– CompBioMed Software: Cardiovascular Medicine
– CompBioMed Software: Molecularly-based Medicine
– CompBioMed Software: Neuro-musculoskeletal Medicine
 Codes:
– Cardiovascular:
Alya, developed by the team of Mariano Vazquez and Guillaume Houzeaux at the Barcelona Supercomputing Centre, performs cardiac electro-mechanics simulations, from tissue to organ level. The simulation involves the solution of a multiscale model using a FEM-based electro-mechanical coupling solver, specifically optimised for the efficient use of supercomputing resources. Alya is available to research users on MareNostrum, ARCHER, and Cartesius; for clinical and industrial users, BSC recommends accessing it as a service, due to the complexity involved in setting up simulations. To this purpose BSC is setting up a spin-off (ELEM Biotech) that will provide commercial software-as-a-service to biomedical industries based on Alya.
HemeLB, developed by the team of Prof Peter Coveney at University College London (UK), is a software pipeline that simulates the blood flow through a stent (or other flow diverting device) inserted in a patient’s brain. The aim is to discover how different stent designs (surface patterns) affect the stress the blood applies to the blood vessel, in particular in the region of the aneurysm being treated. The pipeline also allows the motion of magnetically steered particles, for example coated with drugs, to be simulated and estimates to be made as to where they might statistically end up. The HemeLB setup tool voxelises the geometry at the given resolution, and HemeLB (a lattice-Boltzmann CFD solver) then simulates the fluid flow within that geometry, using the given velocity-time profiles for each inlet. Once complete, the simulation output is analysed using the hemeXtract utility, which can produce images of cross-sectional flow, or 3D shots of the wall shear stress distribution in the geometry using the ParaView visualisation software. HemeLB is installed, optimised, and available to any user with a valid account and CPU time on ARCHER, Cartesius, SuperMUC, Prometheus and Blue Waters. 
The UCL team also provides consulting to biomedical companies and clinical users.
HemoCell, developed by the team of Prof Alfons Hoekstra at the University of Amsterdam (NL), is a high-performance library to simulate the transport properties of dense cellular suspensions, such as blood. It contains a validated material model for red blood cells and additional support for further cell types (white blood cells, platelets). The blood plasma is represented as a continuous fluid simulated with an open-source Lattice Boltzmann Method (LBM) solver. The cells are represented as Discrete Element Method (DEM) membranes coupled to the plasma flow through a tested in-house immersed-boundary implementation. HemoCell is computationally capable of handling a large domain size with a high number of cells (> 10^4-10^6 cells). The code is currently installed and optimised for Cartesius, Lisa, and SuperMUC (Leibniz Supercomputing Centre system), and can be used by anyone with a valid account and CPU allocation on any of these systems.
openBF is an open-source 1D blood flow solver based on a MUSCL finite-volume numerical scheme, written in Julia and released under the Apache 2.0 free software license. The software is developed by Alessandro Melis and Alberto Marzo at the Insigneo Institute at the University of Sheffield (UK). The solution is currently exposed as open source software; it is also installed on SURFsara’s HPCCloud, where it is used for large-scale sensitivity analysis and uncertainty quantification studies.
Palabos is a Lattice Boltzmann Method (LBM) solver, available as open source and massively parallel. The team of Prof Bastien Chopard at the University of Geneva (CH) has specialised it to solve a number of relevant biomedical problems, including simulation of blood flow and bone cement penetration during vertebroplasty. The software has specific features to deal with biomedical problems, such as reading medical images. Palabos was tested on CADMOS BlueGene/Q (Switzerland) and UniGe Baobab (Switzerland).
 Palabos – Vertebroplasty Simulator: This solution, currently in its final stage of development, uses Palabos to provide a vertical solution for the preoperative planning of vertebroplasty. Micro CT images of the damaged vertebral body are converted into an LBM model, which simulates multiple cement injections with different access point and cement volume. The simulation results predict exact filling patterns of the injected cement. Plans of future developments include converting the results into a finite element model, which will predict the increase in biomechanical strength with respect to the untreated vertebra.
 Palabos – Flow Diverter Simulator: This solution, currently in its final stage of development, uses Palabos to provide a vertical solution for the preoperative planning for the insertion of flow diverters. CT scan images of blood vessels with aneurysms or other anomalies are converted into an LBM model. Different types of flow diverters are numerically inserted to test their impact on the blood flow pattern. Simulation output includes wall shear stress distribution in the aneurysm to predict the rate of blood clotting.
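Several of the codes above (HemeLB, HemoCell, Palabos) rest on the Lattice Boltzmann Method. The sketch below shows the basic collide-and-stream cycle of a D2Q9 BGK scheme on a fully periodic grid, using a decaying shear wave as a hypothetical test case. It is a minimal illustration under stated assumptions (single relaxation time, no boundaries, no cells or medical geometries) and does not use or represent the API of any of these packages.

```python
import numpy as np

# D2Q9 lattice: discrete velocities and their weights
C = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)

def equilibrium(rho, ux, uy):
    """Second-order equilibrium distributions, shape (9, nx, ny)."""
    cu = C[:, 0, None, None] * ux + C[:, 1, None, None] * uy
    usq = ux ** 2 + uy ** 2
    return W[:, None, None] * rho * (1 + 3 * cu + 4.5 * cu ** 2 - 1.5 * usq)

def lbm_step(f, tau):
    """One BGK collision followed by streaming with periodic wrap-around."""
    rho = f.sum(axis=0)
    ux = (f * C[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * C[:, 1, None, None]).sum(axis=0) / rho
    f = f - (f - equilibrium(rho, ux, uy)) / tau      # relax towards equilibrium
    for i, (cx, cy) in enumerate(C):
        f[i] = np.roll(f[i], shift=(cx, cy), axis=(0, 1))  # move along velocity i
    return f

def run_shear_wave(nx=32, ny=32, tau=0.8, steps=200):
    """Let a sinusoidal shear wave decay under viscosity (hypothetical test case)."""
    y = np.arange(ny)
    ux0 = 0.01 * np.sin(2 * np.pi * y / ny) * np.ones((nx, 1))
    f = equilibrium(np.ones((nx, ny)), ux0, np.zeros((nx, ny)))
    for _ in range(steps):
        f = lbm_step(f, tau)
    rho = f.sum(axis=0)
    ux = (f * C[:, 0, None, None]).sum(axis=0) / rho
    return rho, ux
```

In lattice units the kinematic viscosity of this scheme is (tau - 0.5) / 3, so the relaxation time is the single knob that sets how quickly the shear wave is damped.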
PolNet is a software tool for the computer simulation of blood flow in realistic microvascular networks imaged with a wide variety of microscopy and clinical imaging techniques. To date, PolNet has contributed to: a) uncovering the relationship between blood flow and blood vessel biology and its importance for correct vascularisation of tissues, and b) developing ways of predicting retinal vascular damage in diabetic retinopathy patients. PolNet facilitates the adoption of cutting-edge computer simulation technology by non-experts in the Biosciences.
InSilicoMRI provides a framework to predict the overheating of a medical device during an MRI scan. This software supports radiofrequency (RF) safety analysis of a passive device exposed to a 3T MRI birdcage coil field, following the directives of the ASTM F2182 standard. The simulation calculates the EM fields, SAR, and thermal heating after 900 s of RF exposure. This application uses Microsoft Azure cloud services to run the simulation.
The SIMULIA Living Heart Human Model is a high-fidelity multiphysics model of a healthy, 4-chamber adult human heart and proximal vasculature. The dynamic response of the Living Heart is governed by realistic electrical, structural, and fluid (blood) flow physics. With this model, medical professionals, researchers, and device manufacturers will be able to rapidly conduct virtual experiments in a highly realistic 3D environment. The Living Heart can readily be used to study cardiac defects or diseased states and explore treatment options by modifying its geometry, loading, or electromechanical properties. 
In addition, medical devices can be inserted into the model to study their influence on cardiac function, validate their efficacy, and predict their reliability under a wide range of operating conditions.
The Binding Affinity Calculator (BAC), developed by the team of Prof Peter Coveney at University College London (UK), is a workflow tool that runs and analyses simulations designed to assess how well drugs bind to their target proteins and the impact of changes to those proteins. It is a collection of scripts which wrap around common molecular dynamics codes to facilitate free energy calculations. It uses ensemble simulations to deliver robust, accurate and precise free energy computations from both alchemical and end-point analysis methodologies. BAC is a fairly complex tool to use, so at the moment the development team at UCL has made it available as part of consulting services or research collaborations. However, EnsembleMD provides user-friendly interfaces to related binding affinity calculation services, which will be made available as an App in the online store of associate partner DNAnexus; a beta version is being used by pharma.
HTMD, developed by the team of Prof Gianni de Fabritiis at the Universitat Pompeu Fabra (ES), is a Python-based programmable environment to prepare, execute, visualize and analyse molecular dynamics simulations on HPC or HTC systems, including AWS. It supports system preparation and building, execution of simulations with different MD codes using adaptive sampling schemes, and generation of Markov state models to analyse simulations. The code is now maintained by Acellera; it is distributed commercially, but it remains free for academic users.
Playmolecule, developed by the team of Prof Gianni de Fabritiis at the Universitat Pompeu Fabra (ES), is an intuitive platform to access a diverse set of web applications for molecular research. 
It is a repository of free best-in-kind applications with a diverse set of solutions such as molecular predictors and modelling tools. Simulations are run on GPUGRID for free or via Amazon AWS. Scalability is provided by Amazon via acecloud, the cloud interfacing software by Acellera.
Visual GEC is a software tool for designing engineered cells and simulating biochemical interactions. The Genetic Engineering of Cells (GEC) software, developed by the Biological Computation team at Microsoft Research (Cambridge, UK), is a modelling tool that can be used to design and simulate synthetic genetic circuits. At the core is a domain-specific programming language for biochemical systems (LBS), originally developed at the University of Edinburgh. The tool supports stochastic and deterministic simulation of the temporal dynamics of chemical reaction networks, as well as spatio-temporal dynamics via reaction-diffusion equations. Parameter inference can also be performed using Metropolis-Hastings Markov chain Monte Carlo with time-series data.
The High Throughput Binding Affinity Calculator (HTBAC) is a scalable solution for adaptive personalised drug discovery. HTBAC uses high-level Python object abstractions for defining simulations, physical systems and ensemble-based free energy protocols. The Runner class, part of the HTBAC abstraction, uses underlying building-blocks middleware developed by the RADICAL team to create and execute multiple concurrent executions of protocols on supercomputing cyberinfrastructures while abstracting and handling execution management and data transfer.
Virtual Assay software provides a framework to run in silico drug trials in populations of human cardiac cell models for predictions of drug safety and efficacy. Virtual Assay starts with well-understood human cellular biology models and modulates the variables to generate a range, or population, of models, which will respond differently to the same inputs. 
These populations are then calibrated against experimental data, retaining only those models in range with experimental observations (the calibrated model populations). Once calibrated, these populations can be used to analyse the effects of different pharmaceutical agents on cellular response at the population level.
Computer Tomography to Strength (CT2S) is an online service developed by the team of Prof Marco Viceconti at the Insigneo Institute at the University of Sheffield (UK), which allows the prediction of the biomechanical strength of a patient’s bone from a clinical CT scan of that bone. The service operates by creating a patient-specific finite element model of the bone, using a state-of-the-art image-processing pipeline. This very precise model of the patient’s anatomy is then examined under a range of highly realistic simulated loading conditions, including walking, running, stair-climbing and falling, and the fracture load is computed in each case. Data summarising the identified fracture strength is returned to the user. The solution is currently exposed as a service, accessible through a web interface; the back-end HPC system currently in use is USFD’s own ShARC. The service is currently provided at cost, with a significant discount for non-sponsored clinical studies. USFD is currently exploring the best marketing strategy.
The Insigneo Bone Tissue Suite is a collection of modelling tools developed by the teams of Prof Marco Viceconti, Dr Shannon Li, and Dr Enrico Dall’Ara at the Insigneo Institute at the University of Sheffield (UK), with the collaboration of Dr Francesc Levrero Florencio (Oxford), Prof Pankaj Pankaj (Edinburgh) and Prof Lee Margetts (Manchester). 
Starting from microCT or NanoCT datasets of bone tissue, the suite provides tools for:
– MicroMesh: automatic generation of Cartesian 8-node hexahedral finite element meshes from microCT data, using either homogeneous or density-based heterogeneous material mapping;
– MicroFE: large-scale micro finite element solver, based on the ParaFEM library, for large-displacement, large-strain simulations of bone tissue micromechanics;
– BoneDVC: Digital Volume Correlation code that computes the displacement field induced in bone tissue specimens subjected to staged compression.
The Insigneo Bone Tissue Suite will enable a complete modelling and validation cycle on very large-scale datasets generated with SnmicroCT with resolutions of up to 4000^3 voxels. The code is installed, optimised, and accessible to any user with a valid account and CPU time on the ShARC and ARCHER HPC systems.
ECAM
 The activity of ECAM focuses on the need for new and improved algorithms and code modules. Recent advances in computational power depend on either massive parallelism, or specialist hardware accelerators, or increasingly both; this means that old legacy codes need to be rewritten to exploit these possibilities and, in many cases, that totally new algorithms have to be implemented. Frameworks, tools, documentation and standards need to be developed to allow better use of the creativity of programmers, and the extraordinary success of many free-software projects in using distributed networks of volunteer programmers needs to be replicated in the sphere of scientific software. In particular, ECAM operates in the following areas:
 Molecular dynamics:
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a very versatile MD engine. The current stable release is LAMMPS stable 29Oct2020. In particular, due to its powerful library and Python interface, it allows great control and easy scripting of any type of simulation. The Particle Insertion approach for alchemical free energy calculations is currently only implemented using the LAMMPS MD engine. The set of patches in this module exists to accommodate the requirements of those modules.
GROMACS (http://www.gromacs.org) is one of the major software packages for the simulation of biological macromolecules. It is aimed at the simulation of large, biologically relevant systems, with a focus on being both efficient and flexible enough to support research on a range of different systems. The program has been used by research groups all around the globe, with several hundred publications based directly or indirectly on it published during the last few years.
OpenPathSampling (OPS) is a Python library to facilitate path sampling algorithms. OPS makes it easy to perform many variants of transition path sampling (TPS) and transition interface sampling (TIS), as well as other useful calculations for rare events, such as committor analysis and flux calculations. In addition, it is a powerful library to build new path sampling methods. OPS is independent of the underlying molecular dynamics engine, and currently has support for OpenMM and GROMACS, as well as an internal engine suitable for 2D toy models.
Nanoscale Molecular Dynamics (NAMD, formerly Not Another Molecular Dynamics Program) is computer software for molecular dynamics simulation, written using the Charm++ parallel programming model. 
It is noted for its parallel efficiency and is often used to simulate large systems (millions of atoms). It has been developed through the collaboration of the Theoretical and Computational Biophysics Group (TCB) and the Parallel Programming Laboratory (PPL) at the University of Illinois at Urbana–Champaign. It was introduced in 1995 by Nelson et al. as a parallel molecular dynamics code enabling interactive simulation by linking to the visualization code VMD. NAMD has since matured, adding many features and scaling beyond 500,000 processor cores. NAMD has an interface to the quantum chemistry packages ORCA and MOPAC, as well as a scripted interface to many other quantum packages. Together with Visual Molecular Dynamics (VMD) and QwikMD, NAMD’s interface provides access to hybrid QM/MM simulations in an integrated, comprehensive, customizable, and easy-to-use suite. NAMD is available as freeware for non-commercial use by individuals, academic institutions, and corporations for in-house business uses.
 Electronic Structure:
The Electronic Structure Library (ESL) was initiated by CECAM (the European Centre for Atomic and Molecular Calculations) to catalyze a paradigm shift away from the monolithic model and promote modularization, with the ambition to extract common tasks from electronic structure codes and redesign them as open-source libraries available to everybody. Such libraries include “heavy-duty” ones that have the potential for a high degree of parallelization and adaptation to novel hardware within them, thereby separating the sophisticated computer science aspects of performance optimization and re-engineering from the computational science done by, e.g., physicists and chemists when implementing new ideas.
ELSI is a software bundle and unified interface for methods that solve or circumvent eigenvalue problems in electronic structure theory, for example in the self-consistent field cycle of density-functional theory, but also elsewhere. An interface to BSEPACK to solve the Bethe-Salpeter Equation is also included. In essence, ELSI allows an electronic structure code to run its eigenvalue and/or density matrix solutions through a single interface, leaving the details of handling the different libraries that are actually used to solve the problem to ELSI. Thus, switching between (say) LAPACK, ELPA, MAGMA, or an O(N) method becomes much simpler (no more need to write customized interfaces for each of them).
Wannier90 is a code for computing the maximally-localized Wannier functions (MLWFs) of a system. It requires a separate electronic structure code to compute and provide information on the Kohn-Sham energy bands. It can operate either as a standalone post-processing utility, reading this information from files, or as a library to be called internally by the electronic structure code. 
Wannier90 has a number of features, including a disentanglement scheme for entangled energy bands, optimized Γ-point routines, plotting of the MLWFs, Wannier interpolation to obtain many spectral and Fermi-surface properties at high resolution in the Brillouin zone, and others.
QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code. Its main applications are electronic structure calculations of molecular, quasi-2D and solid-state systems. Variational Monte Carlo (VMC), diffusion Monte Carlo (DMC) and a number of other advanced QMC algorithms are implemented. Orbital space auxiliary field QMC (AFQMC) has recently been added. By directly solving the Schrödinger equation, QMC methods offer greater accuracy than methods such as density functional theory, but at the cost of much greater computational expense. QMCPACK is written in C++ and designed with the modularity afforded by object-oriented programming. It makes extensive use of template metaprogramming to achieve high computational efficiency. Due to the modular architecture, the addition of new wavefunctions, algorithms, and observables is relatively straightforward. For parallelization QMCPACK utilizes a fully hybrid (OpenMP, CUDA)/MPI approach to optimize memory usage and to take advantage of the growing number of cores per SMP node or GPUs. High parallel and computational efficiencies are achievable on the largest supercomputers. Finally, QMCPACK utilizes standard file formats for input and output in XML and HDF5 to facilitate data exchange.
Quantum ESPRESSO is a suite for first-principles electronic-structure calculations and materials modeling, distributed for free and as free software under the GNU General Public License. It is based on density-functional theory, plane wave basis sets, and pseudopotentials (both norm-conserving and ultrasoft). 
ESPRESSO is an acronym for opEnSource Package for Research in Electronic Structure, Simulation, and Optimization.[2][3] The core plane wave DFT functions of QE are provided by the PWscf component, PWscf previously existed as an independent project. PWscf (PlaneWave SelfConsistent Field) is a set of programs for electronic structure calculations within density functional theory and density functional perturbation theory, using plane wave basis sets and pseudopotentials. The software is released under the GNU General Public License. The latest version QE6.6 was released on 5 Aug 2020.SIESTA(Spanish Initiative for Electronic Simulations with Thousands of Atoms) is an original method and its computer program implementation, to perform efficient electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids. SIESTA’s efficiency stems from the use of strictly localized basis sets and from the implementation of linearscaling algorithms which can be applied to suitable systems. A very important feature of the code is that its accuracy and cost can be tuned in a wide range, from quick exploratory calculations to highly accurate simulations matching the quality of other approaches, such as planewave and allelectron methods. SIESTA’s backronym is Spanish Initiative for Electronic Simulations with Thousands of Atoms. Since 13 May 2016, with the 4.0 version announcement, SIESTA is released under the terms of the GPL opensource license. Source packages and access to the development versions can be obtained from the DevOps platform on GitLab.AiiDAis an opensource Python infrastructure to help researchers with automating, managing, persisting, sharing and reproducing the complex workflows associated with modern computational science and all associated data. AiiDA is built to support and streamline the four core pillars of the ADES model: Automation, Data, Environment, and Sharing.
 Quantum Dynamics:
PaPIM is a code for the calculation of equilibrated system properties (observables). Some properties can be obtained directly from the distribution function of the system, while properties that depend on the exact dynamics of the system, such as the structure factor [Mon2], infrared spectrum [Beu] or reaction rates, can be obtained from the evolution of appropriate time correlation functions. PaPIM samples either the quantum (Wigner) or classical (Boltzmann) density functions and computes approximate quantum and classical correlation functions. The code is highly parallelized and suitable for use on large HPC machines. Its modular structure makes it easy to update or change any of its modules, and the coded functionalities can be used independently of each other. The code is specifically designed with simplicity and readability in mind, so that users can easily implement their own functionalities. It has been used extensively for the calculation of the infrared spectrum of the CH5+ cation in the gas phase, and new calculations on the water dimer and protonated water dimer systems were started recently.
Quantics is a suite of programs for molecular quantum dynamics simulations. The package is able to set up and propagate a wavepacket using the MCTDH method [Beck]. Numerically exact propagation is also possible for small systems using a variety of standard integration schemes [Lefo], as is the solution of the time-independent Schrödinger equation using Lanczos diagonalisation. The program can also be used to generate a ground state wavefunction using energy relaxation (i.e. propagation in imaginary time), and with “improved relaxation” it is even possible to generate (low-lying) excited states. Within the Quantics package there are also programs to propagate density operators (by solving the Liouville-von Neumann equation for open or closed systems) [Mey], a program for fitting complicated multidimensional potential energy functions, programs for determining bound or resonance energies by filter diagonalisation, programs for determining the parameters of a vibronic coupling Hamiltonian, and many more. Recent developments include the use of Gaussian wavepacket based methods (G-MCTDH) and interfaces to quantum chemistry programs such as Gaussian and Molpro, which allow direct dynamics calculations using the vMCG method [Ric]. The following modules are extensions of Quantics functionalities developed at E-CAM Extended Software Development Workshops.
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. CP2K provides a general framework for different modeling methods such as DFT using the mixed Gaussian and plane waves approaches GPW and GAPW. Supported theory levels include DFTB, LDA, GGA, MP2, RPA, semi-empirical methods (AM1, PM3, PM6, RM1, MNDO, …), and classical force fields (AMBER, CHARMM, …). CP2K can do simulations of molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimization, and transition state optimization using the NEB or dimer method. CP2K is written in Fortran 2008 and can be run efficiently in parallel using a combination of multi-threading, MPI, and CUDA. It is freely available under the GPL license. It is therefore easy to give the code a try, and to make modifications as needed.
Q-Chem is a general-purpose electronic structure package featuring a variety of established and new methods implemented using innovative algorithms that enable fast calculations of large systems on various computer architectures, from laptops and regular lab workstations to midsize clusters and HPCC, using density functional and wavefunction based approaches. It offers an integrated graphical interface and input generator; a large selection of functionals and correlation methods, including methods for electronically excited states and open-shell systems; solvation models; and wavefunction analysis tools. In addition to serving the computational chemistry community, Q-Chem also provides a versatile code development platform.
CPMD is a parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab initio molecular dynamics. CPMD is currently the most HPC-efficient code that allows performing quantum molecular dynamics simulations by using the Car-Parrinello molecular dynamics scheme. CPMD simulations are usually restricted to systems of a few hundred atoms. In order to extend its domain of applicability to (much) larger, biologically relevant systems, a hybrid quantum mechanical/molecular mechanics (QM/MM) interface, employing routines from the GROMOS96 molecular dynamics code, has been developed.
ElVibRot is a general quantum dynamics code using curvilinear coordinates and a numerical kinetic energy operator (with Tnum). It covers (i) vibrational levels and intensities for floppy molecular systems, (ii) wavepacket propagation with or without a time-dependent Hamiltonian, and (iii) quantum gates and optimal control.
 Meso and multiscale modelling:
MP2C
ESPResSo++ is a software package for the scientific simulation and analysis of coarse-grained atomistic or bead-spring models as they are used in soft matter research. ESPResSo++ has a modern C++ core and a flexible Python user interface. ESPResSo and ESPResSo++ have common roots; however, their development is independent and they are different software packages. ESPResSo++ is free, open-source software published under the GNU General Public License (GPL).
DL_MESO is a general purpose mesoscale simulation package developed by Michael Seaton for CCP5 and UKCOMES under a grant provided by EPSRC. It is written in Fortran 2003 and C++ and supports both Lattice Boltzmann Equation (LBE) and Dissipative Particle Dynamics (DPD) methods. It is supplied with its own Java-based Graphical User Interface (GUI) and is capable of both serial and parallel execution.
Grand Canonical Adaptive Resolution Scheme (GC-AdResS) is gaining recognition throughout the scientific community. Its main idea is to couple two simulation boxes together and combine the advantages of classical atomistic simulations with those of coarse-grained simulations. The goal of the pilot project is to develop a library or recipe with which GC-AdResS can be implemented in any MD code. The current focus is on adjusting the version of GC-AdResS implemented in GROMACS. The long-term goal of this project is to promote GC-AdResS and encourage the community to use it as a tool for multiscale simulations and analysis.
EoCoE
 EoCoE drives its efforts into 5 scientific Exascale challenges in the low-carbon sectors of energy: Meteorology, Materials, Water, Wind and Fusion. This multidisciplinary effort will harness innovations in computer science and mathematical algorithms within a tightly integrated co-design approach to overcome performance bottlenecks and to anticipate future HPC hardware developments. Challenging applications in selected energy sectors will be created at unprecedented scale, demonstrating the potential benefits to the energy industry, such as accelerated design of storage devices, high-resolution probabilistic wind and solar forecasting for the power grid and quantitative understanding of plasma core-edge interactions in ITER-scale tokamaks.
 Wind4energy:
Alya is a high performance computational mechanics code to solve complex coupled multi-physics / multi-scale / multi-domain problems, which mostly come from the engineering realm. Among the different physics solved by Alya are incompressible/compressible flows, non-linear solid mechanics, chemistry, particle transport, heat transfer, turbulence modeling, electrical propagation, etc. Alya was designed from scratch for massively parallel supercomputers, and the parallelization embraces four levels of the computer hierarchy. 1) A substructuring technique with MPI as the message passing library is used for distributed memory supercomputers. 2) At the node level, both loop and task parallelisms are considered using OpenMP as an alternative to MPI. Dynamic load balance techniques have been introduced as well to better exploit computational resources at the node level. 3) At the CPU level, some kernels are also designed to enable vectorization. 4) Finally, accelerators like GPUs are also exploited through OpenACC pragmas or with CUDA to further enhance the performance of the code on heterogeneous computers. Multi-physics coupling is achieved following a multi-code strategy, relating different instances of Alya. MPI is used to communicate between the different instances, where each instance solves a particular physics. This technique enables asynchronous execution of the different physics. Thanks to a careful programming strategy, coupled problems can be solved while retaining the scalability properties of the individual instances. The code is one of the two CFD codes of the Unified European Applications Benchmark Suite (UEABS) as well as of the Accelerator benchmark suite of PRACE.
waLBerla is a massively parallel simulation framework. It contains efficient, hardware-specific compute kernels to get optimal performance on today’s supercomputing architectures. waLBerla employs a block-structured partitioning of the simulation domain including support for grid refinement. These grid data structures make it easy to integrate various data parallel algorithms like multigrid, CG, or phase-field models. waLBerla uses the lattice Boltzmann method (LBM), which is an alternative to classical Navier-Stokes solvers for computational fluid dynamics simulations. All of the common LBM collision models are implemented (SRT, TRT, MRT). Additionally, a coupling to the rigid body physics engine pe is available. waLBerla is written in C++, which allows for modular and portable software design without having to make any performance trade-offs.
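To make the SRT (single relaxation time) collision model named above concrete, here is a minimal sketch for a single cell of the standard D2Q9 lattice. It is an illustration only and is not taken from waLBerla: the function names (`moments`, `equilibrium`, `srt_collide`) are hypothetical, and a production code would operate on whole blocks of cells with optimized kernels.

```python
# D2Q9 lattice: discrete velocities and weights (standard textbook values).
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def moments(f):
    """Return density and velocity (rho, ux, uy) of one cell's populations."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for the D2Q9 lattice."""
    usq = ux * ux + uy * uy
    feq = []
    for w, (cx, cy) in zip(WEIGHTS, VELOCITIES):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def srt_collide(f, tau):
    """Relax all populations towards equilibrium with one relaxation time tau."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]
```

The collision step leaves density and momentum of the cell unchanged and only redistributes the populations; TRT and MRT differ in that they relax different moments at different rates.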
 Meteo4energy:
ESIAS-Chem is a tool for generating and controlling ultra-large ensembles of chemistry transport models for stochastic integration.
ESIAS-Meteo is a tool for generating and controlling ultra-large ensembles of numerical weather forecast models for stochastic integration.
EURAD-IM system consists of 5 major parts: the meteorological driver WRF, the pre-processors EEP and PREP for preparation of anthropogenic emission data and observations, the EURAD-IM Emission Model EEM, and the chemistry transport model EURAD-IM (Hass et al., 1995, Memmesheimer et al., 2004). EURAD-IM is a Eulerian mesoscale chemistry transport model involving advection, diffusion, chemical transformation, wet and dry deposition and sedimentation of tropospheric trace gases and aerosols. It includes 3D-Var and 4D-Var chemical data assimilation (Elbern et al., 2007) and is able to run in nesting mode.
 Materials4energy:
MetalWalls is a classical molecular dynamics software dedicated to the simulation of electrochemical systems. Its main originality is the inclusion of a series of methods which allow a constant electrical potential to be applied to the electrode materials. It also allows the simulation of bulk liquids or solids using the polarizable ion model and the aspherical ion model. MetalWalls is designed to be used on high-performance computers and it has already been employed in a number of scientific publications. It was for example used to study the charging mechanism of supercapacitors (Merlet et al., 2012), nanoelectrowetting (Choudhuri et al., 2016) and water desalination devices (Simoncelli et al., 2018).
QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code. Its main applications are electronic structure calculations of molecular, quasi-2D and solid-state systems. Variational Monte Carlo (VMC), diffusion Monte Carlo (DMC) and a number of other advanced QMC algorithms are implemented. Orbital-space auxiliary-field QMC (AFQMC) has recently been added. By directly solving the Schrödinger equation, QMC methods offer greater accuracy than methods such as density functional theory, but at the cost of much greater computational expense. QMCPACK is written in C++ and designed with the modularity afforded by object-oriented programming. It makes extensive use of template metaprogramming to achieve high computational efficiency. Due to the modular architecture, the addition of new wavefunctions, algorithms, and observables is relatively straightforward. For parallelization, QMCPACK uses a fully hybrid (OpenMP, CUDA)/MPI approach to optimize memory usage and to take advantage of the growing number of cores per SMP node or GPUs. High parallel and computational efficiencies are achievable on the largest supercomputers. Finally, QMCPACK uses standard file formats for input and output (XML and HDF5) to facilitate data exchange.
gDFTB is a simulation tool to calculate electronic transport at both equilibrium and non-equilibrium conditions. The crystalline orientation, length, and arrangement of electrodes have a very weak influence on the electronic characteristics of the considered atomic wires. The wire width is found to be the most effective geometric aspect determining the number of conduction channels. The obtained conductance oscillation and linear current-voltage curves are interpreted. To analyze the conduction mechanism in detail, the transmission channels and their decomposition into atomic orbitals are calculated in copper and gold single point contacts. gDFTB is a technique designed for applications in nanotechnology: applications pertinent to systems whose components are both intrinsically molecular in nature, requiring treatment by quantum chemical techniques, and intrinsically macroscopic in nature, requiring treatment of integrated solid state electronics.
libNEGF is a Fortran 2008 library for Non-Equilibrium Green’s Functions. It can be used to efficiently solve open boundary condition problems for quantum transport in devices.
KMC/DMC are general purpose programs for the simulation of chemical reactions taking place at crystal surfaces. The simulation method used is a Discrete Event Simulation with continuous time. In the literature this is commonly called a Dynamic Monte Carlo simulation (DMC) or Kinetic Monte Carlo simulation (KMC). The general purpose nature of the program is visible in a clear separation between model and (simulation) method. The simulation model is specified in terms of surface structure and changing patterns, reflecting the reactions taking place at the surface. Several methods can be applied to the model, differing only in simulation speed and memory use.
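The "discrete event simulation with continuous time" behind KMC/DMC can be sketched with the textbook direct (Gillespie-type) method: choose the next event with probability proportional to its rate and advance time by an exponentially distributed waiting time. The code below is a hedged illustration, not the KMC/DMC programs' own implementation. The function names and the toy adsorption/desorption model with rate constants `k_ads` and `k_des` are assumptions made for the example.

```python
import math
import random


def kmc_step(rates, rng):
    """Pick the next event and waiting time for a list of non-negative rates.

    An event is chosen with probability rate / total, and the waiting time
    is exponentially distributed with mean 1 / total.
    """
    total = sum(rates)
    if total <= 0.0:
        raise ValueError("no event can occur: total rate is zero")
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = None
    for index, rate in enumerate(rates):
        if rate <= 0.0:
            continue  # impossible events are never selected
        chosen = index  # last possible event is the round-off fallback
        cumulative += rate
        if threshold < cumulative:
            break
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


def simulate(site_count, k_ads, k_des, t_end, seed=0):
    """Toy surface model: sites fill by adsorption and empty by desorption."""
    rng = random.Random(seed)
    occupied, t = 0, 0.0
    while True:
        rates = [k_ads * (site_count - occupied), k_des * occupied]
        event, dt = kmc_step(rates, rng)
        if t + dt > t_end:
            return occupied
        t += dt
        occupied += 1 if event == 0 else -1
```

For example, `simulate(100, 1.0, 1.0, 10.0)` returns the number of occupied sites at the end of one stochastic trajectory. The model (the rate list) is kept separate from the method (`kmc_step`), mirroring the separation between model and method described above.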
 Water4energy:
ParFlow is a numerical model that simulates the hydrologic cycle from the bedrock to the top of the plant canopy. It integrates three-dimensional groundwater flow with overland flow and plant processes using physically-based equations to rigorously simulate fluxes of water and energy in complex real-world systems. ParFlow is a computationally advanced model that can run on laptops and supercomputers and has been used in hundreds of studies evaluating hydrologic processes from the hillslope to the continental scale.
SHEMAT-Suite is a finite-difference open-source code for simulating coupled flow, heat and species transport in porous media. The code, written in Fortran 95, originates from geoscientific research in the fields of geothermics and hydrogeology. It comprises: (1) a versatile handling of input and output, (2) a modular framework for subsurface parameter modeling, (3) a multi-level OpenMP parallelization, (4) parameter estimation and data assimilation by stochastic approaches (Monte Carlo, Ensemble Kalman filter) and by deterministic Bayesian approaches based on automatic differentiation for calculating exact (truncation-error-free) derivatives of the forward code.
ExaTerr is a new development that aims at building a common software platform for both SHEMAT-Suite and ParFlow. This platform will be based on Kokkos, a software technology strongly pushed by the US DoE which has performance portability at its heart.
 Fusion4energy:
Gysela is a drift-kinetic semi-Lagrangian 4D code for ion turbulence simulation. The Gyrokinetic SEmi-LAgrangian (GYSELA) code solves 4D drift-kinetic equations for ion temperature gradient (ITG) driven turbulence in a cylinder (r, θ, z). The code validation is performed with the slab ITG mode that only depends on the parallel velocity. The code uses a semi-Lagrangian numerical scheme, which exhibits good properties of energy conservation in the non-linear regime as well as an accurate description of fine spatial scales. The code has been validated in the linear and non-linear regimes. The GYSELA code is found to be stable over long simulation times (more than 20 times the linear growth rate of the most unstable mode), including for cases with a high resolution mesh (δr ∼ 0.1 Larmor radius, δz ∼ 10 Larmor radius).
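The idea of a semi-Lagrangian scheme can be illustrated on the simplest possible case, 1D advection at constant velocity on a periodic grid: each grid point is traced back along its characteristic and the old solution is interpolated at the departure point. This sketch is not GYSELA's scheme. The function name is hypothetical, and linear interpolation is used here only for brevity, whereas production codes typically use higher-order interpolation.

```python
import math


def semi_lagrangian_step(values, velocity, dt, dx):
    """Advance a periodic 1D field by one time step of constant-velocity advection."""
    n = len(values)
    shift = velocity * dt / dx  # displacement measured in grid cells
    new_values = []
    for i in range(n):
        departure = i - shift            # foot of the characteristic
        left = math.floor(departure)     # grid point to the left of the foot
        frac = departure - left          # fractional distance from that point
        a = values[left % n]             # periodic wrap-around
        b = values[(left + 1) % n]
        new_values.append((1.0 - frac) * a + frac * b)
    return new_values
```

When the displacement is a whole number of cells the field is shifted exactly, and the step size is not restricted by how many cells a characteristic crosses. Linear interpolation does, however, smooth the solution for fractional displacements.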
 Workflow Tools:
Melissa relies on an elastic and fault-tolerant parallel client/server communication scheme, inheriting a three-tier architecture from Melissa [37]. The server gathers background states from all ensemble members. New observations are assimilated into these background states to produce a new analysis state for each member. These analysis states are distributed by the server to the runners, which take care of progressing the ensemble members up to the next assimilation cycle. The distribution of members to runners is adapted dynamically by the server according to the runner workloads, following a list scheduling algorithm. The runners and the server are parallel codes that can run with different numbers of processes. They exchange member states through N×M communication patterns for efficiency.
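A list scheduling algorithm of the kind mentioned above can be sketched as the classic greedy rule: take the members in list order and hand each one to the runner with the smallest accumulated load. This is a static illustration under assumed, known per-member costs. The function name and the cost values are hypothetical, and the source does not show how Melissa's server actually measures load or reassigns members at run time.

```python
import heapq


def list_schedule(member_costs, runner_count):
    """Greedily assign members (in list order) to the least-loaded runner.

    Returns a dict mapping runner index to the list of member indices.
    """
    heap = [(0.0, runner) for runner in range(runner_count)]
    heapq.heapify(heap)
    assignment = {runner: [] for runner in range(runner_count)}
    for member, cost in enumerate(member_costs):
        load, runner = heapq.heappop(heap)      # least-loaded runner so far
        assignment[runner].append(member)
        heapq.heappush(heap, (load + cost, runner))
    return assignment
```

With costs `[3, 1, 2, 2]` and two runners, members 0 and 3 go to runner 0 and members 1 and 2 to runner 1, giving both runners a total load of 5.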
 I/O Tools:
PDI (Portable Data Interface) is a declarative API that decouples application codes from the Input/Output strategy to use. Its plugin system supports the selection of the best-suited existing I/O library through a configuration file in each part of the code, depending on the hardware available, the I/O pattern, the problem size, etc.
SIONlib is a scalable I/O library for parallel access to task-local files. The library not only supports writing and reading binary data to or from several thousands of processors into a single or a small number of physical files, but also provides global open and close functions to access SIONlib files in parallel. SIONlib provides different interfaces: parallel access using MPI, OpenMP, or their combination, and sequential access for post-processing utilities. The SIONlib library is enabled on SuperMUC and LINUX Cluster. It has been built for IBM MPI and Intel MPI.
FTI (Fault Tolerance Interface) was initially designed and developed in the context of a collaboration with Titech. FTI provides an API and a library for checkpoint/restart at application level. The API is at data structure level: the programmer decides which data to protect and when the protection should be performed. The library provides transparent multi-level checkpoint-restart with 4 or 5 levels of checkpointing.
IME (DDN’s Infinite Memory Engine) is a scale-out, software-defined, flash storage platform that streamlines the data path for application I/O. IME interfaces directly to applications and secures I/O via a data path that eliminates file system bottlenecks. With IME, architects can realize true flash-cache economics with a storage architecture that separates capacity from performance.
 Solvers Tools:
PSBLAS is a library for parallel linear algebra computation on sparse matrices. PSBLAS enables easy, efficient, and portable implementations of parallel iterative solvers for linear systems. The interface is designed around a Single Program Multiple Data programming model on distributed-memory machines.
MLD2P4 (MultiLevel Domain Decomposition Parallel Preconditioners Package based on PSBLAS) is a package of parallel algebraic multilevel domain decomposition preconditioners in Fortran 95. It implements various versions of one-level additive and of multilevel additive and hybrid Schwarz algorithms. In the multilevel case, a purely algebraic approach is applied to generate coarse-level corrections, so that no geometric background is needed concerning the matrix to be preconditioned. The matrix is assumed to be square, real or complex, with a symmetric sparsity pattern. MLD2P4 has been designed to provide scalable and easy-to-use preconditioners in the context of the PSBLAS (Parallel Sparse Basic Linear Algebra Subprograms) computational framework and can be used in conjunction with the Krylov solvers available in this framework. MLD2P4 enables the user to easily specify different aspects of a generic algebraic multilevel Schwarz preconditioner, thus making it possible to search for the “best” preconditioner for the problem at hand. The package has been designed employing object-oriented techniques, using Fortran 95, with interfaces to additional third-party libraries such as UMFPACK, SuperLU and SuperLU_Dist, which can be exploited in building multilevel preconditioners. Single and double precision implementations of MLD2P4 are available for both the real and the complex case, and can be used through a single interface. The parallel implementation is based on a Single Program Multiple Data (SPMD) paradigm for distributed-memory architectures; the inter-process data communication is based on MPI and is managed mainly through PSBLAS.
MUMPS (“MUltifrontal Massively Parallel Solver”) is a package for solving systems of linear equations of the form Ax = b, where A is a square sparse matrix that can be either unsymmetric, symmetric positive definite, or general symmetric, on distributed memory computers. MUMPS implements a direct method based on a multifrontal approach which performs a Gaussian factorization.
AGMG solves systems of linear equations with an aggregation-based algebraic multigrid method. It is expected to be efficient for large systems arising from the discretization of scalar second order elliptic PDEs. The method is however purely algebraic and may be tested on any problem. (No information has to be supplied besides the system matrix and the right-hand side.) AGMG has been designed to be easy to use by non-experts (in a black box fashion). It is available both as a software library for FORTRAN or C/C++ programs, and as an Octave/Matlab function. The Octave/Matlab version accepts real and complex matrices, whereas the FORTRAN/C/C++ library is available for double precision and double complex arithmetic. For this library, several levels of parallelism are provided: multi-threading (multi-core acceleration of sequential programs), MPI-based, or hybrid mode (MPI + multi-threading).
Chameleon is a framework written in C which provides routines to solve dense general systems of linear equations, symmetric positive definite systems of linear equations and linear least squares problems, using LU, Cholesky, QR and LQ factorizations. Real arithmetic and complex arithmetic are supported in both single precision and double precision. It supports Linux and Mac OS X machines (mainly tested on Intel x86-64 and IBM Power architectures). Chameleon is based on the PLASMA source code but is not limited to shared-memory environments and can exploit multiple GPUs. Chameleon is interfaced in a generic way with the StarPU, PaRSEC and QUARK runtime systems. This makes it possible to analyze, in a unified framework, how sequential task-based algorithms behave with different runtime system implementations. Using Chameleon with the StarPU or PaRSEC runtime systems makes it possible to exploit GPUs through kernels provided by cuBLAS, as well as clusters of interconnected nodes with distributed memory (using MPI). Computation of very large systems with dense matrices on a cluster of nodes is still being experimented with and stabilized; stable performance is not expected with the current version using MPI.
MAFIX
PaSTiX (Parallel Sparse matriX package) is a scientific library that provides a high performance parallel solver for very large sparse linear systems based on direct methods. Numerical algorithms are implemented in single or double precision (real or complex) using LLt, LDLt and LU with static pivoting (for non-symmetric matrices having a symmetric pattern). This solver also provides an adaptive blockwise iLU(k) factorization that can be used as a parallel preconditioner, using approximated supernodes to build a coarser block structure of the incomplete factors.
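Several of the solvers above are direct methods built on matrix factorizations. As a small dense illustration of the LLt (Cholesky) approach for symmetric positive definite systems, the sketch below factorizes A = L Lᵗ and then solves Ax = b by forward and backward substitution. It is plain serial Python with hypothetical function names, and it does not reflect the sparse, parallel or task-based algorithms used by MUMPS, Chameleon or PaSTiX.

```python
import math


def cholesky(a):
    """Return the lower-triangular factor L with A = L * L^T (A must be SPD)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def solve_spd(a, b):
    """Solve A x = b for symmetric positive definite A via Cholesky."""
    lower = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    # Backward substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / lower[i][i]
    return x
```

Once the factor L is available, additional right-hand sides cost only the two triangular solves, which is one reason direct solvers separate the factorization phase from the solve phase.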
ESiWACE
 The Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE) enables global storm and eddy resolving weather and climate simulations on the upcoming (pre-)Exascale supercomputers. The ESiWACE projects develop and support specific software packages concerning computing and storage aspects. More detailed information is listed below.
NEMO, standing for “Nucleus for European Modelling of the Ocean”, is a state-of-the-art modelling framework for research activities and forecasting services in ocean and climate sciences, developed in a sustainable way by a European consortium. The NEMO ocean model has 3 major components:
 NEMO-OCE models the ocean {thermo}dynamics and solves the primitive equations
 NEMO-ICE (SI3: Sea-Ice Integrated Initiative) models sea-ice {thermo}dynamics, brine inclusions and subgrid-scale thickness variations
 NEMO-TOP (Tracers in the Ocean Paradigm) models the {on,off}line oceanic tracers transport and biogeochemical processes (using PISCES)
OASIS3-MCT coupler is software that allows synchronized exchanges of coupling information between numerical codes representing different components of the climate system. Current OASIS developers are CERFACS (Toulouse, France) and Centre National de la Recherche Scientifique (Paris, France). OASIS3-MCT, the version of the OASIS coupler interfaced with the Model Coupling Toolkit (MCT) from the Argonne National Laboratory, offers a fully parallel implementation of coupling field regridding and exchange. Low intrusiveness, portability and flexibility are key design concepts of OASIS3-MCT, as for all previous OASIS versions. OASIS3-MCT is a coupling library that needs to be linked to the component models, with the main function of interpolating and exchanging the coupling fields between these components. OASIS3-MCT supports coupling of general two-dimensional fields. Unstructured grids and 3D grids are also supported using a one-dimensional representation of the two- or three-dimensional structures.
Cylc is a Python-based workflow engine and meta-scheduler. The workflow engine Cylc (“silk”) manages a set of dependent tasks that need to run in a given order and deals with exceptions. It specialises in continuous workflows of cycling tasks such as those used in weather and climate forecasting and research (i.e. tasks can repeat at particular time intervals and can be triggered off the wall clock time as well as other tasks). Cylc is also easy to use with non-cycling systems. Cylc was created by Hilary Oliver at NIWA. Its core team now includes Hilary as well as members of the Modelling Infrastructure Support Systems Team at the Met Office. Cylc was developed as a generic tool to help with the increased complexity of workflows used in the weather and climate communities and to replace the increasingly complex script-based solutions typically used within the community. It is currently being used for a very wide range of requirements, from research to real-time operations including ensemble prediction systems. Cylc is also used for a wide variety of workflows such as generation of input data, assimilation of observational data, modelling, post-processing and commercial product generation.
XIOS, or XML-IO-Server, is a library dedicated to I/O management in climate codes. XIOS manages output of diagnostics and other data produced by climate component codes into files and offers temporal and spatial post-processing operations on this data. XIOS aims at simplifying the I/O management by minimizing the number of subroutines to be called and by supporting a maximum of online processing of the data.
ICON (Icosahedral Nonhydrostatic Weather and Climate Model) modelling framework is a joint project between the German Weather Service / Deutscher Wetterdienst (DWD) and the Max Planck Institute for Meteorology (MPI-M) for developing a unified next-generation global numerical weather prediction and climate modelling system. The ICON model was introduced into DWD’s operational forecast system in January 2015. The ICON modelling system is developed to obtain a new model system with the following capabilities:
 Unified model for climate research and operational numerical weather prediction
 Common infrastructure for atmosphere and ocean models
 Consistent and conservative air and tracer transport
 Parameterization packages for scales from ~100 km for long term coupled climate simulations to ~1 km for cloud (atm.) or eddy (ocean) resolving regional simulations.
 Quasi uniform grid resolution with optional regional refinement. Currently 1way or 2way nesting can be used for atm. simulations.
 High scalability to run on largest German and European HPC machines
 Portability
IFS (the Integrated Forecast System) is a global numerical weather prediction system jointly developed and maintained by the European Centre for Medium-Range Weather Forecasts (ECMWF) based in Reading, England, and Météo-France based in Toulouse. The version of the IFS run at ECMWF is often referred to as the “ECMWF” or the “European model” in North America, to distinguish it from the American GFS. It comprises a spectral atmospheric model with a terrain-following vertical coordinate system coupled to a 4D-Var data assimilation system. In 1997 the IFS became the first operational forecasting system to use 4D-Var. Both ECMWF and Météo-France use the IFS to make operational weather forecasts, but using a different configuration and resolution (the Météo-France configuration is referred to as ARPEGE). It is one of the predominant global medium-range models in general use worldwide; its most prominent rivals in the 6 to 10 day medium range include the American Global Forecast System (GFS), the Canadian Global Environmental Multiscale Model (GEM and GDPS) and the UK Met Office Unified Model.
Dynamico, the IPSL icosahedral dynamical core, is first of all a hydrostatic dynamical core designed for numerical consistency and scalability, and is being integrated into IPSL-CM alongside the current production dynamical core, LMDZ5. In the last couple of years, it has been extended to fully compressible, non-hydrostatic dynamics. Support for fully unstructured meshes and limited-area domains is planned or under development.
 Support to other models, e.g. UM from the UK Met Office
Excellerat
 Application codes are the core of Excellerat projects, since they make it possible to achieve cutting-edge results for engineering objectives. A number of codes are officially supported within the services provided by the Excellerat Center of Excellence. The goal of EXCELLERAT is to enable the European engineering industry to advance towards Exascale technologies and to create a single entry point to services and knowledge for all stakeholders of HPC for engineering. In order to achieve this goal, EXCELLERAT brings together key players from industry, research and HPC to provide all necessary services and software:
Alya: Emission prediction of internal combustion (IC) and gas turbine (GT) engines. Active flow control of aircraft aerodynamics including synthetic jet actuators. Coupled simulation of fluid and structure mechanics for fatigue and fracture.
AVBP: Combustion instabilities and emission prediction. Explosion in confined spaces.
CODA: Design process and simulation of fully equipped aeroplanes. CFD coupling with computational structural mechanics including elastic effects.
FEniCS is a popular open-source computing platform for solving PDEs. Adjoint optimization in external aerodynamics shape optimization.
Nek5000 is a fast and scalable high-order solver for computational fluid dynamics. Wing with three-dimensional wing tip. High-fidelity simulation of rotating parts.
TPLS is an open-source program for simulation of two-phase flows. Flow modelling such as oil and gas flows in long-distance pipelines or refinery distillation columns, liquid cooling of microelectronic devices, carbon capture and cleaning processes, water treatment plants, blood flows in arteries, and enzyme interactions.
PAAKAT is a high-performance computing in-situ analysis tool.
UQit is a Python toolbox for uncertainty quantification.
Vistle is a software environment that integrates simulations on supercomputers, post-processing and parallel interactive visualization in immersive virtual environments.
HiDALGO
 The technology evolution in HiDALGO integrates scientific objectives into a platform, which documents the success of the individual project developments and generates the required impact to establish a sustainable Centre of Excellence. The aspects of the technology evolution can be divided into the following parts:
 – Seamless integration of HPC and HPDA technology.
– Increase application scalability by optimising or porting the involved kernels.
– Development of the intelligent HiDALGO platform: Intelligent workload management is a major asset of HiDALGO.
– Improve data management and analytics capabilities for HPC and HPDA environments.
They operate in the following areas:
Agent-based Modelling:
AMoS (Agent-based Molecular Simulations) is a system developed to estimate the motion of macromolecules consisting of amino acids. AMoS is a multi-agent system that imitates the macromolecule in a real-world environment. The main goal is to represent every atom of the molecule with an intelligent software agent. In computer science, a software agent is a piece of software that acts on behalf of a user or other program. Such “action on behalf of” implies the ability to decide when (and if) an action is appropriate. The idea is that agents are not strictly invoked for a task, but activate themselves. Intelligent agents are characterised by autonomy, mobility, communication and interaction with each other. The user describes the system s/he desires to simulate and AMoS returns as output the estimated motion of the input macromolecule. AMoS is based on the deterministic and multidisciplinary method of simulation known as Molecular Dynamics (MD). MD is used in different kinds of systems with varying levels of detail, ranging from quantum mechanics to molecular mechanics, and gives the motion of macromolecules utilising known energy functions (force fields) [Lifson (1968), Warshel (1970)].
RepastHPC: A platform for large-scale agent-based modeling. It is an agent-based modeling and simulation toolkit based on the Message Passing Interface (MPI).
MASON is a fast discrete-event multi-agent simulation library core in Java, designed to be the foundation for large custom-purpose Java simulations, and also to provide more than enough functionality for many lightweight simulation needs. MASON contains both a model library and an optional suite of visualization tools in 2D and 3D.
MASON is a joint effort between George Mason University’s Evolutionary Computation Laboratory and the GMU Center for Social Complexity, and was designed by Sean Luke, Gabriel Catalin Balan, Keith Sullivan, and Liviu Panait, with help from Claudio Cioffi-Revilla, Sean Paus, Keith Sullivan, Daniel Kuebrich, Joey Harrison, and Ankur Desai.
SUMO (Simulation of Urban MObility) is an open-source program (licensed under GPL2) for traffic simulation. Its simulation model is microscopic, that is, each vehicle is explicitly modeled, has its own route and moves individually over the network. It is mainly developed by the Institute of Transportation Systems at the German Aerospace Center. Among other features, it supports different types of vehicles, roads with several lanes, traffic lights, a graphical interface to view the network and the entities being simulated, and interoperability with other applications at runtime through an API called TraCI. Moreover, the tool is considered to be fast, and it also offers a version without a graphical interface, in which the simulation is accelerated by putting aside visual concerns and overheads. It is possible to point out almost all specified features: vehicles stopped at the traffic light as well as a long vehicle entering an intersection. This tool was crucial in this work. First, it allows loading different maps (described in XML files) in order to test various scenarios with vehicles and traffic lights. Then, with the simulation itself there is no need to waste time implementing the dynamics of many vehicles and traffic lights, so the evaluation of algorithms can start sooner. Finally, interoperability with other applications allows each agent to be bound to an entity in SUMO, so that changes in the dynamics of traffic lights, for instance, can be seen in SUMO’s graphical interface.
FLEE is an agent-based modelling toolkit which is purpose-built for simulating the movement of individuals across geographical locations.
Flee is currently used primarily for modelling the movements of refugees and internally displaced persons (IDPs).
Flee is released periodically under a BSD 3-clause license.
Apache Spark is a unified engine designed for large-scale distributed data processing, on premises in data centers or in the cloud. Spark provides in-memory storage for intermediate computations, making it much faster than Hadoop MapReduce. It incorporates libraries with composable APIs for machine learning (MLlib), SQL for interactive queries (Spark SQL), stream processing (Structured Streaming) for interacting with real-time data, and graph processing (GraphX).
Apache Flink is a distributed processing engine and a scalable data analytics framework. You can use Flink to process data streams at a large scale and to deliver real-time analytical insights about your processed data with your streaming application.
Flink is designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. Furthermore, Flink provides communication, fault tolerance, and data distribution for distributed computations over data streams.
Flink applications process streams of events as unbounded or bounded data sets. Unbounded streams have no defined end and are continuously processed. Bounded streams have an exact start and end, and can be processed as a batch. In terms of time, Flink can process real-time data as it is generated as well as data stored in storage filesystems. In CSA, Flink is used for unbounded, real-time stream processing.
Apache Storm is an open-source distributed real-time computational system for processing data streams. What Hadoop does for batch processing, Apache Storm does for unbounded streams of data, in a reliable manner.
Apache Storm is able to process over a million jobs on a node in a fraction of a second.
It is integrated with Hadoop to harness higher throughputs.
It is easy to implement and can be integrated with any programming language.
Storm was developed by Nathan Marz as a BackType project; BackType was later acquired by Twitter in 2011. In 2013, Twitter made Storm public by putting it on GitHub. Storm then entered the Apache Software Foundation in the same year as an incubator project, delivering high-end applications. Since then, Apache Storm has been fulfilling the requirements of Big Data Analytics.
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.
Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.
A summary of core features:
 a powerful N-dimensional array
 lots of routines for indexing, slicing, transposing, …
 amazing interface to C, via LuaJIT
 linear algebra routines
 neural network and energy-based models
 numeric optimization routines
 Fast and efficient GPU support
 Embeddable, with ports to iOS and Android backends
COVISE stands for COllaborative VIsualization and Simulation Environment. It is an extendable distributed software environment that integrates simulation, post-processing and visualization functionalities in a seamless manner. From the beginning, COVISE was designed for collaborative working, allowing engineers and scientists to be spread across a network infrastructure.
In COVISE an application is divided into several processing steps, which are represented by COVISE modules. These modules, being implemented as separate processes, can be arbitrarily spread across different heterogeneous machine platforms. Major emphasis was put on the usage of high-performance infrastructures such as parallel and vector computers and fast networks.
COVISE rendering modules support virtual environments ranging from workbenches over powerwalls and curved screens up to full domes or CAVEs. Users can thus analyze their datasets intuitively in a fully immersive environment through state-of-the-art visualization techniques, including volume rendering and fast sphere rendering. Physical prototypes or experiments can be included in the analysis process through Augmented Reality techniques.
FEniCS is a collection of software for automated solution of PDEs using FEM. FEniCS is a popular open-source (LGPLv3) computing platform for solving partial differential equations (PDEs). FEniCS enables users to quickly translate scientific models into efficient finite element code. With the high-level Python and C++ interfaces to FEniCS, it is easy to get started, but FEniCS also offers powerful capabilities for more experienced programmers. FEniCS runs on a multitude of platforms ranging from laptops to high-performance clusters.
MAX
 Codes: The open-source MAX flagship codes implement state-of-the-art algorithms for quantum mechanical materials simulations, based on Density Functional Theory and beyond. All the codes are now being ported to heterogeneous (GPU-based) architectures:
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for first-principles electronic-structure calculations and materials modeling at the nanoscale, distributed for free and as free software under the GNU General Public License. It is based on density-functional theory, plane wave basis sets, and pseudopotentials (both norm-conserving and ultrasoft). ESPRESSO is an acronym for opEn-Source Package for Research in Electronic Structure, Simulation, and Optimization.
The core plane wave DFT functions of QE are provided by the PWscf component, which previously existed as an independent project. PWscf (Plane-Wave Self-Consistent Field) is a set of programs for electronic structure calculations within density functional theory and density functional perturbation theory, using plane wave basis sets and pseudopotentials. The software is released under the GNU General Public License.
The latest version, QE 6.6, was released on 5 Aug 2020.
SIESTA is a first-principles materials simulation program based on density-functional theory (DFT). It was one of the first codes to enable the treatment of large systems with first-principles electronic-structure methods, which opened up new research avenues in many disciplines.
YAMBO is an open-source code that implements Many-Body Perturbation Theory (MBPT) methods (such as GW and BSE), which allow for accurate prediction of fundamental properties such as band gaps of semiconductors, band alignments, defect quasi-particle energies, optics and out-of-equilibrium properties of materials, including nanostructured systems.
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating ground-state as well as excited-state properties of solids within the context of density functional theory (DFT). A key difference with respect to the other MAX codes, and indeed most other DFT codes, lies in the treatment of all electrons on the same footing. This makes it possible to also calculate the core states and to investigate effects in which these states change.
FLEUR is based on the full-potential linearised augmented plane wave method, a well-established scheme often considered to provide the most accurate DFT results and used as a reference for other methods. The FLEUR family consists of several codes and modules, including a versatile DFT code for the ground-state properties of multicomponent magnetic one-, two- and three-dimensional solids. A focus of the code is on non-collinear magnetism, determination of exchange parameters, and spin-orbit related properties (topological and Chern insulators, Rashba and Dresselhaus effect, magnetic anisotropies, Dzyaloshinskii-Moriya interaction).
The SPEX code implements many-body perturbation theory (MBPT) for the calculation of the electronic excitation properties of solids. It includes different levels of GW approaches to calculate the electronic self-energy, including a relativistic quasiparticle self-consistent GW approach. The experimental KKRnano code, designed for highest parallel scaling, makes it possible to utilize current supercomputers to their full extent and is applicable to dense-packed crystals.
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. CP2K provides a general framework for different modeling methods such as DFT using the mixed Gaussian and plane waves approaches GPW and GAPW. Supported theory levels include, among many others, DFTB, LDA, RPA, semi-empirical methods and classical force fields.
BigDFT is an electronic structure pseudopotential code that employs Daubechies wavelets as a computational basis, designed for usage on massively parallel architectures. It features high-precision cubic-scaling DFT functionalities enabling treatment of molecular, slab-like as well as extended systems, and has efficiently supported hardware accelerators such as GPUs since 2009.
AiiDA is a Python materials informatics framework to manage, store, share, and disseminate the workload of high-throughput computational efforts, while providing an ecosystem for materials simulations where codes are automatically optimised on the relevant hardware platforms, and complex scientific workflows involving different codes and datasets can be seamlessly implemented and shared.
 Libraries: The open-source, domain-specific libraries developed within MAX for the materials modelling domain are:
CheSS: One of the most important tasks in electronic structure codes is the calculation of the density matrix. If not handled properly, this task can easily lead to a bottleneck that limits the performance of the code or even renders big calculations prohibitively expensive.
CheSS is a library that was designed with the goal of enabling electronic structure calculations for very big systems. It is capable of working with sparse matrices, which naturally arise if big systems are treated with a localized basis. Therefore, it is possible to calculate the density matrix in O(N), i.e., the computational cost only increases linearly with the system size.
The CheSS solver uses an expansion based on Chebyshev polynomials to calculate matrix functions (such as the density matrix or the inverse of the overlap matrix), thereby exploiting the sparsity of the input and output matrices. It works best for systems with a finite HOMO-LUMO gap and a small spectral width. CheSS exhibits a two-level parallelization using MPI and OpenMP and can scale to many thousands of cores. It has been converted into a stand-alone library starting from the original codebase within BigDFT. At the moment, it is coupled to the two MAX flagship codes BigDFT and SIESTA.
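The idea of evaluating a matrix function through a Chebyshev expansion can be illustrated with a small dense example. The following is a minimal sketch and not the CheSS implementation or its API: the function names are hypothetical, matrices are plain nested lists instead of sparse structures, the spectral bounds `e_min` and `e_max` are assumed to be supplied by the caller, and a finite-temperature Fermi function stands in for the density-matrix function.

```python
import math


def matmul(x, y):
    """Dense product of two small matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def chebyshev_coefficients(f, degree):
    """Chebyshev coefficients of f on [-1, 1], sampled at Chebyshev nodes."""
    n = degree + 1
    thetas = [math.pi * (j + 0.5) / n for j in range(n)]
    return [2.0 / n * sum(f(math.cos(t)) * math.cos(k * t) for t in thetas)
            for k in range(n)]


def matrix_function(h, f, e_min, e_max, degree=40):
    """Approximate f(h) for a symmetric matrix h with spectrum in [e_min, e_max]."""
    n = len(h)
    center = 0.5 * (e_max + e_min)
    half_width = 0.5 * (e_max - e_min)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Shift and scale h so that its eigenvalues lie inside [-1, 1].
    hs = [[(h[i][j] - center * identity[i][j]) / half_width for j in range(n)]
          for i in range(n)]
    coeffs = chebyshev_coefficients(lambda x: f(half_width * x + center), degree)
    # Series: c0/2 * T0 + c1 * T1 + ..., with T0 = I and T1 = hs.
    t_prev, t_curr = identity, hs
    result = [[0.5 * coeffs[0] * identity[i][j] + coeffs[1] * hs[i][j]
               for j in range(n)] for i in range(n)]
    for k in range(2, degree + 1):
        # Three-term recurrence: T_{k} = 2 * hs * T_{k-1} - T_{k-2}.
        prod = matmul(hs, t_curr)
        t_next = [[2.0 * prod[i][j] - t_prev[i][j] for j in range(n)]
                  for i in range(n)]
        result = [[result[i][j] + coeffs[k] * t_next[i][j] for j in range(n)]
                  for i in range(n)]
        t_prev, t_curr = t_curr, t_next
    return result


def fermi(beta, mu):
    """Fermi function with inverse temperature beta and chemical potential mu."""
    return lambda e: 1.0 / (1.0 + math.exp(beta * (e - mu)))


# Toy two-level Hamiltonian with eigenvalues -1 and +1.
h = [[0.0, 1.0], [1.0, 0.0]]
density = matrix_function(h, fermi(beta=2.0, mu=0.0), e_min=-1.5, e_max=1.5)
```

Only matrix products and additions appear in the recurrence, which is why such an expansion can take advantage of sparse matrices; the dense lists used here are purely for readability.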
The performance of CheSS has been benchmarked against PEXSI and (Sca)LAPACK for the calculation of the density matrix and the inverse of the overlap matrix, respectively. CheSS is the most efficient method, as demonstrated with more details and performance figures in the publication “Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library”.
LAXLib & FFTXLib: One of the most important obstacles to keeping the codes up to date with hardware is the programming style based on old-style (i.e., non object-oriented) languages. Programming styles in community codes are often naive and lack modularity and flexibility. Disentangling such codes is therefore essential for implementing new features or simply refactoring the application in order to run efficiently on new architectures. Rewriting one of these codes from scratch is not an option, because the communities behind these codes would be disrupted. One possible approach that allows the code to evolve is to progressively encapsulate the functions and subroutines, breaking up the main application into small (possibly weakly dependent) parts.
This strategy was followed by Quantum ESPRESSO: two main types of kernels were isolated in independent directories and proposed as candidates for domain-specific libraries for third-party applications.
The first library, called LAXlib, contains all the low-level linear algebra routines of Quantum ESPRESSO, and in particular those used by the Davidson solver (e.g., the Cannon algorithm for the matrix-matrix product). LAXlib also contains a mini-app that makes it possible to evaluate the features of an HPC interconnect by measuring the linear algebra routines contained therein.
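The Cannon algorithm mentioned above can be illustrated with a small serial simulation. This is a sketch and not LAXlib code: the function names are hypothetical, the q x q process grid is emulated with nested lists rather than MPI ranks, and the matrix dimension is assumed to be divisible by q.

```python
def matmul(x, y):
    """Dense product of two small matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def split(m, q):
    """Cut an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    b = len(m) // q
    return [[[row[j * b:(j + 1) * b] for row in m[i * b:(i + 1) * b]]
             for j in range(q)] for i in range(q)]


def cannon(a, b, q):
    """Multiply a and b on an emulated q x q process grid."""
    bs = len(a) // q
    blocks_a, blocks_b = split(a, q), split(b, q)
    # Initial skew: block row i of A moves left by i, block column j of B moves up by j.
    blocks_a = [[blocks_a[i][(j + i) % q] for j in range(q)] for i in range(q)]
    blocks_b = [[blocks_b[(i + j) % q][j] for j in range(q)] for i in range(q)]
    blocks_c = [[[[0] * bs for _ in range(bs)] for _ in range(q)] for _ in range(q)]
    for _ in range(q):
        # Each emulated process multiplies the two blocks it currently holds.
        blocks_c = [[add(blocks_c[i][j], matmul(blocks_a[i][j], blocks_b[i][j]))
                     for j in range(q)] for i in range(q)]
        # Circular shift: A blocks move one step left, B blocks one step up.
        blocks_a = [[blocks_a[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        blocks_b = [[blocks_b[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    # Reassemble the block grid into a single matrix.
    return [sum((blocks_c[i][j][r] for j in range(q)), [])
            for i in range(q) for r in range(bs)]


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
product = cannon(a, b, q=2)
```

In a distributed setting each shift corresponds to a nearest-neighbour exchange, so every process only ever stores one block of each matrix at a time.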
The second library encapsulates all the FFT-related functions, including the drivers for several different architectures. The FFTXlib library is self-contained and can be built without any dependencies on the remaining part of the Quantum ESPRESSO suite. Similarly, FFTXlib contains a mini-app that mimics the FFT cycle for the SCF calculation of the charge density, which makes it possible to tune the parallelization parameters of Quantum ESPRESSO. This mini-app has also been used to test the new implementation using the MPI-3 non-blocking collectives.
SIRIUS is a domain-specific library for electronic structure calculations. It implements pseudopotential plane wave (PP-PW) and full potential linearized augmented plane wave (FP-LAPW) methods and is designed for GPU acceleration of popular community codes such as Exciting, Elk and Quantum ESPRESSO. SIRIUS is used in production at CSCS to enable QE on GPUs. The library is open-source (BSD 2-clause licence) and is freely available. SIRIUS is written in C++11 with MPI, OpenMP and CUDA/ROCm programming models. SIRIUS is organised as a collection of classes that abstract away the different building blocks of the DFT self-consistency cycle.
COSMA is a parallel, high-performance, GPU-accelerated, matrix-matrix multiplication algorithm and library implementation that is communication-optimal for all combinations of matrix dimensions, number of processors and memory sizes, without the need for any parameter tuning. COSMA is written in C++11 with MPI, OpenMP and CUDA/ROCm programming models. The library is open-source (BSD 3-clause licence) and is freely available.
SpFFT is a 3D FFT library for sparse frequency domain data written in C++ with support for MPI, OpenMP, CUDA, and ROCm. It was originally intended for transforms of data with a spherical cutoff in the frequency domain, as required by some computational materials science codes.
For distributed computations, SpFFT uses a slab decomposition in the space domain and a pencil decomposition in the frequency domain (all sparse data within a pencil must be on one rank). The library is open-source (BSD 3-clause licence) and is freely available.
 Features and algorithms: MAX codes have been strongly improved and partially renovated to fully exploit the potential offered by the forthcoming pre-exascale and exascale architectures.
 Workflows: AiiDA, the high-throughput environment, is built in a modular fashion that allows the support of any other simulation code via plugins.
NOMAD

NOMAD creates, collects, stores, and cleanses computational materials science data, computed by the most important materials-science codes available today.
Furthermore, the NOMAD Laboratory develops tools for mining this data in order to find structure, correlations, and novel information that could not be discovered from studying smaller data sets. Thus, NOMAD fosters the discovery of new materials. Most importantly, NOMAD leads the Open Science Movement in materials science, supported by the global community, by making all data freely accessible.
The Artificial Intelligence Toolkit enables scientists and engineers to decide which materials are useful for specific applications or which new materials should be the focus of future studies. The Toolkit uses machine learning and artificial intelligence approaches to sort all available material data, to identify correlations and structures, and to detect trends and anomalies.
PerMedCoE
 The PerMedCoE Software Observatory is a benchmarking platform for the assessment of the quality and efficacy of cell-level software applied to different PerMed contexts, as well as new developments that emerge in the community.
This platform is a meeting point for the different communities: simulation tools developers, experts in HPC at the hardware and software level, and final users.
The key core applications are currently licensed under different free software and/or open-source licenses that enable their adaptation and scale-up (PhysiCell: The 3-Clause BSD License, MaBoSS: GNU Lesser General Public License, some COBRA software: GNU General Public License 3, CellNOpt: GNU General Public License, version 2). The PerMedCoE will use the appropriate licenses to follow the CoE's full commitment to open science and open software, based on a thorough licensing analysis, such as GPLv3/Apache/MIT. Furthermore, the PerMedCoE will consider that the commitment to code sharing extends to the use of collaborative platforms for code development and to the use of standard software development processes and practices, including formal code review and testing.
COBRA (Constraint-Based Reconstruction and Analysis) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger).
CellNOpt (from CellNetOptimizer; a.k.a. CNO) is software used for creating logic-based models of signal transduction networks using different logic formalisms (Boolean, Fuzzy, or differential equations). CellNOpt uses information on signaling pathways encoded as a Prior Knowledge Network, and trains it against high-throughput biochemical data to create cell-specific models. CellNOpt is freely available under the GPL license in the R and Matlab languages.
It can also be accessed through a Python wrapper, and a Cytoscape plugin called CytoCopter provides a graphical user interface.
MaBoSS is simulation software for continuous-time Boolean modeling. The model is described by a network, where nodes have Boolean states. Transitions between the node states are governed by logical equations, with an associated rate (a real number ∈ [0,∞[).
PhysiCell (physics-based multicellular simulator) is an open source agent-based simulator that provides both the stage and the players for studying many interacting cells in dynamic tissue microenvironments. It builds upon a multi-substrate biotransport solver to link cell phenotype to multiple diffusing substrates and signaling factors. It includes biologically-driven sub-models for cell cycling, apoptosis, necrosis, solid and fluid volume changes, mechanics, and motility “out of the box.” The C++ code has minimal dependencies, making it simple to maintain and deploy across platforms. PhysiCell has been parallelized with OpenMP, and its performance scales linearly with the number of cells.
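The continuous-time Boolean dynamics described for MaBoSS above, where each node flips with a state-dependent rate, can be sketched with a Gillespie-style stochastic simulation. This is an illustration under stated assumptions and not MaBoSS code or its input format: the two-node network, the `RATES` table and the `simulate` function are all hypothetical.

```python
import random

# Hypothetical two-node network. Each node has a rate for switching on ("up")
# and for switching off ("down"), both given as functions of the current state.
RATES = {
    "A": {"up": lambda s: 1.0, "down": lambda s: 0.0},
    "B": {"up": lambda s: 2.0 if s["A"] else 0.0, "down": lambda s: 0.0},
}


def simulate(initial_state, max_time, rng):
    """Return a list of (time, state) pairs for one stochastic trajectory."""
    state = dict(initial_state)
    time = 0.0
    trajectory = [(time, dict(state))]
    while True:
        # Rate at which each node would flip away from its current value.
        flips = {node: (r["down"](state) if state[node] else r["up"](state))
                 for node, r in RATES.items()}
        total = sum(flips.values())
        if total == 0.0:
            break  # no transition is possible: the state is a fixed point
        # The waiting time to the next transition is exponentially distributed.
        time += rng.expovariate(total)
        if time > max_time:
            break
        # Pick the node to flip with probability proportional to its rate.
        threshold = rng.random() * total
        cumulative = 0.0
        for node, rate in flips.items():
            cumulative += rate
            if threshold < cumulative:
                state[node] = not state[node]
                break
        trajectory.append((time, dict(state)))
    return trajectory


trajectory = simulate({"A": False, "B": False}, max_time=1e9, rng=random.Random(0))
```

In this toy network A can only switch on, and B can only switch on once A is active, so every trajectory that is given enough time ends in the state where both nodes are on.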
POP
 POP has experience with both discrete and continuum modelling software. POP also assessed a continuum modelling code, aiding with the performance improvement of the Urban Heat Island solver by Rheologic.
Code List:
Extrae, a profiling tool developed by BSC, can very quickly produce very large trace files, which can take several minutes to load into Paraver, the tool used to view the traces. These trace files can be kept to a more manageable size by using Extrae’s API to turn the tracing on and off as needed. For example, the user might only want to record data for two or three time steps. This API was previously only available for codes developed in C, C++ and Fortran, but now it also supports Python codes using MPI.
Paraver is a very flexible data browser that is part of the CEPBA-Tools toolkit. Its analysis power is based on two main pillars. First, its trace format has no semantics; extending the tool to support new performance data or new programming models requires no changes to the visualizer, only that such data be captured in a Paraver trace. The second pillar is that the metrics are not hardwired in the tool but programmed. To compute them, the tool offers a large set of time functions, a filter module, and a mechanism to combine two time lines. This approach allows displaying a huge number of metrics with the available data. To capture the expert’s knowledge, any view or set of views can be saved as a Paraver configuration file. After that, recomputing the view with new data is as simple as loading the saved file. The tool has been demonstrated to be very useful for performance analysis studies, giving much more detail about an application’s behaviour than most performance tools. Some Paraver features are the support for:
 Detailed quantitative analysis of program performance
 Concurrent comparative analysis of several traces
 Customizable semantics of the visualized information
 Cooperative work, sharing views of the tracefile
 Building of derived metrics
Dimemas is a performance analysis tool for message-passing programs. It enables the user to develop and tune parallel applications on a workstation, while providing an accurate prediction of their performance on the parallel target machine. The Dimemas simulator reconstructs the time behavior of a parallel application on a machine modeled by a set of performance parameters. Thus, performance experiments can be done easily. The supported target architecture classes include networks of workstations, single and clustered SMPs, distributed memory parallel computers, and even heterogeneous systems.
For communication, a linear performance model is used, but some non-linear effects such as network conflicts are taken into account. The simulator allows specifying different task-to-node mappings.
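A linear communication model of the kind mentioned above can be written in a few lines. This sketch assumes the common latency-plus-bandwidth form, where the time to send a message is a fixed start-up latency plus the message size divided by the bandwidth. The function and parameter names are hypothetical and are not taken from Dimemas, and the non-linear effects such as network conflicts are left out.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear model: start-up latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def communication_time(message_sizes, latency_s, bandwidth_bytes_per_s):
    """Total time for a sequence of messages sent one after another."""
    return sum(transfer_time(size, latency_s, bandwidth_bytes_per_s)
               for size in message_sizes)


# Compare two hypothetical target machines for the same message pattern.
messages = [1_000_000] * 10
fast_network = communication_time(messages, latency_s=1e-6, bandwidth_bytes_per_s=10e9)
slow_network = communication_time(messages, latency_s=50e-6, bandwidth_bytes_per_s=1e9)
```

Changing only the two machine parameters while keeping the message pattern fixed is the kind of what-if experiment a trace-driven simulator makes cheap.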
Dimemas generates trace files that are suitable for Paraver, enabling the user to conveniently examine any performance problems indicated by a simulator run.
The analysis module performs critical path analysis, reporting the total CPU usage of different code blocks as well as their importance for the program execution time. Based on a statistical evaluation of synthetically perturbed traces and architectural parameters, the importance of different performance parameters and the benefits of particular code optimizations can be analyzed.
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronization – and offers guidance in exploring their causes.
Cube, which is used as performance report explorer for Scalasca and Score-P, is a generic tool for displaying a multi-dimensional performance space consisting of the dimensions (i) performance metric, (ii) call path, and (iii) system resource. Each dimension can be represented as a tree, where non-leaf nodes of the tree can be collapsed or expanded to achieve the desired level of granularity. In addition, Cube can display multi-dimensional Cartesian process topologies.
Score-P: Scalable Performance Measurement Infrastructure for Parallel Codes.
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.
It has been created in the German BMBF project SILC and the US DOE project PRIMA and will be maintained and enhanced in a number of follow-up projects such as LMAC, Score-E, and HOPSA. Score-P is developed under a BSD 3-Clause License and governed by a meritocratic governance model.
Extra-P is an automatic performance-modeling tool that supports the user in the identification of scalability bugs. A scalability bug is a part of the program whose scaling behavior is unintentionally poor, that is, much worse than expected. A performance model is a formula that expresses a performance metric of interest, such as execution time or energy consumption, as a function of one or more execution parameters, such as the size of the input problem or the number of processors.
Vampir provides an easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.
The combined handling and visualization of instrumented and sampled event traces generated by Score-P enables an outstanding performance analysis capability for highly parallel applications. Current developments also include the analysis of memory and I/O behavior that often impacts an application’s performance.
TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, Python.
TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements as well as eventbased sampling. All C++ language features are supported including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API.
TAU’s profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.MAQAO(Modular Assembly Quality Analyzer and Optimizer) is a performance analysis and optimization framework operating at binary level with a focus on core performance.
Its main goal is to guide application developers through the optimization process by means of synthetic reports and hints.
MAQAO mixes both dynamic and static analyses based on its ability to reconstruct high level structures such as functions and loops from an application binary.
Since MAQAO operates at binary level, it is agnostic with regard to the language used in the source code and does not require recompiling the application to perform analyses.
MAQAO has also been designed to concurrently support multiple architectures. Currently the Intel64 and Xeon Phi architectures are implemented.
The main modules of MAQAO are LProf, a sampling-based lightweight profiler offering results at both function and loop levels, CQA, a static analyser assessing the quality of the code generated by the compiler, and ONE View, a supervising module responsible for invoking the others and aggregating their results.
Other modules, currently in beta version, allow performing value profiling (VProf) and decremental analysis (DECAN).
The PyPOP package is designed to make it easy to perform application performance analyses based on the POP methodology. The primary goals of PyPOP are:
 Easy calculation of POP metrics.
 High quality figure generation.
 Easy access to underlying data and statistics (using Pandas).
 Flexible and extensible design.
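To give a feel for what "POP metrics" means, the sketch below computes three of the basic multiplicative efficiencies of the POP methodology from per-process useful computation times. It does not use the PyPOP API: the function name `pop_efficiencies` and the input layout are assumptions made for this illustration only, and the definitions used (load balance = mean/max useful time, communication efficiency = max useful time/runtime, parallel efficiency = their product) are the commonly stated ones.

```python
from statistics import mean


def pop_efficiencies(useful_times, runtime):
    """Return (load_balance, communication_efficiency, parallel_efficiency).

    useful_times: time each process spent in useful computation (seconds).
    runtime:      wall-clock runtime of the parallel region (seconds).
    Hypothetical helper for illustration; not part of PyPOP.
    """
    if not useful_times or runtime <= 0:
        raise ValueError("need at least one process and a positive runtime")
    avg_useful = mean(useful_times)
    max_useful = max(useful_times)
    # Load balance: how evenly useful work is spread over the processes.
    load_balance = avg_useful / max_useful
    # Communication efficiency: share of the runtime that the busiest
    # process spends computing rather than communicating or waiting.
    communication_efficiency = max_useful / runtime
    # Parallel efficiency is the product of the two factors above.
    parallel_efficiency = load_balance * communication_efficiency
    return load_balance, communication_efficiency, parallel_efficiency


if __name__ == "__main__":
    lb, ce, pe = pop_efficiencies([8.0, 6.0, 7.0, 7.0], runtime=10.0)
    print(f"load balance={lb:.3f} comm. eff.={ce:.3f} parallel eff.={pe:.3f}")
```

In practice the per-process useful times would come from a trace or profile rather than being typed in by hand.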
SimGrid is a framework for developing simulators of distributed applications that execute on distributed platforms, which can in turn be used to prototype, evaluate and compare relevant platform configurations, system designs, and algorithmic approaches. SimGrid provides ready-to-use models and APIs to simulate popular distributed computing platforms (commodity clusters, wide-area and local-area networks, peers over DSL connections, data centers, etc.). As a result, SimGrid has served as the foundational technology for developing simulators and obtaining experimental results for a wide range of distributed computing domains: Grid computing, P2P computing, Cloud computing, Fog computing, Volunteer computing, HPC with MPI, MapReduce. SimGrid is accurate, scalable, and usable.
 Accurate: SimGrid’s simulation models have been theoretically and experimentally evaluated and validated.
 Scalable: SimGrid’s simulation models and their implementations are fast and have a low memory footprint, making it possible to run SimGrid simulations quickly on a single machine.
 Usable: SimGrid is free software (LGPL license) available on Linux / Mac OS X / Windows, and allows users to write simulators in C++, C, Python, or Java.
Darshan is a scalable HPC I/O characterization tool. Darshan is designed to capture an accurate picture of application I/O behavior, including properties such as patterns of access within files, with minimum overhead. The name is taken from a Sanskrit word for “sight” or “vision”.
Darshan can be used to investigate and tune the I/O behavior of complex HPC applications. In addition, Darshan’s lightweight design makes it suitable for full time deployment for workload characterization of large systems. We hope that such studies will help the storage research community to better serve the needs of scientific computing.
Darshan was originally developed on the IBM Blue Gene series of computers deployed at the Argonne Leadership Computing Facility, but it is portable across a wide variety of platforms including the Cray XE6, Cray XC30, and Linux clusters. Darshan routinely instruments jobs using up to 786,432 compute cores on the Mira system at ALCF.
The MUST software consists of three individual packages:
1. PnMPI
2. GTI
3. MUST
PnMPI is responsible for the basic infrastructure and collecting data by intercepting all MPI calls of the target application. GTI provides the tool structure and the MUST package performs the correctness checking. All three packages are configured and built together using CMake and should only be used with the specific compiler and MPI library used in that build process.
The two main use cases for MUST are application development and the porting of an application to a new system. MUST can single out new errors and those not manifesting in an application crash. It can also detect violations of the MPI standard on the target system.
MUST provides checks for the following classes of errors:
 Constants and integer values.
 Communicator usage.
 Datatype usage.
 Group usage.
 Operation usage.
 Request usage.
 Leak checks (MPI resources not freed before calling MPI_Finalize).
 Type mismatches.
 Overlapping buffers passed to MPI.
 Deadlocks resulting from MPI calls.
 Basic checks for thread level usage (MPI_Init_thread).
ARCHER supports a diverse range of applications and simulation software. The major users of the system are materials scientists, climate scientists, physicists, engineers, and bioscientists, but we also support medical research and industrial simulations, amongst many others.
The expertise of ARCHER staff plays a key role in allowing researchers to exploit the computing power available to gain the insights required and drive their investigations forward. This expertise is available every day through the ARCHER Helpdesk and a huge range of ARCHER Training (both in person and online). ARCHER recognises that software development is fundamental to computational research, and researchers can also apply for funding to support their software development through the eCSE Programme. This provides staff funding for software development projects (typically 6 months to 1 year); the technical staff can be provided within the research group of the applicant, from the ARCHER team, or even by external experts.
 Computational Fluid Dynamics
SOWFA is a high-fidelity numerical simulation tool for wind farms developed at NREL (U.S.A.). It can be used for model validation, controller validation, analysis and research of flows inside wind farms, evaluation of wind turbines within wind farms, and wind farm design. The purpose of high-fidelity simulation is to provide reference data to assess engineering tools within the project.
The DROPS software package is a CFD tool for simulating two-phase flows.
It is being developed at the IGPM and the Institute of Scientific Computing at the RWTH Aachen. Part of the development was funded by the German Research Foundation (DFG) via the Collaborative Research Center SFB 540.
The code is still under development. The second release is published under the terms of the LGPL.
Ateles is a Finite Element code which uses the Discontinuous Galerkin scheme on top of a distributed parallel octree mesh data structure. The code uses an explicit time integration scheme. The basic unit of the computational domain is called an element. Static mesh refinement is used to increase numerical accuracy in areas of interest. To lowest order, all elements should have the same computational load. Therefore, for this analysis we will take the number of elements as the measure for problem size.
Musubi is an open-source Lattice Boltzmann solver. It is part of the parallel simulation framework APES, which utilizes octrees to represent sparse meshes and provides tools from automatic mesh generation to post-processing. The octree mesh representation enables the handling of arbitrarily complex simulation domains, even on massively parallel systems. Local grid refinement is implemented by several interpolation schemes in Musubi.
OpenFOAM Solver (for Open-source Field Operation And Manipulation) is a C++ toolbox for the development of customized numerical solvers, and pre-/post-processing utilities for the solution of continuum mechanics problems, most prominently including computational fluid dynamics (CFD). OpenFOAM is a free, open source CFD software package that has a range of features for solving complex fluid flows involving chemical reactions, turbulence and heat transfer, and solid dynamics and electromagnetics.
Nekbone solves a standard Poisson equation using a conjugate gradient iteration with a simple or spectral element multigrid preconditioner on a block or linear geometry. It exposes the principal computational kernel to reveal the essential elements of the algorithmic-architectural coupling that is pertinent to Nek5000.
 Electronic Structure Calculations
ONETEP (Order-N Electronic Total Energy Package) is a linear-scaling code for quantum-mechanical calculations based on density-functional theory.
FHI-aims is an efficient, accurate all-electron, full-potential electronic structure code package for computational molecular and materials science (non-periodic and periodic systems). The code supports DFT (semilocal and hybrid) and many-body perturbation theory. FHI-aims is particularly efficient for molecular systems and nanostructures, while maintaining high numerical accuracy for all production tasks. Production calculations handle up to several thousand atoms and can efficiently use (tens of) thousands of cores.
FHI-aims is developed by an active, globally distributed community, including significant developments at FHI, Duke University, TU Munich, USTC Hefei, Aalto University, University of Luxembourg, TU Graz, Cardiff University and many other institutions.
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
SIESTA is both a method and its computer program implementation, to perform efficient electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids. SIESTA’s efficiency stems from the use of a basis set of strictly-localized atomic orbitals. A very important feature of the code is that its accuracy and cost can be tuned in a wide range, from quick exploratory calculations to highly accurate simulations matching the quality of other approaches, such as plane-wave methods.
The ADF application attempts to evenly distribute the work amongst the cores via communication between node masters (first rank of a node) and the global master (rank zero). This informs the global master that a node needs to be assigned more work. The other cores in a node access their work via the POSIX shared arrays rather than using MPI communication. This is not captured automatically by the profiling tool and so source code instrumentation was added around the calls in C code in scmfipc.c to create, destroy, lock and unlock the shared arrays. It is worth noting that this does not include any time the cores spent waiting for a lock to become available, just the time spent doing the locking or unlocking itself. Further investigation of the time spent waiting for locks on the shared arrays is outside the scope of this Performance Plan.
BAND is a density functional theory (DFT) code using atomic orbitals for periodic DFT calculations.
POP partners generated the performance data used in this report to investigate strong scaling. Source code for ADF 2016.101 was compiled using Intel compilers 16.0.3 and Intel MPI 5.1.3. Trace data was collected using Extrae 3.3 and analysed using Paraver and associated tools. Data was also collected using Score-P 3.0 and analysed with Scalasca 2.3.1 and Cube. Score-P filtering was utilised to exclude any short user function calls that do not appear on the callpath to an MPI function.
DFTB (Density Functional Tight Binding) is a fast, efficient and versatile quantum mechanical simulation package, containing almost all of the useful extensions which have been developed for the DFTB framework so far. It is a fast approximate method to study large molecules and big periodic systems, employing DFT-based and semi-empirical data. Using DFTB you can carry out quantum mechanical simulations as with ab initio density functional theory based packages, but in an approximate way, typically gaining around two orders of magnitude in speed.
 Plasma Turbulence
GS2 is a gyrokinetic flux tube initial value and eigenvalue solver for fusion plasmas.
 General user help (installing, running, etc), as well as detailed articles for developers, is on the wiki.
 Source code is available for the current release candidate and the development trunk.
iPIC3D is an implicit Particle-in-Cell code for Space Weather applications. iPIC3D simulates the interaction of Solar Wind and Solar Storms with the Earth’s magnetosphere. The magnetosphere is a large system with many complex physical processes, requiring realistic domain sizes and billions of computational particles. In the PIC model, plasma particles from the solar wind are mimicked by computational particles. At each computational cycle, the velocity and location of each particle are updated by solving the equation of motion, the current and charge density are interpolated to the mesh grid, and Maxwell’s equations are solved.
 Materials Science
The QUIP package is a collection of software tools to carry out molecular dynamics simulations. It implements a variety of interatomic potentials and tight binding quantum mechanics, and is also able to call external packages and to serve as a plugin to other software such as LAMMPS, CP2K and the Python framework ASE. Various hybrid combinations are also supported in the style of QM/MM, with a particular focus on materials systems such as metals and semiconductors.
DPMGraGLeS2D is a microstructure materials simulation code. The OpenMP parallel code is designed to run on large SMP machines in the RWTH compute cluster with 16 sockets and up to 2 TB of memory. After a POP performance audit of the code, several performance issues were detected and a performance plan on how these issues could be resolved was set up.
FIDIMAG (FInite DIfference atomistic and microMAGnetic solver) is a Python software package that allows users to define their magnetic system and choose from finite-temperature, atomistic simulations or continuous micromagnetic solvers. It uses the finite difference method and meshes can be defined on cubic and hexagonal lattices. The computationally intensive parts of the code are implemented in the C language and use the Sundials algebraic solvers and FFTW. The Python code is compiled with Cython to increase its performance. The parts of the code that have been implemented in the C language have been parallelised using OpenMP. Parallelism is also gained through the OpenMP implementation of FFTW which is built from source.
GBmolDD is a molecular dynamics code for the simulation of coarse-grained molecular systems composed of isotropic and/or anisotropic particles. It uses the standard Lennard-Jones potential function to approximate the interaction between molecules using the Lorentz-Berthelot combining rule.
k-Wave is an open source acoustics toolbox for MATLAB and C++ developed by Bradley Treeby and Ben Cox (University College London) and Jiri Jaros (Brno University of Technology).
The software is designed for time domain acoustic and ultrasound simulations in complex and tissue-realistic media. The simulation functions are based on the k-space pseudospectral method and are both fast and easy to use.
EPW (www.epw.org) is an Electron-Phonon Wannier code which calculates properties related to the electron-phonon interaction using Density Functional Perturbation Theory and Maximally Localized Wannier Functions. It is distributed as part of the Quantum ESPRESSO suite.
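As a small illustration of the Lennard-Jones potential and the Lorentz-Berthelot combining rule mentioned for GBmolDD above, the following sketch evaluates the pair energy between two unlike isotropic particles. It is not taken from GBmolDD: the function names are hypothetical, units are left abstract, and anisotropic particles are not covered.

```python
import math


def lorentz_berthelot(sigma_i, epsilon_i, sigma_j, epsilon_j):
    """Combine per-species parameters into pair parameters.

    Lorentz rule:   sigma_ij   = (sigma_i + sigma_j) / 2   (arithmetic mean)
    Berthelot rule: epsilon_ij = sqrt(epsilon_i * epsilon_j) (geometric mean)
    """
    sigma_ij = 0.5 * (sigma_i + sigma_j)
    epsilon_ij = math.sqrt(epsilon_i * epsilon_j)
    return sigma_ij, epsilon_ij


def lennard_jones(r, sigma, epsilon):
    """Standard 12-6 Lennard-Jones pair energy at separation r > 0."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


if __name__ == "__main__":
    # Two hypothetical species with different sizes and well depths.
    sigma, epsilon = lorentz_berthelot(1.0, 1.0, 3.0, 4.0)
    r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the energy minimum
    print(sigma, epsilon, lennard_jones(r_min, sigma, epsilon))
```

The energy is zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma, which is a convenient sanity check for any implementation.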
 Earth Sciences
NEMO (Nucleus of the European Model of the Ocean) (Madec and the NEMO Team, 2008) is one of the most widely used ocean models, of great significance to the European climate community. The NEMO model is written in Fortran 90 and its origins can be traced back over 20 years and, as such, the code base, parallelised using the message passing paradigm, has seen many computer architectures and programming environments come and go.
SHEMAT-Suite is a finite-difference open-source code for simulating coupled flow, heat and species transport in porous media. The code, written in Fortran 95, originates from geoscientific research in the fields of geothermics and hydrogeology. It comprises: (1) a versatile handling of input and output, (2) a modular framework for subsurface parameter modeling, (3) a multi-level OpenMP parallelization, (4) parameter estimation and data assimilation by stochastic approaches (Monte Carlo, Ensemble Kalman filter) and by deterministic Bayesian approaches based on automatic differentiation for calculating exact (truncation-error-free) derivatives of the forward code.
UKCA is a joint NCAS–Met Office programme funded by NCAS, GMR and DEFRA for three years. Project partners are the Hadley Centre and the Universities of Cambridge, Leeds and Reading.
Why do we need UKCA?
 Scope of current chemistry and aerosol models is insufficient for Earth System modelling.
 NCAS expertise is required to improve climate models.
 To encourage wider community research in this area.
The GITM (General Individuals Transport Model) code is an offline particle tracking model that requires velocity fields from a hydrodynamic model. It includes physical particle advection and diffusion, and biological development and behaviour. This method ensures that particles follow stream lines exactly. Furthermore, a random walk method with advective correction was included to simulate diffusion. The biological development and behaviour module of GITM allows particles to progress through a user-defined number of egg and larval development stages. Each stage allows for separate formulations and settings for growth, mortality and vertical migration behaviour. The GITM code is an advection and diffusion code that reads its initial conditions from a NetCDF file and integrates the solution in time. During the time integration, the solution at various time steps is appended to a NetCDF file.
 Neuroscience
OpenNN (Neural Designer) is an open source C++ code for neural networks that has OpenMP parallelisation.
NEST5g is a simulator for spiking neural network models that focuses on the dynamics, size and structure of neural systems rather than on the exact morphology of individual neurons. The development of NEST is coordinated by the NEST Initiative.
NEST is ideal for networks of spiking neurons of any size, for example:
 Models of information processing, e.g. in the visual or auditory cortex of mammals.
 Models of network activity dynamics, e.g. laminar cortical networks or balanced random networks.
 Models of learning and plasticity.
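To make "spiking neuron" concrete, here is a minimal sketch of a single leaky integrate-and-fire neuron driven by a constant input current and integrated with the explicit Euler method. It does not use the NEST API: the function name `simulate_lif` and all parameter values are assumptions chosen for illustration, and a real simulator handles synapses, delays and large networks that are omitted here.

```python
def simulate_lif(current, t_end=100.0, dt=0.1, tau_m=10.0,
                 v_rest=-65.0, v_thresh=-50.0, v_reset=-65.0,
                 resistance=10.0):
    """Return the spike times (ms) of one leaky integrate-and-fire neuron.

    Membrane equation: tau_m * dV/dt = -(V - v_rest) + resistance * current
    A spike is recorded when V reaches v_thresh; V is then set to v_reset.
    Voltages in mV, times in ms; resistance * current is in mV.
    """
    v = v_rest
    spikes = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        # Explicit Euler update of the membrane potential.
        dv = (-(v - v_rest) + resistance * current) / tau_m
        v += dt * dv
        if v >= v_thresh:
            spikes.append((step + 1) * dt)
            v = v_reset
    return spikes


if __name__ == "__main__":
    for current in (1.0, 2.0, 4.0):
        print(current, len(simulate_lif(current)))
```

With these illustrative parameters, a current whose steady-state depolarisation stays below the threshold produces no spikes, while stronger currents produce increasingly frequent spikes.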
 Biomedicine
HemeLB is an open-source lattice-Boltzmann code for simulation of large-scale fluid flow in complex geometries, e.g., intracranial blood vessels near aneurysms. It is written in C++ using MPI by UCL and developed within the EU H2020 CompBioMed Centre of Excellence.
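The collide-and-stream structure of a lattice-Boltzmann method can be sketched in a few lines. The toy below is a one-dimensional, three-velocity (D1Q3) BGK model on a periodic domain, written in plain Python. It is not HemeLB code, which is a three-dimensional C++/MPI application with complex boundaries; all function names and parameters here are assumptions made for illustration.

```python
# D1Q3 lattice: rest, right-moving and left-moving populations.
C = (0, 1, -1)
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho and velocity u."""
    return [w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def initial_state(n_cells, bump=0.5):
    """Fluid at rest with unit density and a density bump in the middle cell."""
    state = [equilibrium(1.0, 0.0) for _ in range(n_cells)]
    state[n_cells // 2] = equilibrium(1.0 + bump, 0.0)
    return state


def lbm_step(f, tau=1.0):
    """One BGK collision followed by periodic streaming."""
    n_cells = len(f)
    post = []
    for cell in f:
        rho = sum(cell)
        u = sum(c * fi for c, fi in zip(C, cell)) / rho
        feq = equilibrium(rho, u)
        # Collision: relax each population towards equilibrium.
        post.append([fi - (fi - fe) / tau for fi, fe in zip(cell, feq)])
    # Streaming: move each population one cell along its velocity.
    streamed = [[0.0] * len(C) for _ in range(n_cells)]
    for x in range(n_cells):
        for i, c in enumerate(C):
            streamed[(x + c) % n_cells][i] = post[x][i]
    return streamed


def density(f):
    """Macroscopic density in every cell."""
    return [sum(cell) for cell in f]


if __name__ == "__main__":
    f = initial_state(32)
    for _ in range(50):
        f = lbm_step(f)
    print(sum(density(f)))
```

Because both the collision and the streaming step only redistribute populations, the total mass in the periodic domain is conserved up to rounding error, which makes a useful first check on any implementation.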