# Seminar Series

The Scientific Computing seminars usually take place on Thursday at 13:00 in the Visualisation Lab (MCS1022) or on Wednesday at 13:00 in D/MCS2050 (if joint with other research groups).

To sign up for the seminar mailing list (to receive updates or information on occasional virtual talks), please email [email protected].

## Upcoming seminars

- Friday, 8 March 2024, 13:00, David Keitel, University of the Balearic Islands, MCS 2051

The event takes place in person in the VisLab of the MCS building. To attend online, use the following Zoom link: https://durhamuniversity.zoom.us/j/98751452277?pwd=ckpzVlc4TCtiQjZJWERwc2R1UGd6dz09

Meeting ID: 987 5145 2277, Passcode: 002464

**Title**: TBA

## Past seminars

### Academic year 2023/2024

#### Michaelmas term

- Friday, 1 December 2023, 13:00, Filippo Spiga, NVIDIA – MCS2050 (joint with NESTiD)
**Title**:*The NVIDIA superchip (Grace-Grace and Grace-Hopper) platform: the ‘what’, the ‘how’, the ‘why’*

The purpose of this talk is to introduce the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchip (CPU+GPU) platforms and how advancements in hardware coupled with NVIDIA’s vision on programming models for accelerated computing can have profound implications in developing next generation fast (time to solution) and efficient (energy to solution) HPC and AI codes.

This event is an in-person meeting in the Computer Science department. However, a Zoom option will be offered through the NESTiD seminar.

- Thursday, 23 November 2023, 13:00, Philip Maybank, AMD
**Title**:*MCMC for Bayesian Uncertainty Quantification from Time-Series Data*

In computational neuroscience, Neural Population Models (NPMs) are mechanistic models that describe brain physiology in a range of different states. Within computational neuroscience there is growing interest in the inverse problem of inferring NPM parameters from recordings such as the EEG (Electroencephalogram). Uncertainty quantification is essential in this application area in order to infer the mechanistic effect of interventions such as anaesthesia.

This talk presents software for Bayesian uncertainty quantification in the parameters of NPMs from approximately stationary data using Markov Chain Monte Carlo (MCMC). Modern MCMC methods require first order (and in some cases higher order) derivatives of the posterior density. The software presented offers two distinct methods of evaluating derivatives: finite differences and exact derivatives obtained through Algorithmic Differentiation (AD). For AD, two different implementations are used: the open source Stan Math Library and the commercially licenced tool distributed by NAG (Numerical Algorithms Group). The use of derivative information in MCMC sampling is demonstrated through a simple example, the noise-driven harmonic oscillator, and different methods for computing derivatives are compared. The software is written in a modular object-oriented way such that it can be extended to derivative based MCMC for other scientific domains.

The event takes place in person in the VisLab of the MCS building. To attend online, use the following Zoom link: https://durhamuniversity.zoom.us/j/94668043904?pwd=Y0czQjBKb2NTT2xtZkp5WlBxeTFOUT09
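As a rough illustration of the two derivative strategies named in the abstract above (finite differences versus exact derivatives from algorithmic differentiation), the following minimal sketch compares them on a toy function. It is not the speaker's software and uses neither Stan Math nor the NAG tool: the `Dual` class, `log_density`, `derivative_ad` and `derivative_fd` are hypothetical names introduced here, and the quartic log-density is an assumed stand-in for a real posterior.

```python
class Dual:
    """Number value + deriv * eps with eps**2 == 0, used for forward-mode AD."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants (derivative zero).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def log_density(theta):
    # Hypothetical toy unnormalised log-density: -theta^4 / 4 + theta.
    # Works for both floats and Dual numbers.
    return -0.25 * theta * theta * theta * theta + theta


def derivative_ad(f, x):
    """Exact derivative (up to rounding) by seeding the dual part with 1."""
    return f(Dual(x, 1.0)).deriv


def derivative_fd(f, x, h=1e-5):
    """Central finite difference; carries an O(h^2) truncation error."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


ad_value = derivative_ad(log_density, 2.0)  # analytic value is -2^3 + 1 = -7
fd_value = derivative_fd(log_density, 2.0)
```

The AD result matches the analytic derivative, while the finite-difference result is only approximate and depends on the step size `h`, which is one reason gradient-based samplers often prefer AD.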

### Academic year 2022/2023

#### Summer seminars

- Wednesday, 19 July 2023, at 16:00, Lukas Krenz, Technical University of Munich

**Title**:*The Power of Oomph: A loud story about earthquakes, tsunamis, sound, and HPC*

**Abstract**: This talk explains the implementation of elastic-acoustic coupling in the open-source software SeisSol. We discuss two applications with real-world scenarios: First, we introduce the Palu, Sulawesi, 2018 earthquake-tsunami event and present a fully-coupled model that captures the complete event from dynamic earthquake rupture, to wave propagation in the Earth and the ocean, to tsunami propagation. Tsunami propagation is included by using a linearized boundary condition. As a second scenario, we discuss earthquakes induced by an enhanced geothermal system in the Helsinki metropolitan area. We model the largest of these earthquakes and the audible sound excited by it. Finally, we discuss how applying local time stepping (LTS) leads to efficient simulations. We investigate a novel implementation of LTS using state machines. We present strong-scaling results for the fully-coupled Palu scenario on the Mahti and Frontera supercomputers.

#### Easter term

- Wednesday, 21 June 2023, at 16:00, Bora Uçar, CNRS and ENS Lyon, France

**Title**:*On the Birkhoff–von Neumann decomposition*

**Abstract**: The Birkhoff–von Neumann decomposition expresses a doubly stochastic matrix as a convex combination of permutation matrices. This talk will be an introduction to this decomposition. We will cover algorithms, combinatorial problems, and some open problems.

This talk contains results from joint work with Michele Benzi (Scuola Normale Superiore, Pisa, Italy), Jérémy E. Cohen (CNRS, Lyon), Fanny Dufosse (Inria, France), Kamer Kaya (Sabanci Univ, Turkey), and Ioannis Panagiotas (LIP6, Sorbonne Univ., France).
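For readers unfamiliar with the decomposition, a minimal sketch of the classical greedy construction follows: repeatedly find a permutation inside the support of the matrix, subtract the largest feasible multiple of it, and continue until nothing is left. This is an illustration only, not one of the algorithms from the talk; the function names and the example matrix are assumptions made here, and exact `Fraction` arithmetic is used to avoid tolerance issues.

```python
from fractions import Fraction


def perfect_matching(support):
    """Augmenting-path bipartite matching; support[i] lists columns allowed for row i.

    Returns perm with perm[i] = column matched to row i, or None if no
    perfect matching exists.
    """
    n = len(support)
    row_of_col = [-1] * n

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if row_of_col[j] == -1 or try_row(row_of_col[j], seen):
                row_of_col[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(row_of_col):
        perm[i] = j
    return perm


def birkhoff_von_neumann(matrix):
    """Return [(weight, perm), ...] with sum(weight * P_perm) == matrix."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x != 0 for row in a for x in row):
        support = [[j for j in range(n) if a[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support)
        if perm is None:
            raise ValueError("input is not doubly stochastic")
        # Largest multiple of the permutation that keeps entries non-negative.
        weight = min(a[i][perm[i]] for i in range(n))
        for i in range(n):
            a[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


half = Fraction(1, 2)
example = [[half, half, 0],
           [half, 0, half],
           [0, half, half]]
terms = birkhoff_von_neumann(example)
```

Each step zeroes at least one entry, so the loop terminates; the greedy choice gives a valid decomposition but not necessarily one with the fewest permutation matrices.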

- Wednesday, 7 June 2023, at 16:00, David Silvester, The University of Manchester

**Title**:*Fast solution of incompressible flow problems with two-level pressure approximation*

**Abstract**: Reliable and efficient iterative solvers for models of steady incompressible flow emerged in the early 1990s. Strategies based on block preconditioning of the underlying matrix operators using (algebraic or geometric) multigrid components have proved to be the key to realising mesh independent convergence (and optimal complexity) without the need for tuning parameters, particularly in the context of classical mixed finite element approximation. The focus of this contribution is on efficient solver strategies in cases where (an inf–sup) stable Taylor–Hood mixed approximation is augmented by a piecewise constant pressure in order to guarantee local conservation of mass. The augmentation leads to over-specification of the pressure solution requiring a redesign of the established solver technology.

This enrichment process causes over-specification of the pressure, which complicates the design and implementation of efficient solvers for the resulting linear systems. We first describe the impact of this choice of pressure space on the matrices involved. Next, we show how to recover effective solvers for Stokes problems, using a preconditioner based on the singular pressure mass matrix, and for Oseen systems arising from linearising the Navier–Stokes equations, by using a two-stage pressure convection–diffusion strategy.

This is joint work with Jennifer Pestana.

- Friday, 19 May 2023, at 14:00, Dan Stanzione, Texas Advanced Computing Center (TACC)

**Part of the Durham HPC Days – Spring 2023.**

Unusual time: Friday at 14:00

Unusual venue: Scott Logic Lecture Theatre (MCS0001).

Unusual online access: Zoom link for the HPC Days – Spring 2023

**Title**:*What’s going on in research computing and AI in the US and Texas*

**Short bio**: Dr. Dan Stanzione, Associate Vice President for Research at The University of Texas at Austin since 2018 and Executive Director of the Texas Advanced Computing Center (TACC) since 2014, is a nationally recognized leader in high performance computing. He serves on the National Artificial Intelligence Research Resource Task Force, formed by the National Science Foundation (NSF) and the White House Office of Science and Technology Policy (OSTP). He is the principal investigator (PI) for an NSF grant to deploy Frontera, the fastest supercomputer at any U.S. university. Stanzione is also the PI of TACC’s Stampede2 and Wrangler systems, supercomputers for high performance computing and for data-focused applications, respectively. For six years he was co-PI of CyVerse, a large-scale NSF life sciences cyberinfrastructure. Stanzione was also a co-PI for TACC’s Ranger and Lonestar supercomputers, large-scale NSF systems previously deployed at UT Austin. Stanzione received his bachelor’s degree in electrical engineering and his master’s degree and doctorate in computer engineering from Clemson University.

- Friday, 19 May 2023, at 13:00, David Keyes, King Abdullah University of Science and Technology

**Part of the Durham HPC Days – Spring 2023.**

Unusual time: Friday at 13:00

Unusual venue: Scott Logic Lecture Theatre (MCS0001).

Unusual online access: Zoom link for the HPC Days – Spring 2023

**Title**:*Efficient computation through tuned approximation*

**Abstract**: Numerical linear algebra software is being reinvented to provide opportunities to tune dynamically the accuracy of computation to the requirements of the application, resulting in savings of memory, time, and energy. Floating point computation in science and engineering has a history of “oversolving” relative to expectations for many models. So often are real datatypes defaulted to double precision that GPUs did not gain wide acceptance until they provided in hardware operations not required in their original domain of graphics. Indeed, the condition number of discretizations of the Laplacian reaches the reciprocal of unit roundoff for single precision with just a thousand uniformly spaced points per dimension. However, many operations considered at a blockwise level allow for lower precision and many blocks can be approximated with low rank near equivalents. This leads to smaller memory footprint, which implies higher residency on memory hierarchies, leading in turn to less time and energy spent on data copying, which may even dwarf the savings from fewer and cheaper flops. We provide examples from several application domains, including a review of a 2022 Gordon Bell finalist computation that benefits from both blockwise lower precisions and lower ranks.

**Short bio**: David Keyes directs the Extreme Computing Research Center at the King Abdullah University of Science and Technology (KAUST), where he was a founding Dean in 2009 and currently serves in the Office of the President as Senior Associate. He is a professor in the programs of Applied Mathematics, Computer Science, and Mechanical Engineering. He is also an Adjunct Professor of Applied Mathematics and Applied Physics at Columbia University, where he formerly held the Fu Foundation Chair. He works at the interface between parallel computing and PDEs and statistics, with a focus on scalable algorithms that exploit data sparsity. Before joining KAUST, Keyes led multi-institutional scalable solver software projects in the SciDAC and ASCI programs of the US Department of Energy (DoE), ran university collaboration programs at US DoE and NASA institutes, and taught at Columbia, Old Dominion, and Yale Universities. He is a Fellow of SIAM, the AMS, and the AAAS. He has been awarded the Gordon Bell Prize from the ACM, the Sidney Fernbach Award from the IEEE Computer Society, and the SIAM Prize for Distinguished Service to the Profession. He earned a B.S.E. in Aerospace and Mechanical Sciences from Princeton in 1978 and a Ph.D. in Applied Mathematics from Harvard in 1984.

- Thursday, 18 May 2023, at 13:00, Emma Barnes, University of York

**Part of the Durham HPC Days – Spring 2023.**

Unusual time: Thursday at 13:00

Unusual venue: Scott Logic Lecture Theatre (MCS0001).

Unusual online access: Zoom link for the HPC Days – Spring 2023

**Title**:*Sustainable accessible research IT*

**Short bio**: Emma Barnes is Head of Research IT at the University of York. Emma has spent the last 8 years building the research IT offering at the University. Emma project managed the first major cluster offering at the University (Viking). The £2.5 million project offers researchers and academics free access to the technology, and has been a huge success with users from a range of disciplines and backgrounds. We are now working on its replacement with a bigger focus on sustainability. The research IT team has also recently established a Research Software Engineering group and her focus is now on building up the infrastructure team where we can promote career development, training and peer support. The team’s other focus is accessibility, either through educating users or embracing new technologies. We are now focusing on efforts to support non-traditional HPC users and, where appropriate, implementing new technologies to enhance research and teaching. Emma received her MPhys in Physics with Astrophysics at the University of York, then completed her PhD in Astroparticle physics at the University of Edinburgh. Edinburgh was where Emma became a programming and Linux enthusiast, which continued throughout her postdoctoral work at Boston University, US, in particle physics. Emma later switched careers to a more computing focus and can now use her passion for research IT to benefit research throughout the university.

- Wednesday, 17 May 2023, at 16:00, François Mazen, Kitware

**Part of the Durham HPC Days – Spring 2023.**

Unusual venue: Scott Logic Lecture Theatre (MCS0001).

**Title**:*Large-scale to exascale data exploration and visualization with ParaView*

**Abstract**: After a decade of announcements, exascale computing is now a reality with the recent Frontier supercomputer starting up. Kitware has been deeply involved in the Exascale Computing Project (ECP) to participate in the development of new tools tailored for this major milestone. During this presentation, we will look at the challenges that large-scale simulations face regarding the exploration and visualization of their output, and how open-source tools like ParaView, Catalyst 2, VTK-m, ADIOS2, AMReX… can help to tackle these challenges.

**Short bio**: François Mazen is the Assistant Director for the Scientific Visualization team at Kitware Europe. In 2008, François received his engineering degree at IFMA (French Institute for Advanced Mechanics) in Clermont-Ferrand where he was nominated for TOP 10 students. The same year, François also received a Master of Science at Université Blaise Pascal (Clermont-Ferrand) where he specialized in Rigid Body. His previous 13 years of experience included 4 years at Ansys, where he worked as a software developer in the Funded Development team, and more recently 6 years at Siemens PLM as Project Leader, where he mainly worked on the design, architecture and development of Robot’s Path Planning Technology in C++ and C#. With an extensive knowledge of project management, C++ development and visualization, François strengthens the proficiency of KEU’s Scientific Visualization team.

- Wednesday, 10 May 2023, at 16:00, Garth Wells, University of Cambridge

**Title**:*Finite element methods at exascale*

**Abstract**: I will discuss the development of finite element algorithms and implementations for a range of applications on exascale hardware. When considering accelerators it is helpful to reflect on past attempts (since the honest efforts generally failed!) to assess why performance was disappointing. I will argue that the disappointing performance was due to a failure to assess the suitability of algorithms end-to-end; focusing on accelerating one step in a sequence of otherwise established algorithms over-constrained approaches and doomed them to failure. Improved mathematical and algorithmic understanding now allows us to exploit exascale-type hardware efficiently. Also on the upside, I will show that recent developments in compiler technologies have made it much easier to develop high performance finite element kernels without non-standard extensions, with measured performance compared against performance models. To make developments accessible, I will also touch upon the open-source FEniCS Project (https://fenicsproject.org) and its approach to high-level implementations with an assessment of the advantages and disadvantages of domain-specific languages and code generation for scientific computing. Code generation is no panacea! Finally, some recent performance data on the pre-exascale LUMI system will be presented.

- Wednesday, 26 April 2023, at 16:00, Benedict Rogers, The University of Manchester

**Title**:*Massive parallelisation of strictly incompressible flows using smoothed particle hydrodynamics*

**Abstract**: The meshless method, smoothed particle hydrodynamics (SPH), is becoming increasingly used in the engineering industry for a range of applications such as aerospace, automotive, nuclear, chemical, offshore, marine, hydraulic and coastal engineering. The SPH method is now becoming competitive against well-established simulation techniques. Real problems are 3-D and require a massive number of particles and hardware acceleration, while the fluid itself can be considered strictly incompressible. In this talk, we will consider how we have developed a massively parallel strictly incompressible SPH solver, the challenges involved, and how we are now approaching this for the era of exascale computing.

**Short bio**: Prof. Benedict D. Rogers is Chair of Computational Hydrodynamics and leads the Smoothed Particle Hydrodynamics (SPH) specialist group in the School of Engineering at the University of Manchester (UoM). He is a founder of the international organisation for SPH – the SPH rEsearch and engineeRing International Community (SPHERIC), and acted as its Chair (2015-2021). He is a core developer of the open-source code DualSPHysics, an international collaboration across 4 countries (including the University of Parma) that has been downloaded 100,000+ times to date. He has published over 70 peer-reviewed journal papers (H-index: 38) and is a co-Investigator on 2 projects preparing SPH for exascale computing. He has been awarded the Thomas Telford Premium Award by the Institution of Civil Engineers (ICE) twice, in 2014 and 2016, for work on SPH modelling of tsunami-structure interaction, and in 2022 jointly received the prestigious International Joe Monaghan Prize for progress made addressing the Grand Challenges of SPH.

#### Epiphany term

- Thursday, 16 February 2023, at 15:00, Richard Graham, NVIDIA

**Part of the DPU Hackathon 2023.**

**Title**:*NVIDIA’s BlueField DPU: offloading applications to the network*

**Abstract**: Plateauing of the capabilities of individual system components has led to new innovations in system design to meet growing computational needs. One such innovation is NVIDIA’s family of Data Processing Units (DPUs), which provides a network offload engine that includes a Network Interface core, programmable CPU cores and targeted acceleration engines. This is known as the BlueField family of network adapters. This presentation will provide an overview of the BlueField network devices and the principles behind making effective use of such devices, and will present work done using these devices to accelerate collective communication operations.

- Wednesday, 8 February 2023, at 16:00, Sylvain Laizet, Imperial College London

**Title**:*Large scale computational fluid dynamics simulations using supercomputers*

**Abstract**: Simulating and understanding turbulent fluid flows, characterised by the irregular motions of fluid particles with swirls and vortices, is one of the most challenging problems in science. The design of many engineering and industrial systems as well as the prediction of their impact on the environment greatly relies on the turbulent behaviour of fluid flows being properly quantified. Significant progress has been made recently using high performance computing (HPC), and computational fluid dynamics (CFD, the process of mathematically modelling fluid flows) is now a critical complement to experiments and theories. In this talk, I will introduce Xcompact3d, an open-source framework dedicated to the study of turbulent fluid flows on HPC systems. It is a tool which can be used off the shelf to obtain high-fidelity simulations of turbulent fluid flows. Its unique features include its accuracy, efficiency, scalability, independence from external libraries (except an MPI library), user-friendliness and its customisability. Various applications will be presented to highlight the potential of Xcompact3d.

- Wednesday, 25 January 2023, at 16:00, Jemma Shipton, University of Exeter

**Title**:*Parallel timestepping algorithms for geophysical fluid dynamics*

**Abstract**: Following exciting developments in both mathematical analysis and practical experience, time-parallel methods are undergoing a revival as a potentially powerful route to exploiting future massively parallel exascale supercomputers. Time-parallel methods address the question of what to do when one has reached the limits of strong scaling (decreasing wallclock time by increasing the number of processors working in parallel) through domain decomposition parallelisation in space. A key lesson from the recent literature is that the success of parallel-in-time algorithms critically depends on them being carefully adapted to the equation being solved. Much like regular timestepping methods, there are many parallel-in-time algorithms, and the right algorithm needs to be designed and selected according to the mathematical properties and applications requirements of the underlying system. Here I will present an overview of several parallel-in-time algorithms that are relevant for weather and climate prediction, illustrating the theory in the context of ordinary differential equations and outlining future plans for applying them to geophysical flows.

#### Michaelmas term

- Wednesday, 23 Nov 2022, at 16:00, **Rod Burns**, Codeplay Software & **Andrew Mallison**, Intel

**Title**:*The SYCL GPU vision*

This talk is a follow-up to the group’s oneAPI workshop. While the workshop provides a platform for SYCL users, this talk targets a broader audience: those who might want to learn about SYCL from a high-level perspective. For in-person participants from Durham, the speakers will be available after the talk for 1:1 conversations around SYCL. These will be covered by an NDA.

**Abstract**: The following will be covered:

- oneAPI and SYCL and the heterogeneous landscape;
- oneAPI on Intel;
- oneAPI on Nvidia and AMD;
- migrating CUDA to SYCL; and
- oneAPI Community Forum collaboration.

- Wednesday, 16 Nov 2022, at 16:00, Jonas Latz, Heriot-Watt University

**Title**:*Gradient flows and randomised thresholding: sparse inversion and classification*

**Abstract**: Sparse inversion and classification problems are ubiquitous in modern data science and imaging. They are often formulated as non-smooth minimisation problems. In sparse inversion, we minimise, e.g., the sum of a data fidelity term and an L1/LASSO regulariser. In classification, we consider, e.g., the sum of a data fidelity term and a non-smooth Ginzburg–Landau energy. Standard (sub)gradient descent methods have been shown to be inefficient when approaching such problems. Splitting techniques are much more useful: here, the target function is partitioned into a sum of two subtarget functions, each of which can be efficiently optimised. Splitting proceeds by performing optimisation steps alternately with respect to each of the two subtarget functions. In this work, we study splitting from a stochastic continuous-time perspective. Indeed, we define a differential inclusion that follows the negative subdifferential of one of the two subtarget functions at each point in time. The choice of the subtarget function is controlled by a binary continuous-time Markov process. The resulting dynamical system is a stochastic approximation of the underlying subgradient flow. We investigate this stochastic approximation for an L1-regularised sparse inversion flow and for a discrete Allen–Cahn equation minimising a Ginzburg–Landau energy. In both cases, we study the longtime behaviour of the stochastic dynamical system and its ability to approximate the underlying subgradient flow at any accuracy. We illustrate our theoretical findings in a simple sparse estimation problem and also in low- and high-dimensional classification problems.
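To make the idea of alternating steps on two subtarget functions concrete, here is a minimal sketch of classical deterministic forward-backward splitting (ISTA) for the L1-regularised least-squares problem. This is a standard textbook technique swapped in for illustration, not the stochastic continuous-time method studied in the talk; the function names, the step size and the small example problem are assumptions made here.

```python
def soft_threshold(v, t):
    """Proximal step for t * ||x||_1: shrink each entry towards zero by t."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def ista(A, b, lam, step, iters=200):
    """Minimise 0.5 * ||A x - b||^2 + lam * ||x||_1 by alternating two steps.

    A is a list of rows; step should not exceed 1 / (largest eigenvalue of A^T A).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Step 1: gradient step on the smooth data-fidelity term.
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        y = [x[j] - step * grad[j] for j in range(n)]
        # Step 2: proximal step on the non-smooth L1 term.
        x = soft_threshold(y, step * lam)
    return x


# With A = identity the minimiser is soft_threshold(b, lam), which makes
# the result easy to check by hand.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, -0.5, 1.5]
x_sparse = ista(identity, b, lam=1.0, step=1.0)
```

The middle entry of `b` is smaller in magnitude than `lam`, so it is set exactly to zero, which is the sparsity-promoting effect of the L1 term.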

- Wednesday, 2 Nov 2022, at 16:00, Ana Lucia Varbanescu, University of Twente (and University of Amsterdam)

**Title**:*Towards zero-waste computing*

**Abstract**: “Computation” has become a massive part of our daily lives; even more so, in science, a lot of experiments and analysis rely on massive computation. Under the assumption that computation is cheap, and time-to-result is the only relevant metric for all of us, we currently use computational resources at record-low efficiency.

In this talk, I argue this approach is an unacceptable waste of computing resources. I further define the goal of zero-waste computing and discuss how performance engineering methods and techniques can facilitate this goal. By means of several case-studies, I will also demonstrate performance engineering at work, proving how efficiency and time-to-result can co-exist.

**Short bio**: Ana Lucia Varbanescu holds a BSc and MSc degree from POLITEHNICA University in Bucharest, Romania. She obtained her PhD from TUDelft, The Netherlands, and continued to work as a PostDoc researcher in The Netherlands, at TUDelft and VU University in Amsterdam. She is a MacGillavry fellow at University of Amsterdam, where she was tenured in 2018 as Associate Professor. Since 2022, she has also been Professor at University of Twente. She has been a visiting researcher at IBM TJ Watson (2006, 2007), Barcelona Supercomputing Center (2007), NVIDIA (2009), and Imperial College London (2013). She has received several NWO grants (including a Veni grant) and she is co-PI for the GraphMassivizer EU project.

Ana’s research stems from HPC, and investigates the use of multi- and many-core architectures for HPC, with a special focus on performance and energy efficiency modeling for both scientific and irregular, data-intensive applications.

A recording is available here.

- Wednesday, 26 Oct 2022, at 16:00, Katie Schuman, University of Tennessee

**Title**:*Opportunities for neuromorphic computing co-processors*

**Abstract**: Neuromorphic computing is a popular technology for the future of computing. Much of the research and development in neuromorphic computing has focused on new architectures, devices, and materials, rather than on the software, algorithms, and applications of these systems. In this talk, I will give an overview of the field of neuromorphic computing with a particular focus on challenges and opportunities in using neuromorphic computers as co-processors. I will discuss neuromorphic applications for both machine learning and non-machine learning use cases.

- Wednesday, 12 Oct 2022, at 16:00, Nils Wedi, ECMWF

**Title**:*Destination Earth – digital twins of the Earth system*

**Abstract**: This talk will describe advances in the field of numerical weather prediction (NWP) and climate, culminating in the ongoing efforts to create digital replicas of the Earth system implemented by the European Commission’s Destination Earth initiative. Digital Twins of Earth encapsulate both the latest science and technology advances to provide near-real time information on extremes and climate change adaptation in a wider digital environment, where users can interact, modify and ultimately create their own tailored information.

Recent work has demonstrated that global, coupled storm-resolving (or km-scale) simulations are feasible and can contribute to building such information systems and are no longer a dream thanks to recent advances in Earth system modelling, supercomputing and the ongoing adaptation of weather and climate codes for accelerators. Such simulations start to explicitly represent essential climate processes, e.g. detailed inland water and land-use representation, deep convection and mesoscale ocean eddies, that today need to be fully parametrised even at the highest resolution used in global weather and climate information production. These simulation outputs, combined with novel, data-driven deep learning advances, thus offer a window into the future, with a promise to significantly increase the realism and timeliness of delivery of Earth system information to a broad range of users. The significant compute and data challenges are discussed.

- Wednesday, 5 Oct 2022, at 16:00, Alex Titterton, Graphcore

**Title**:*Accelerating HPC workloads using AI and Graphcore’s IPU*

**Abstract**: For many years, researchers have been solving the world’s most complex scientific problems by undertaking traditional HPC techniques across a wide range of applications. Due to the growing complexity of calculations, operational costs, and the need to accelerate classical processes, an entirely new type of architecture is required. In this talk, we will provide a technical introduction to Graphcore’s Intelligence Processing Unit (IPU) and learn how traditional HPC workloads can be enhanced and accelerated using AI techniques running on the IPU. We’ll also explore how innovators are adopting this new approach across drug discovery, weather forecasting, climate modelling and computational fluid dynamics.

**Speaker bio**: Alex Titterton has a PhD in particle physics, jointly awarded by the Universities of Bristol and Southampton, and has worked at CERN and the Rutherford Appleton Laboratory. Currently Alex works as Field Applications Engineer at Graphcore, where he has been enjoying tackling new challenges supporting Graphcore customers.

### Academic year 2021/2022

- Wednesday, 18 May 2022, at 15:00, Fabian Knorr, University of Innsbruck, Title: Floating-point compressors
**Abstract**: Storing and exchanging large amounts of floating-point data is common in distributed scientific computing applications. Data compression, when fast enough, can speed up such workloads by reducing contention on interconnects and storage systems. This talk explores two classes of floating-point compressors, the lossy and lossless type, and discusses their utility on modern parallel and accelerator-based systems. We show which approach is best suited for what problem formulation and take a close look at ndzip, a lossless compressor for dense multi-dimensional data that is specifically engineered to achieve maximum efficiency on GPU-accelerated hardware.

The slides for this event are available here.

- Wednesday, 11 May 2022, at 15:00, Spencer Sherwin, Imperial College London, Title: Industry-Relevant implicit Large-Eddy Simulation of flows past automotive and racing cars using Spectral/hp Element Methods
**Abstract**: We present the successful deployment of high-fidelity Large-Eddy Simulation (LES) technologies based on spectral/hp element methods to industrial flow problems that are characterized by high Reynolds numbers and complex geometries. In particular, we describe the steps required to perform the implicit LES of realistic automotive and racing cars. Notable developments had to be made in order to overcome obstacles in both mesh generation and solver technologies to simulate these flows, and will be outlined in this presentation. We thereby hope to demonstrate a viable pathway to translate academic developments into industrial tools that can advance the analysis and design capabilities of high-end engineering users.

- Wednesday, 4 May 2022, at 15:00, Joshua Short, Boston Limited, Title: NVIDIA and The Emergence of the Virtual World – Exploring The Omniverse, Digital Twins, and Cloud XR
**Abstract**: With virtual reality becoming more integrated into the fabric of human existence, it’s evident that virtual existence is evolving and altering the social elements of life in new ways. But what benefits can virtual reality present to the world on a research or commercial level?

We will be discussing how the Omniverse, Digital Twins, and Cloud XR technologies are being used in a variety of use cases, and how they can positively impact the future of not just virtual reality, but the physical world too.

The slides for this event are available here and here. A recording is available here.

- Wednesday, 27 Apr 2022, at 15:00, Carola Kruse, CERFACS, Title: On the efficient solution of saddle point systems with an inner-outer iterative solver based on the Golub-Kahan bidiagonalization
**Abstract**: Symmetric indefinite linear systems of saddle point type arise in a large variety of applications, for example in fluid dynamics or structural mechanics. In this talk, we will review an inner-outer iterative solver for saddle point problems that is based on the Golub-Kahan bidiagonalization. The difficulty in the proposed solver is that in each outer loop an inner system with a matrix M, say, of the size of the (1,1)-block has to be solved iteratively. If M arises from the discretization of a partial differential equation, efficient solvers might be available. Here, we will focus on the Stokes equation as a test problem and present different strategies for reducing the overall number of inner iterations and the computation time.

- Wednesday, 23 Mar 2022, at 15:00, Nils Deppe, California Institute of Technology & Lawrence Kidder, Cornell University, Title: SpECTRE: A task-based framework for astrophysics
**Abstract**: Astrophysical phenomena vary greatly in spatial and temporal scales while also requiring complicated physics like neutrino transport. SpECTRE is a next-generation open-source (github.com/sxs-collaboration/spectre/) code designed with modern algorithms and computer science practices in order to take advantage of exascale supercomputers. We will discuss how SpECTRE uses the open-source task-based parallelization framework Charm++ (github.com/UIUC-PPL/charm/) to realize task-based parallelism. We will provide details on how our discontinuous Galerkin-finite-difference hybrid and elliptic solver algorithms are translated into the language of task-based parallelism, including the use of SIMD intrinsics and lazy evaluation. Time permitting, we will also provide an overview of our tensor library that allows scientists to write equations in a domain specific language, that of general relativity/gravity.

- Wednesday, 9 Mar 2022, at 15:00, Rosa Badia, Barcelona Supercomputing Center, Title: Parallel programming with PyCOMPSs and the dislib ML library
**Abstract:** The seminar will present our group's research on parallel programming models, more specifically on PyCOMPSs. PyCOMPSs is a task-based programming model for distributed computing platforms. In the seminar we will present the basics of the programming model and of its runtime. The seminar will also include some of our recent research work on the parallelization of machine learning with the dislib library. Dislib is a distributed, parallel machine learning library that offers a syntax inspired by scikit-learn and is parallelized with PyCOMPSs.

- Wednesday, 26 Jan 2022, at 15:00, Linus Seelinger, Heidelberg University, Title:
*Bridging the gap: Advanced uncertainty quantification and challenging models*
**Abstract:** Simulations of complex real-world processes, often by means of partial differential equations, lead to computational challenges. There is a rich ecosystem of methods and software packages to address these. The treatment of uncertainties in these models further increases problem dimensionality. However, the software ecosystem in the field of uncertainty quantification (UQ) is far less mature.

This talk addresses the resulting gap between advanced UQ methods and challenging models in three ways:

– An introduction to the MIT Uncertainty Quantification library (MUQ) is given. MUQ provides a modular framework for building UQ applications, offering numerous existing methods and reusable components.

– A new universal interface for coupling UQ and model software is presented. Based on a simple HTTP protocol, it fully decouples development on both sides, while containerization provides portability.

– A parallelized multilevel UQ method for high-performance applications developed in MUQ is demonstrated using the example of a large-scale tsunami model.

- Wednesday, 12 Jan 2022, at 15:00, Johannes Doerfert, Argonne National Laboratory, Title:
*OpenMP in LLVM — Behind the Pragmas*
**Abstract:** OpenMP in LLVM is more than the directive handling in the frontends Clang and Flang. LLVM has shipped with various OpenMP-specific compiler optimizations for a while now, and more are to come. There is a myriad of OpenMP runtimes to orchestrate portable accelerator offloading behind the scenes and to provide improved value beyond the OpenMP specification.

In the talk we will explain how different implementation choices impact user experience and performance, either explicitly or due to their interaction with optimizations. In addition to best practices, participants will learn how to interact with the LLVM/Clang compiler to determine how OpenMP directives were implemented and optimized.

Finally, we will give a brief overview of current developments in the LLVM/OpenMP world and how they can enable future exploration.

The slides are available here.

- Wednesday, 17 Nov 2021, at 15:00, Hermann Haverkort, University of Bonn,
*Space-filling curves for tetrahedral meshes*
**Abstract**: With computations on adaptively refined meshes, one challenge is to achieve and maintain a good load balancing over multiple processors. A relatively simple and effective solution can be found in using a mesh that follows a fixed recursive tessellation, along with a space-filling curve that visits the tiles of this tessellation one by one. The decision how to cut the two- or higher-dimensional mesh into pieces now reduces to the much simpler decision where to cut the curve. If the curve is well-designed, contiguous sections of the curve are guaranteed to form well-shaped parts of the mesh, with well-shaped boundaries that enable efficient communication between the processors handling such sections.

To generate and process meshes of triangles, squares, or cubes, there are a number of well-known space-filling curves with favourable properties. For meshes of tetrahedra or even higher-dimensional simplices the situation is more complicated: the question how to best generalise Polya’s triangle-filling curve to higher dimensions is still open. In this presentation I will present and propose several options, discuss their main properties and explain remaining challenges and open questions.

- Wednesday, 3 Nov 2021, at 15:00, Joseph Schuchart, University of Tennessee, Knoxville,
*Template Task Graphs for Irregular Task-based Applications*
**Abstract:** MPI and OpenMP still form the dominant programming paradigms for distributed and shared-memory programming and are commonly used in combination. However, more modern, C++ oriented approaches are gaining interest in the community. In this talk, I will present the Template Task Graphs (TTG), a C++-based approach that aims at providing a distributed task-based programming model that is both efficient and composable. By forming an abstract representation of the task-graph, TTG allows for the dynamic unrolling of task-graphs without prior knowledge of their exact shape and is thus especially suitable for irregular applications. I will present the current state of the model and first performance results on benchmarks resembling real-world target applications.

- Wednesday, 27 Oct 2021, at 15:00, Nicole Beisiegel, TU Dublin, An Adaptive Discontinuous Galerkin Model for Coastal Flood Simulations
**Abstract:** Coastal flooding is an inherently complex phenomenon. This poses challenges for computer models with respect to computational efficiency, spatial resolution, or accuracy. In this talk, we will look at an adaptive discontinuous Galerkin (DG) model to simulate storm surge and coastal flooding more generally. A number of idealised testcases demonstrate the model’s performance. The adaptive, triangular mesh is driven by heuristic, or application-based refinement indicators. The discussion of the model’s computational efficiency will be guided by efficiency metrics that we define and apply to model results.

This is joint work with J. Behrens (U Hamburg) and C.E. Castro (U Tarapaca).

- Wednesday, 6 Oct 2021, at 15:00, Edmond Chow, Georgia Tech,
*Introduction to Asynchronous Iterative Solvers*
**Abstract:** The standard iterative methods for solving linear and nonlinear systems of equations are all synchronous, meaning that in the parallel execution of these methods where some processors may complete an iteration before other processors (for example, due to load imbalance), the fastest processors must wait for the slowest processors before continuing to the next iteration. This talk will discuss parallel iterative methods that operate asynchronously, meaning that the processors never wait for each other, but instead proceed using whatever iterate values are already available from other processors. Processor idle time is thus eliminated, but questions arise about the convergence of these methods. Asynchronous iterative methods will be introduced using simple fixed-point iterative methods for linear systems, before discussing asynchronous versions of rapidly converging methods, in particular, second order Richardson, and multigrid methods.

### Academic year 2020/2021

- Friday, 27 Aug 2021, at 13:00,
**Adam Tuft**, Durham University, *A Tour of OMPT and Otter for Tracing Task-Centric OpenMP Programs*
**Abstract:** Reasoning about the structure of task-based programs, while vital for understanding their performance, is challenging for complex programs exhibiting irregular or nested tasking. The new OpenMP Tools (OMPT) interface defines event-driven callbacks allowing tools to gather rich runtime data on task creation and synchronisation. An example of such a tool is Otter (https://github.com/adamtuft/otter) which performs event-based tracing of OpenMP programs through OMPT, allowing the task-based structure of a target program to be recovered. This 30-minute presentation will give a brief tour of OMPT and will demonstrate its utility with examples provided by Otter.

- Thursday, 15 Jul 2021 09:30,
**Georg Hager**, *Performance counter analysis with Likwid and single node performance assessment*

- Friday, Jun 25, 2021, at 13:00, Thomas Weinhart, University of Twente,
*Automated calibration for discrete particle simulations*
**Abstract:** The Discrete Particle Method (DPM) captures the collective behaviour of a granular material by simulating the kinematics of the individual grains. DPM can provide valuable insights that are difficult to obtain with either experiments or continuum models. However, calibrating the parameters of a DPM model against experimental measurements often takes significant effort and expertise, since automated and systematic calibration techniques are lacking.

I will present an automated calibration technique, based on Bayesian filtering: We conduct experimental measurements to determine the material response, then simulate the same process in MercuryDPM, our open-source DPM solver [1], and measure the response of the simulated process. Then we apply a numerical optimisation algorithm to find the DPM parameters for which the response of the experiments and simulations match. This optimisation is done using a probabilistic optimisation technique called GrainLearning [2]. The technique can find local optima in only two to three iterations, even for complex contact models with many microscopic parameters.

The technique has already been used in several projects, yielding good results. We present two test cases, one for calibrating a sintering model for 3D printing processes and one for calibrating the bulk response of a sheared granular material, and discuss the results.

References:

[1] Weinhart, T., Orefice, L., Post, M., et al., Fast, flexible particle simulations – An introduction to MercuryDPM, Computer Physics Communications, 249, 107129 (2020).

[2] Cheng, H., Shuku, T., Thoeni, K., et al., Probabilistic calibration of discrete element simulations using the sequential quasi-Monte Carlo filter, Granular Matter 20, 11 (2018).

- Friday, 18 Jun 2021, at 13:00, Alexander Moskovsky, Moscow State University, RSC Group,
*Energy efficiency in HPC*
**Abstract:** High performance computing (HPC) is a pinnacle of modern computer engineering with regard to both software and hardware components. Nowadays, HPC systems aggregate a large number of components: nodes, accelerators, storage devices and so on. One of the most acute problems on the hardware side is the rapid growth of energy dissipation and density: modern supercomputers consume megawatts of electric power. On the software side, the system software has to support the configuration of multiple components in concert.

The RSC Group has been a pioneer in liquid cooling for HPC solutions since the beginning of 2010. Liquid cooling enables a 30-40% reduction of total supercomputer power consumption in comparison to forced airflow cooling. At the same time, liquid cooled HPC systems can be much more compact. RSC also develops a software stack that enables on-demand configurations for both computational and storage systems, with support for Lustre, Intel DAOS and other filesystems. Such hyperconverged systems in HPC offer compactness and uniformity in hardware, but they require a software orchestrator to support their disaggregated architecture.

RSC develops systems in close collaboration with its academic partners that inspire and motivate many solutions implemented in production. End-users from the Russian Academy of Sciences, major Russian universities and research organisations tackle a wide spectrum of problems ranging from high energy physics to life sciences. RSC systems are present in supercomputer ratings like Top500, Green500, HPCG, IO500 and occupy 25% of the Russian Top50 list of the most powerful computing systems.

- Friday, Jun 4, 2021, at 13:00, Benjamin Uekermann, University of Stuttgart,
*preCICE 2 – A Sustainable and User-Friendly Coupling Library*
**Abstract:** In the last five years, we have turned the coupling library preCICE from a working prototype to a sustainable and user-friendly community software project.

In this presentation, I want to tell you about the challenges, success stories, and struggles of this endeavor, besides a brief introduction to the software itself. In particular, I cover documentation, tutorials, testing, integration with external simulation software, funding, and community building. Read more on https://precice.org/.

- Friday, 14 May 2021, at 13:00, Nicole Aretz, RWTH Aachen, Title:
*Sensor selection for linear Bayesian inverse problems with variable model configurations*
**Abstract:** In numerical simulations, mathematical models such as partial differential equations are widely used to predict the behavior of a physical system. The uncertainty in the prediction caused by unknown parameters can be decreased by incorporating measurement data: by means of Bayesian inversion a posterior probability distribution can be obtained that updates prior information on the uncertain parameters. As experimental data can be expensive, sensor positions need to be chosen carefully to obtain informative data despite a limited budget.

In this talk we consider a group of forward models which are characterized through different configurations of the physical system. The configuration is a non-linear influence on the solution, e.g. the geometry or material of an individual work piece in a production chain. Our goal is to choose one set of sensors for the estimation of an uncertain linear influence whose measurement data is informative for all possible configurations. To this end, we identify an observability coefficient that links the experimental design to the covariance of the posterior. We then present a sequential sensor selection algorithm that improves the observability coefficient uniformly for all configurations. Computational feasibility is achieved through model order reduction. In particular, we discuss opportunities and challenges to decrease the computational cost of the inverse problem via the reduced basis method. We demonstrate our results on steady-state heat conduction problems for a thermal block and a geothermal model of the Perth Basin in Western Australia.

- Thursday, 8 Apr 2021, at 13:00,
**Joachim Protze**, Title: Asynchronous MPI communication with OpenMP tasks
**Abstract:** Your communication depends on computation results as input? Your computation task depends on data to arrive from a different process? OpenMP task dependencies should allow you to express such dependencies. OpenMP 5.0 introduced detached tasks. In combination with MPI detached communication [1], this allows building task dependency graphs across MPI processes. In this short presentation you will learn how you can integrate MPI detached communication into your project and profit from real asynchronous communication. If you don’t want to use OpenMP tasks, the same approach will also work with C++ futures/promises.

- Friday, 12 Mar 2021, at 13:00,
**Tim Dodwell**, Alan Turing Institute, University of Exeter, Title: Adaptive Multilevel Delayed Acceptance
**Abstract:** Uncertainty Quantification through Markov Chain Monte Carlo (MCMC) can be prohibitively expensive for target probability densities with expensive likelihood functions, for instance when the evaluation involves solving a Partial Differential Equation (PDE), as is the case in a wide range of engineering applications. Multilevel Delayed Acceptance (MLDA) with an Adaptive Error Model (AEM) is a novel approach, which alleviates this problem by exploiting a hierarchy of models, with increasing complexity and cost, and correcting the inexpensive models on-the-fly. The method has been integrated within the open-source probabilistic programming package PyMC3 and is available in the latest development version.

- Friday, 5 Feb 2021, at 13:00,
**Andy Davis**, Courant Institute, Title: Super-parameterized numerical methods for the Boltzmann equation modeling Arctic sea ice dynamics
**Abstract:** We devise a super-parameterized sea ice model that captures dynamics at multiple spatial and temporal scales. Arctic sea ice contains many ice floes (chunks of ice) whose macro-scale behavior is driven by oceanic/atmospheric currents and floe-floe interaction. There is no characteristic floe size and, therefore, accurately modeling sea ice dynamics requires a multi-scale approach. Our two-tiered model couples basin-scale conservation equations with small-scale particle methods. Unlike many other sea ice models, we do not average quantities of interest (e.g., mass/momentum) over a representative volume element. Instead, we explicitly model small-scale dynamics using the Boltzmann equation, which evolves a probability distribution over position and velocity. In practice, existing numerical methods approximating the Boltzmann equation are computationally intractable when modeling Arctic basin scale dynamics. Our approach decomposes the density function into a mass density that models how ice is distributed in the spatial domain and a velocity density that models the small-scale variation in velocity at a given location. The mass density and macro-scale expected velocity evolve according to a hyperbolic conservation equation. However, the flux term depends on expectations with respect to the velocity density at each spatial point. We, therefore, use particle methods to simulate the conditional density at key locations. We make each particle method independent using a local change of variables that defines micro-scale coordinates. We model small-scale ice dynamics (e.g., collision) in this transformed domain.