# Seminar Series

The Scientific Computing seminar series usually takes place on Wednesdays at 15:00. Join via Zoom or in person in the VisLab (MCS1022).

To sign up for the seminar mailing list, please email [email protected].

Contact: [email protected]

## Upcoming events

*(Talk takes place in **RH001**)*

- Wednesday, May 11, 15:00, Spencer Sherwin, Imperial College London, Title: Industry-Relevant Implicit Large-Eddy Simulation of Flows past Automotive and Racing Cars using Spectral/hp Element Methods
**Abstract**: We present the successful deployment of high-fidelity Large-Eddy Simulation (LES) technologies based on spectral/hp element methods to industrial flow problems that are characterized by high Reynolds numbers and complex geometries. In particular, we describe the steps required to perform the implicit LES of realistic automotive and racing cars. Notable developments had to be made to overcome obstacles in both mesh generation and solver technologies to simulate these flows, and these will be outlined in this presentation. We thereby hope to demonstrate a viable pathway to translate academic developments into industrial tools that can advance the analysis and design capabilities of high-end engineering users.
- Wednesday, May 18, 15:00, Fabian Knorr, University of Innsbruck, Title: Floating-point compressors
**Abstract:** Storing and exchanging large amounts of floating-point data is common in distributed scientific computing applications. Data compression, when fast enough, can speed up such workloads by reducing contention on interconnects and storage systems. This talk explores two classes of floating-point compressors, the lossy and the lossless type, and discusses their utility on modern parallel and accelerator-based systems. We show which approach is best suited for which problem formulation and take a close look at ndzip, a lossless compressor for dense multi-dimensional data that is specifically engineered to achieve maximum efficiency on GPU-accelerated hardware.
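The transform-then-encode structure shared by many lossless floating-point compressors can be sketched in a few lines. The toy scheme below is an illustration only, not ndzip's actual algorithm: it XORs each value's bit pattern with its predecessor's and stores the residual with a one-byte length prefix, so smooth data produces short codes while the roundtrip stays bit-exact.

```python
# Toy lossless float compressor: XOR-delta transform + per-value
# byte-length prefix encoding. Not ndzip's actual scheme, which uses
# a more sophisticated transform and GPU-friendly encoding.
import struct

def compress(values):
    out, prev = bytearray(), 0
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        residual = bits ^ prev          # neighbouring values share high bits
        prev = bits
        nbytes = (residual.bit_length() + 7) // 8
        out.append(nbytes)              # 1-byte length prefix
        out += residual.to_bytes(nbytes, "little")
    return bytes(out)

def decompress(blob):
    values, prev, i = [], 0, 0
    while i < len(blob):
        nbytes = blob[i]; i += 1
        residual = int.from_bytes(blob[i:i + nbytes], "little"); i += nbytes
        prev ^= residual                # invert the XOR-delta transform
        values.append(struct.unpack("<d", struct.pack("<Q", prev))[0])
    return values

data = [20.0 + 0.01 * k for k in range(1000)]   # smooth sample field
assert decompress(compress(data)) == data        # bit-exact roundtrip
```

Because the roundtrip is bit-exact this is a lossless scheme; a lossy compressor, the other class discussed in the talk, would instead quantize within an error bound before encoding.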

### Events Next Term

- Nils Wedi, ECMWF, Title: TBD. This talk has been moved to the next academic year and will take place sometime in October.

- Jenny Jenkins, Durham University, Title: *What in Earth are Ultra-low Velocity Zones? The advances and continuing difficulties of imaging small-scale deep structure within our planet*
**Abstract**: Within the complex landscape of the Earth’s core-mantle boundary, no feature is more extreme, or more poorly understood, than ultra-low velocity zones (ULVZs). While most material within the Earth causes changes in earthquake wave speeds of only several percent, these small piles of material (100s of km wide by 10s of km high) show reductions of 10-50%. While several hypotheses have been proposed to explain these extreme observations, what exactly ULVZs are made of, and what effect these small piles of material have on surrounding mantle processes, is still an open question. In this talk I’ll present new data that was used to map out the ULVZ that sits beneath Hawaii and discuss some of the current limitations of global high-frequency full-waveform modelling that make it difficult to interpret and model these observations.

### Past events

- Wednesday, May 4, 15:00, Joshua Short, Boston Limited, Title: NVIDIA and The Emergence of the Virtual World – Exploring The Omniverse, Digital Twins, and Cloud XR
**Abstract:** With virtual reality becoming more integrated into the fabric of human existence, it’s evident that virtual existence is evolving and altering the social elements of life in new ways. But what benefits can virtual reality present to the world on a research or commercial level?

We will be discussing how the Omniverse, Digital Twins, and Cloud XR technologies are being used in a variety of use cases, and how they can positively impact the future of not just virtual reality but the physical world too.

- Wednesday, April 27, 15:00, Carola Kruse, CERFACS, Title: On the efficient solution of saddle point systems with an inner-outer iterative solver based on the Golub-Kahan bidiagonalization
**Abstract**: Symmetric indefinite linear systems of saddle point type arise in a large variety of applications, for example in fluid dynamics or structural mechanics. In this talk, we will review an inner-outer iterative solver for saddle point problems that is based on the Golub-Kahan bidiagonalization. The difficulty in the proposed solver is that in each outer loop an inner iterative system, with a matrix M, say, of the size of the (1,1)-block, has to be solved. If M arises from the discretization of a partial differential equation, efficient solvers might be available. Here, we will focus on the Stokes equation as a test problem and present different strategies for reducing the overall number of inner iterations and the computation time.
- Wednesday, March 23, 15:00, Nils Deppe, California Institute of Technology & Lawrence Kidder, Cornell University, Title: SpECTRE: A task-based framework for astrophysics
**Abstract**: Astrophysical phenomena vary greatly in spatial and temporal scales while also requiring complicated physics like neutrino transport. SpECTRE is a next-generation open-source code (github.com/sxs-collaboration/spectre/) designed with modern algorithms and computer science practices in order to take advantage of exascale supercomputers. We will discuss how SpECTRE uses the open-source task-based parallelization framework Charm++ (github.com/UIUC-PPL/charm/) to realize task-based parallelism. We will provide details on how our discontinuous Galerkin-finite-difference hybrid and elliptic solver algorithms are translated into the language of task-based parallelism, including the use of SIMD intrinsics and lazy evaluation. Time permitting, we will also provide an overview of our tensor library that allows scientists to write equations in a domain-specific language, that of general relativity/gravity.

- Wednesday, March 9 **(updated date)**, 15:00, Rosa Badia, Barcelona Supercomputing Center, Title: Parallel programming with PyCOMPSs and the dislib ML library
**Abstract:** The seminar will present our group’s research on parallel programming models, more specifically on PyCOMPSs. PyCOMPSs is a task-based programming model for distributed computing platforms. In the seminar we will present the basics of the programming model and of its runtime. The seminar will also include some of our recent research work on the parallelization of machine learning with the dislib library. Dislib is a distributed, parallel machine learning library that offers a syntax inspired by scikit-learn and is parallelized with PyCOMPSs.
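PyCOMPSs marks functions with its own `@task` decorator and needs a distributed runtime, so the snippet below is only a shared-memory analogy of the task-based idea, built on Python's standard library rather than PyCOMPSs' actual API: each submitted call returns a future, and feeding one future's result into another call expresses a dependency edge of the task graph.

```python
# Analogy of task-based execution using the standard library
# (not PyCOMPSs' API). Independent tasks run concurrently; passing a
# future's result into a later call makes that call depend on it.
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def add(a, b):
    return a + b

with ThreadPoolExecutor(max_workers=4) as pool:
    # two independent tasks: the pool may run these in parallel
    f1 = pool.submit(square, 3)
    f2 = pool.submit(square, 4)
    # dependent task: .result() blocks until its inputs are ready
    f3 = pool.submit(add, f1.result(), f2.result())
    print(f3.result())  # 25
```

In PyCOMPSs the same dependency tracking happens automatically from the decorated functions' arguments, and tasks are scheduled across the nodes of a distributed platform instead of local threads.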

- Wednesday, January 26, 15:00, Linus Seelinger, Heidelberg University, Title: *Bridging the gap: Advanced uncertainty quantification and challenging models*
**Abstract:** Simulations of complex real-world processes, often by means of partial differential equations, lead to computational challenges. There is a rich ecosystem of methods and software packages to address these. The treatment of uncertainties in these models further increases problem dimensionality. However, the software ecosystem in the field of uncertainty quantification (UQ) is far less mature.

This talk addresses the resulting gap between advanced UQ methods and challenging models in three ways:

– An introduction to the MIT Uncertainty Quantification library (MUQ) is given. MUQ provides a modular framework for building UQ applications, offering numerous existing methods and reusable components.

– A new universal interface for coupling UQ and model software is presented. Based on a simple HTTP protocol, it fully decouples development on both sides, while containerization provides portability.
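A minimal sketch of such an HTTP coupling (with a hypothetical endpoint and JSON field names; the actual protocol presented in the talk may differ): the model is exposed as a JSON-over-HTTP service, so the UQ side only needs an HTTP client, regardless of the language or container the model lives in.

```python
# Sketch of coupling a UQ client to a model via HTTP + JSON.
# Endpoint shape and field names ("input"/"output") are illustrative.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        request = json.loads(self.rfile.read(length))
        # toy forward model: sum of squares of the input parameters
        output = [sum(x * x for x in request["input"])]
        body = json.dumps({"output": output}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# port 0 picks a free port; daemon thread dies with the interpreter
server = HTTPServer(("127.0.0.1", 0), ModelHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def evaluate(params):
    """Client side: what a UQ package would call for each sample."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{server.server_port}",
        data=json.dumps({"input": params}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["output"]

print(evaluate([3.0, 4.0]))  # [25.0]
```

Because the interface is just HTTP, the model process can run in a container on different hardware, and the UQ method never links against the model code.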

– A parallelized multilevel UQ method for high-performance applications, developed in MUQ, is demonstrated using the example of a large-scale tsunami model.

- Wednesday, **January 12**, 15:00, Johannes Doerfert, Argonne National Laboratory, Title: *OpenMP in LLVM — Behind the Pragmas*

**Abstract:** OpenMP in LLVM is more than the directive handling in the frontends Clang and Flang. LLVM has shipped with various OpenMP-specific compiler optimizations for a while now, and more are to come. There is a myriad of OpenMP runtimes that orchestrate portable accelerator offloading behind the scenes and provide improved value beyond the OpenMP specification.

In the talk we will explain how different implementation choices impact user experience and performance, either explicitly or due to their interaction with optimizations. In addition to best practices, participants will learn how to interact with the LLVM/Clang compiler to determine how OpenMP directives were implemented and optimized.

Finally, we will give a brief overview of current developments in the LLVM/OpenMP world and how they can enable future exploration.

The slides are available here.

- Wednesday, Nov 17, 15:00, Hermann Haverkort, University of Bonn, Title: *Space-filling curves for tetrahedral meshes*
**Abstract**: With computations on adaptively refined meshes, one challenge is to achieve and maintain good load balancing over multiple processors. A relatively simple and effective solution is to use a mesh that follows a fixed recursive tessellation, along with a space-filling curve that visits the tiles of this tessellation one by one. The decision of how to cut the two- or higher-dimensional mesh into pieces then reduces to the much simpler decision of where to cut the curve. If the curve is well designed, contiguous sections of the curve are guaranteed to form well-shaped parts of the mesh, with well-shaped boundaries that enable efficient communication between the processors handling such sections.

To generate and process meshes of triangles, squares, or cubes, there are a number of well-known space-filling curves with favourable properties. For meshes of tetrahedra or even higher-dimensional simplices the situation is more complicated: the question of how best to generalise Polya’s triangle-filling curve to higher dimensions is still open. In this presentation I will present and propose several options, discuss their main properties, and explain remaining challenges and open questions.

- Wednesday, Nov 3, 2021 15:00, Joseph Schuchart, University of Tennessee, Knoxville, Title:
*Template Task Graphs for Irregular Task-based Applications*
**Abstract:** MPI and OpenMP still form the dominant programming paradigms for distributed and shared-memory programming and are commonly used in combination. However, more modern, C++-oriented approaches are gaining interest in the community. In this talk, I will present Template Task Graphs (TTG), a C++-based approach that aims to provide a distributed task-based programming model that is both efficient and composable. By forming an abstract representation of the task graph, TTG allows for the dynamic unrolling of task graphs without prior knowledge of their exact shape and is thus especially suitable for irregular applications. I will present the current state of the model and first performance results on benchmarks resembling real-world target applications.

- Wednesday, Oct 27, 2021 15:00, Nicole Beisiegel, TU Dublin, Title: An Adaptive Discontinuous Galerkin Model for Coastal Flood Simulations
**Abstract:** Coastal flooding is an inherently complex phenomenon. This poses challenges for computer models with respect to computational efficiency, spatial resolution, and accuracy. In this talk, we will look at an adaptive discontinuous Galerkin (DG) model to simulate storm surge and coastal flooding more generally. A number of idealised test cases demonstrate the model’s performance. The adaptive, triangular mesh is driven by heuristic or application-based refinement indicators. The discussion of the model’s computational efficiency will be guided by efficiency metrics that we define and apply to model results.

This is joint work with J. Behrens (U Hamburg) and C.E. Castro (U Tarapaca).

- Wednesday, Oct 6, 2021 15:00, Edmond Chow, Georgia Tech, Title: **Introduction to Asynchronous Iterative Solvers**
**Abstract:** The standard iterative methods for solving linear and nonlinear systems of equations are all synchronous, meaning that in the parallel execution of these methods, where some processors may complete an iteration before others (for example, due to load imbalance), the fastest processors must wait for the slowest before continuing to the next iteration. This talk will discuss parallel iterative methods that operate asynchronously, meaning that the processors never wait for each other but instead proceed using whatever iterate values are already available from other processors. Processor idle time is thus eliminated, but questions arise about the convergence of these methods. Asynchronous iterative methods will be introduced using simple fixed-point iterative methods for linear systems, before discussing asynchronous versions of rapidly converging methods, in particular second-order Richardson and multigrid methods.
- Friday, Aug 27, 2021 13:00, **Adam Tuft**, Durham University, Title: *A Tour of OMPT and Otter for Tracing Task-Centric OpenMP Programs*
**Abstract:** Reasoning about the structure of task-based programs, while vital for understanding their performance, is challenging for complex programs exhibiting irregular or nested tasking. The new OpenMP Tools (OMPT) interface defines event-driven callbacks allowing tools to gather rich runtime data on task creation and synchronisation. An example of such a tool is Otter (https://github.com/adamtuft/otter), which performs event-based tracing of OpenMP programs through OMPT, allowing the task-based structure of a target program to be recovered. This 30-minute presentation will give a brief tour of OMPT and will demonstrate its utility with examples provided by Otter.

- Thursday, Jul 15, 2021 09:30, **Georg Hager**, Title: *Performance counter analysis with Likwid and single node performance assessment*
- Friday, Jun 25, 2021 13:00, Thomas Weinhart, University of Twente, Title:
*Automated calibration for discrete particle simulations*
**Abstract:** The Discrete Particle Method (DPM) captures the collective behaviour of a granular material by simulating the kinematics of the individual grains. DPM can provide valuable insights that are difficult to obtain with either experiments or continuum models. However, calibrating the parameters of a DPM model against experimental measurements often takes significant effort and expertise, since automated and systematic calibration techniques are lacking.

I will present an automated calibration technique based on Bayesian filtering: we conduct experimental measurements to determine the material response, then simulate the same process in MercuryDPM, our open-source DPM solver [1], and measure the response of the simulated process. We then apply a numerical optimisation algorithm to find the DPM parameters for which the responses of the experiments and simulations match. This optimisation is done using a probabilistic optimisation technique called GrainLearning [2]. The technique can find local optima in only two to three iterations, even for complex contact models with many microscopic parameters.

The technique has already been used in several projects, yielding good results. We present two test cases, one calibrating a sintering model for 3D printing processes and one calibrating the bulk response of a sheared granular material, and discuss the results.

References:
[1] Weinhart, T., Orefice, L., Post, M., et al., Fast, flexible particle simulations – An introduction to MercuryDPM, Computer Physics Communications, 249, 107129 (2020).
[2] Cheng, H., Shuku, T., Thoeni, K., et al., Probabilistic calibration of discrete element simulations using the sequential quasi-Monte Carlo filter, Granular Matter 20, 11 (2018).
- Friday, Jun 18, 2021 13:00, Alexander Moskovsky, Moscow State University, RSC Group, Title:
*Energy efficiency in HPC*
**Abstract:** High performance computing (HPC) systems are a pinnacle of modern computer engineering regarding both software and hardware components. Nowadays, they aggregate a large number of components: nodes, accelerators, storage devices and so on. One of the most acute problems on the hardware side is the rapid growth of energy dissipation and density: modern supercomputers consume megawatts of electric power. On the software side, the system software has to support the configuration of multiple components in concert.

The RSC Group has been a pioneer in liquid cooling for HPC solutions since the beginning of 2010. Liquid cooling enables a 30-40% reduction of total supercomputer power consumption in comparison to forced-airflow cooling. At the same time, liquid-cooled HPC systems can be much more compact. RSC also develops a software stack that enables on-demand configurations for both the computational and storage systems, with support for Lustre, Intel DAOS and other filesystems. Such hyperconverged systems in HPC offer compactness and uniformity in hardware, but they require a software orchestrator to support their disaggregated architecture.

RSC develops systems in close collaboration with its academic partners, who inspire and motivate many solutions implemented in production. End-users from the Russian Academy of Sciences, major Russian universities and research organisations tackle a wide spectrum of problems ranging from high energy physics to life sciences. RSC systems are present in supercomputer ratings like Top500, Green500, HPCG and IO500, and occupy 25% of the Russian Top50 list of the most powerful computing systems.

- Friday, Jun 4, 2021 13:00, Benjamin Uekermann, University of Stuttgart, Title:
*preCICE 2 – A Sustainable and User-Friendly Coupling Library*
**Abstract:** In the last five years, we have turned the coupling library preCICE from a working prototype into a sustainable and user-friendly community software project. In this presentation, I want to tell you about the challenges, success stories, and struggles of this endeavor, alongside a brief introduction to the software itself. In particular, I cover documentation, tutorials, testing, integration with external simulation software, funding, and community building. Read more at https://precice.org/.

- Friday, May 14, 2021 13:00, Nicole Aretz, RWTH Aachen, Title:
*Sensor selection for linear Bayesian inverse problems with variable model configurations*
**Abstract:** In numerical simulations, mathematical models such as partial differential equations are widely used to predict the behavior of a physical system. The uncertainty in the prediction caused by unknown parameters can be decreased by incorporating measurement data: by means of Bayesian inversion, a posterior probability distribution can be obtained that updates prior information on the uncertain parameters. As experimental data can be expensive, sensor positions need to be chosen carefully to obtain informative data despite a limited budget.

In this talk we consider a group of forward models which are characterized through different configurations of the physical system. The configuration is a non-linear influence on the solution, e.g. the geometry or material of an individual work piece in a production chain. Our goal is to choose one set of sensors for the estimation of an uncertain linear influence whose measurement data is informative for all possible configurations. To this end, we identify an observability coefficient that links the experimental design to the covariance of the posterior. We then present a sequential sensor selection algorithm that improves the observability coefficient uniformly for all configurations. Computational feasibility is achieved through model order reduction. In particular, we discuss opportunities and challenges to decrease the computational cost of the inverse problem via the reduced basis method. We demonstrate our results on steady-state heat conduction problems for a thermal block and a geothermal model of the Perth Basin in Western Australia.

- Thursday, Apr 8, 2021 13:00,
**Joachim Protze**, Title: Asynchronous MPI communication with OpenMP tasks
**Abstract:** Does your communication depend on computation results as input? Does your computation task depend on data arriving from a different process? OpenMP task dependencies allow one to express such dependencies. OpenMP 5.0 introduced detached tasks; in combination with MPI detached communication [1], this allows task dependency graphs to be built across MPI processes. In this short presentation you will learn how to integrate MPI detached communication into your project and benefit from truly asynchronous communication. If you don’t want to use OpenMP tasks, the same approach also works with C++ futures/promises.

Zoom link for this session: https://durhamuniversity.zoom.us/j/97425330730?pwd=Ti92aXRKSXRmN2FPZmNTazdoVEl0QT09

- Friday, Mar 12, 2021 13:00,
**Tim Dodwell**, Alan Turing Institute, University of Exeter, Title: Adaptive Multilevel Delayed Acceptance
**Abstract:** Uncertainty quantification through Markov chain Monte Carlo (MCMC) can be prohibitively expensive for target probability densities with expensive likelihood functions, for instance when the evaluation involves solving a partial differential equation (PDE), as is the case in a wide range of engineering applications. Multilevel Delayed Acceptance (MLDA) with an Adaptive Error Model (AEM) is a novel approach which alleviates this problem by exploiting a hierarchy of models, with increasing complexity and cost, and correcting the inexpensive models on the fly. The method has been integrated within the open-source probabilistic programming package PyMC3 and is available in the latest development version.
- Friday, Feb 5, 2021 13:00,
**Andy Davis**, Courant Institute, Title: Super-parameterized numerical methods for the Boltzmann equation modeling Arctic sea ice dynamics
**Abstract:** We devise a super-parameterized sea ice model that captures dynamics at multiple spatial and temporal scales. Arctic sea ice contains many ice floes—chunks of ice—whose macro-scale behavior is driven by oceanic/atmospheric currents and floe-floe interaction. There is no characteristic floe size and, therefore, accurately modeling sea ice dynamics requires a multi-scale approach. Our two-tiered model couples basin-scale conservation equations with small-scale particle methods. Unlike many other sea ice models, we do not average quantities of interest (e.g., mass/momentum) over a representative volume element. Instead, we explicitly model small-scale dynamics using the Boltzmann equation, which evolves a probability distribution over position and velocity. In practice, existing numerical methods approximating the Boltzmann equation are computationally intractable when modeling Arctic basin scale dynamics. Our approach decomposes the density function into a mass density that models how ice is distributed in the spatial domain and a velocity density that models the small-scale variation in velocity at a given location. The mass density and macro-scale expected velocity evolve according to a hyperbolic conservation equation. However, the flux term depends on expectations with respect to the velocity density at each spatial point. We, therefore, use particle methods to simulate the conditional density at key locations. We make each particle method independent using a local change of variables that defines micro-scale coordinates. We model small-scale ice dynamics (e.g., collision) in this transformed domain.
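The Multilevel Delayed Acceptance method from Tim Dodwell's talk above builds on the classic two-stage delayed acceptance step, which can be sketched as follows (toy one-dimensional densities and a symmetric random-walk proposal, assumed for illustration; not the PyMC3 implementation from the talk):

```python
# Two-level delayed acceptance MCMC sketch. A cheap approximate
# posterior screens proposals first; the "expensive" density is
# evaluated only for proposals that pass the coarse test.
import math
import random

def log_post_fine(x):      # stand-in for the expensive target: N(0, 1)
    return -0.5 * x * x

def log_post_coarse(x):    # cheap, slightly wrong approximation
    return -0.5 * ((x - 0.1) / 1.2) ** 2

def delayed_acceptance_step(x, step=1.0, rng=random):
    y = x + rng.gauss(0.0, step)
    # Stage 1: Metropolis test against the cheap model only.
    if math.log(rng.random()) >= log_post_coarse(y) - log_post_coarse(x):
        return x  # rejected cheaply; fine model never evaluated
    # Stage 2: correct with the expensive model. The coarse ratio
    # enters inverted so the chain targets the fine posterior exactly.
    log_alpha = (log_post_fine(y) - log_post_fine(x)) \
              + (log_post_coarse(x) - log_post_coarse(y))
    return y if math.log(rng.random()) < log_alpha else x

rng = random.Random(0)
x, samples = 0.0, []
for _ in range(20000):
    x = delayed_acceptance_step(x, rng=rng)
    samples.append(x)
mean = sum(samples) / len(samples)  # close to 0, the fine-model mean
```

MLDA extends this two-level screening to a whole hierarchy of models and, with the Adaptive Error Model, corrects the coarse densities on the fly as fine-model evaluations accumulate.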