Reviewing the Computational Performance of Structured and Unstructured Grid Deterministic SN Transport Sweeps on Many-core Architectures

In recent years the computer processors underpinning the large, distributed, workhorse computers used to solve the Boltzmann transport equation have become ever more parallel and diverse. Traditional CPU architectures have increased in core count, …

A performance analysis of the first generation of HPC-optimized Arm processors

In this paper, we present performance results from Isambard, the first production supercomputer to be based on Arm CPUs that have been optimized specifically for HPC. Isambard is the first Cray XC50 “Scout” system, combining Cavium ThunderX2 …

Evaluating attainable memory bandwidth of parallel programming models via BabelStream

Many scientific codes consist of memory bandwidth bound kernels. One major advantage of many-core devices such as general purpose graphics processing units (GPGPUs) and the Intel Xeon Phi is their focus on providing increased memory bandwidth over …

An Improved Parallelism Scheme for Deterministic Discrete Ordinates Transport

In this paper we demonstrate techniques for increasing the node-level parallelism of a deterministic discrete ordinates neutral particle transport algorithm on a structured mesh to exploit many-core technologies. Transport calculations form a large …