Reviewing the Computational Performance of Structured and Unstructured Grid Deterministic SN Transport Sweeps on Many-core Architectures


In recent years the computer processors underpinning the large, distributed, workhorse computers used to solve the Boltzmann transport equation have become ever more parallel and diverse. Traditional CPU architectures have increased in core count, reduced in clock speed and gained a deep memory hierarchy. Multiple processor vendors offer a collectively diverse range of both CPUs and GPUs, with the architectures used in the fastest machines in the world ever growing in diversity of many-core architectures. Going forward, the landscape of processor technologies will require our codes to function well across multiple architectures. This ever increasing range of architectures represents a unique challenge for solving the Boltzmann equation using deterministic methods in particular, and so it is important to characterize the performance of those key algorithms across the processor spectrum. The solution of the transport equation is computationally expensive, and so we require well optimized and highly parallel solver implementations in order to solve interesting problems quickly. In this work we explore the performance profiles of deterministic SN transport sweeps for both 3D structured (Cartesian) and unstructured (hexahedral) meshes. The study focuses on the characteristics of computational performance which are responsible for the actual performance of a transport solver.

Journal of Computational and Theoretical Transport (special issue)