Evaluating ISO C++ Parallel Algorithms on Heterogeneous HPC Systems
International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems held in conjunction with Supercomputing (PMBS), 2022
Abstract
Recent revisions to the ISO C++ standard have added specifications for parallel algorithms. These additions cover common use-cases, including sequence traversal, reduction, and even sorting, many of which are highly applicable in HPC, and thus represent a potential for increased performance and productivity. This study evaluates the state of the art for implementing heterogeneous HPC applications using the latest built-in ISO C++17 parallel algorithms. We implement C++17 ports of representative HPC mini-apps that cover both compute-bound and memory bandwidth-bound applications. We then conduct benchmarks on CPUs and GPUs, comparing our ports to other widely-available parallel programming models, such as OpenMP, CUDA, and SYCL. Finally, we show that C++17 parallel algorithms are able to achieve competitive performance across multiple mini-apps on many platforms, with some notable exceptions. We also discuss several key topics, including productivity, and describe workarounds for a number of remaining issues, including index-based traversal and accelerator device/memory management.
@inproceedings{pmbs22-cpp,
author = {Lin, Wei-Chen and Deakin, Tom and McIntosh-Smith, Simon},
title = {{Evaluating ISO C++ Parallel Algorithms on Heterogeneous HPC Systems}},
booktitle = {{International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems held in conjunction with Supercomputing (PMBS)}},
year = {2022},
publisher = {{IEEE}},
keywords = {Conferences and Workshops},
doi = {10.1109/PMBS56514.2022.00009},
pdf = {https://research-information.bris.ac.uk/en/publications/evaluating-iso-c-parallel-algorithms-on-heterogeneous-hpc-systems}
}