GPU-STREAM: Benchmarking the achievable memory bandwidth of Graphics Processing Units

Abstract

Many scientific codes consist of memory-bandwidth-bound kernels: the dominant factor in their runtime is the speed at which data can be loaded from memory into the Arithmetic Logic Units. General-Purpose Graphics Processing Units (GPGPUs) and other accelerator devices such as the Intel Xeon Phi offer higher memory bandwidth than CPU architectures. However, as with CPUs, the peak memory bandwidth is often unachievable in practice, so benchmarks are required to measure a practical upper bound on expected performance. We present GPU-STREAM as an auxiliary tool to the standard STREAM benchmark, providing cross-platform comparable results for achievable memory bandwidth across multi- and many-core devices. Our poster will present the cross-platform validity of these results, along with a brief quantification of the effect of ECC memory on memory bandwidth.
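
As an illustration of how an achieved bandwidth figure is typically derived, the sketch below follows the byte-counting convention of the standard STREAM benchmark: each kernel is credited with one array-sized transfer per array it reads or writes. The function names, array size, and timing are hypothetical values chosen for illustration; they are not measurements or code from GPU-STREAM.

```python
# Arrays read or written per element by the four standard STREAM kernels.
ARRAYS_TOUCHED = {
    "copy": 2,   # c[i] = a[i]
    "scale": 2,  # b[i] = scalar * c[i]
    "add": 3,    # c[i] = a[i] + b[i]
    "triad": 3,  # a[i] = b[i] + scalar * c[i]
}


def achieved_bandwidth_gbs(kernel, n_elements, elem_bytes, seconds):
    """Return achieved bandwidth in GB/s (10^9 bytes per second)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    bytes_moved = ARRAYS_TOUCHED[kernel] * n_elements * elem_bytes
    return bytes_moved / seconds / 1e9


def fraction_of_peak(achieved_gbs, peak_gbs):
    """Return the achieved bandwidth as a fraction of the quoted peak."""
    return achieved_gbs / peak_gbs


if __name__ == "__main__":
    # Hypothetical run: triad over 2**25 doubles (8 bytes each) taking 4 ms.
    bw = achieved_bandwidth_gbs("triad", 2**25, 8, 0.004)
    print(f"triad: {bw:.1f} GB/s")
```

Comparing this figure with a device's quoted peak, for example via `fraction_of_peak`, gives the practical upper bound that the abstract refers to.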

Publication
International Conference for High Performance Computing, Networking, Storage and Analysis