There has been a recent influx of different processor architecture designs into the market, with many of them targeting HPC applications. When estimating application performance, developers are used to considering the most common figures of merit, …
We present a major update to the GPU-STREAM benchmark implementation, first shown at SC15. The original benchmark allowed comparison of achievable memory bandwidth performance through the STREAM kernels on OpenCL devices. GPU-STREAM v2.0 extends the …
Many scientific codes consist of memory-bandwidth-bound kernels: the dominating factor of the runtime is the speed at which data can be loaded from memory into the Arithmetic Logic Units. General-Purpose Graphics Processing Units (GPGPUs) and …
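To make the notion of a memory-bandwidth-bound kernel concrete, the following is a minimal sketch of a STREAM-style triad and the byte accounting conventionally used to turn its runtime into a bandwidth figure. It is not taken from the benchmark itself: the function names, the array length, and the scalar value are assumptions chosen for illustration. Interpreter overhead dominates in pure Python, so the number it reports illustrates the calculation and does not measure hardware bandwidth.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM-style triad: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes_moved(n, bytes_per_element=8):
    """Bytes counted for one triad pass: read b, read c, write a."""
    return 3 * n * bytes_per_element


def bandwidth_gb_per_s(num_bytes, seconds):
    """Convert bytes moved and elapsed seconds into GB/s (1 GB = 1e9 bytes)."""
    return num_bytes / seconds / 1e9


def measure_triad(n=100_000, scalar=0.4):
    """Time one triad pass over three double-precision arrays of length n.

    n and scalar are illustrative values, not benchmark defaults.
    """
    a = array("d", [0.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [1.0]) * n
    start = time.perf_counter()
    triad(a, b, c, scalar)
    elapsed = time.perf_counter() - start
    return a, bandwidth_gb_per_s(triad_bytes_moved(n, a.itemsize), elapsed)


if __name__ == "__main__":
    _, gb_per_s = measure_triad()
    print(f"triad: {gb_per_s:.3f} GB/s (interpreter-bound, illustrative only)")
```

The kernel performs one multiply and one add per element while moving three elements through memory, which is why its runtime on real hardware tracks memory bandwidth and not arithmetic throughput.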