Published on Dec 18, 2013
In this lecture, I describe how to port a larger application to run on a GPU using OpenACC, using the MG (multigrid) code from the NAS Parallel Benchmarks (NPB) as the example. I show how to use loop-level profiling with the Cray compiler and the CrayPAT tool to understand the application's structure and identify suitable accelerator kernels. I then port the entire code step by step, showing performance data and profiles for each step, and show how to identify and avoid common performance bottlenecks.
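As a rough illustration of the kind of step described above, here is a minimal sketch in C with OpenACC directives. It is not taken from the lecture: the function `smooth`, its three-point stencil, and the array names are hypothetical stand-ins for a single multigrid kernel. It also assumes that one bottleneck worth avoiding is repeated host-to-device copying, which an enclosing `data` region is commonly used to reduce. A compiler without OpenACC support ignores the pragmas and runs the loops on the host.

```c
/* Hypothetical 1D three-point smoother standing in for one MG kernel.
 * u   : solution array of length n (boundary points are left unchanged)
 * tmp : scratch array of length n
 */
void smooth(double *u, double *tmp, int n, int sweeps)
{
    /* Assumption: keeping both arrays resident on the device for all
     * sweeps avoids a host/device transfer around every kernel launch. */
    #pragma acc data copy(u[0:n]) create(tmp[0:n])
    {
        for (int s = 0; s < sweeps; ++s) {
            /* Kernel 1: average each interior point with its neighbours. */
            #pragma acc parallel loop present(u[0:n], tmp[0:n])
            for (int i = 1; i < n - 1; ++i)
                tmp[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0;

            /* Kernel 2: copy the smoothed interior values back into u. */
            #pragma acc parallel loop present(u[0:n], tmp[0:n])
            for (int i = 1; i < n - 1; ++i)
                u[i] = tmp[i];
        }
    }
}
```

Calling `smooth(u, tmp, n, sweeps)` with the `data` region removed would still give the same result, but each `parallel loop` would then typically move its arrays to and from the device on its own, which is the sort of cost a per-step profile can be expected to expose.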
Programming for GPUs Course: Introduction to OpenACC 2.0 & CUDA 5.5 - December 4-6, 2013