This presentation will describe how OpenMP is used at NERSC. NERSC is the primary supercomputing facility for Office of Science in the US Depart of Energy (DOE). Our next production system will be an Intel Xeon Phi Knights Landing (KNL) system, with 60+ cores per node and 4 hardware threads per core. The recommended programming model is hybrid MPI/OpenMP, which also promotes portability across different system architectures.
OpenMP usage statistics, such as the percentage of codes using OpenMP, typical number of threads used, etc., on current NERSC production systems will be analyzed. We will describe what we tell our users how to use OpenMP efficiently with multiple compilers on various NERSC systems, including how to obtain best process and thread affinity for hybrid MPI/OpenMP, memory locality with NUMA domains, programming tips for adding OpenMP, strategies for improving OpenMP scaling, how to use nested OpenMP, and tools available for OpenMP. Tuning examples with real scientific user codes will also be presented on improving OpenMP performance