Version 2 (modified by mikebell, 4 years ago)

NEMO HPC WG: Fri 30 Sept 2016

Attending: Lucien Anton (Cray), Jeremy Appleyard (NVIDIA), Mike Bell (Met Office, chair), Miguel Castrillo (BSC), Maff Glover (Met Office), Tim Graham (Met Office), Dmitry Kuts (Intel), Claire Levy (CNRS), Silvia Mocavero (CMCC), Andy Porter (STFC), Stan Posey (NVIDIA), Martin Schreiber (Uni Exeter), Julien Le Sommer (CNRS), Oriol Tinto (BSC)

1. Actions from previous meetings

Martin, Tim, Silvia, Claire, Miguel and Andy to agree details of the timing interface. Done

Tim and Silvia to gather information on the NEMO memory leak issue. Silvia sent out a summary prior to the meeting. Done

2. Progress setting up standard configurations for performance testing (sub-group)

The GYRE and ORCA025/LIM benchmarks have been set up by Tim and Miguel. The GYRE configuration has an automatic compilation option which will be added to the ORCA025 configuration.

Silvia and Cyril have performed two tests: (i) executing the GYRE sequential code on a single core of a node; and (ii) executing more than one instance of this sequential code concurrently on the same node (each one on a different core). They have compared the execution times of the two tests. The results show a gap between the two tests (~20% on the IB node, ~25% on the SB node) and reinforce the idea that we should analyse the performance of the code by executing several instances of the sequential code, which avoids the parallelisation overhead while still simulating the concurrency among the parallel processes.
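
A minimal sketch of how the comparison between tests (i) and (ii) could be scripted is given here for illustration. It is not the script used by the sub-group. The function names are hypothetical, the definition of the "gap" as a relative slowdown is an assumption, and pinning of each instance to its own core (which the tests above rely on) is left to the launcher and is not shown.

```python
import subprocess
import time


def time_instances(cmd, n_instances):
    """Start n_instances copies of cmd together and return the wall-clock
    time in seconds until the last one has finished.

    cmd is an argument list as accepted by subprocess.Popen. Core pinning
    is NOT handled here; it is assumed to be done by the launcher.
    """
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n_instances)]
    for proc in procs:
        proc.wait()
    return time.perf_counter() - start


def relative_gap(t_single, t_concurrent):
    """Relative slowdown of the concurrent run with respect to the single run.

    Assumed definition: (t_concurrent - t_single) / t_single, so that a
    return value of 0.20 corresponds to a gap of 20%.
    """
    if t_single <= 0.0:
        raise ValueError("t_single must be positive")
    return (t_concurrent - t_single) / t_single
```

For example, with a hypothetical executable, `relative_gap(time_instances(["./gyre_sequential"], 1), time_instances(["./gyre_sequential"], 8))` would give the gap for eight concurrent instances.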

Action: Tim to check which routines GYRE uses (to check they cover the important ones) and to consider whether a configuration with a sloping bathymetry should be used.

3. Progress on agreeing performance metrics and single core performance testing (sub-group)

A perf_regions code has been developed by members of the sub-group as a flexible replacement for the timer module calls in NEMO. This utility could be used fairly widely and is available at https://github.com/schreiberx/perf_regions

A script has been written to replace timer calls by perf_regions calls. Tim has tested it with the GYRE configuration and plans to make it work without hand-edits and for other configurations.
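
The idea behind such a replacement script can be sketched as follows. This is an illustration only, not the script that Tim tested. It assumes that the existing timer calls have the form CALL timing_start('name') and CALL timing_stop('name'). The replacement routine names my_region_start and my_region_stop are hypothetical placeholders and are not the perf_regions interface; the real names should be taken from the perf_regions repository.

```python
import re

# Matches "CALL timing_start(" or "CALL timing_stop(" in any letter case.
# The form of the existing timer calls is an assumption (see lead-in).
_TIMER_CALL = re.compile(r"\bCALL\s+timing_(start|stop)\s*\(", re.IGNORECASE)


def replace_timer_calls(source, new_prefix="my_region"):
    """Return Fortran source text with the timer calls renamed.

    new_prefix is a hypothetical placeholder: timing_start becomes
    <new_prefix>_start and timing_stop becomes <new_prefix>_stop.
    The arguments of each call are left untouched.
    """
    def _swap(match):
        return "CALL {}_{}(".format(new_prefix, match.group(1).lower())

    return _TIMER_CALL.sub(_swap, source)
```

A purely textual substitution of this kind leaves any surrounding IF tests and continuation lines as they are, which is one reason why hand-edits may still be needed.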

Members of the sub-group have access to Sandy Bridge, Ivy Bridge and Haswell processors.

Action: Martin, Tim and Dmitry to investigate access to Knights Landing (KNL) processors.

Dmitry said that we could have access to Intel’s Endeavour cluster (KNL is now public, so we would only need an NDA for access).

Action: Silvia and Dmitry to discuss access to KNL processors.

There is an issue with over-counting of floating point operations (by up to a factor of 5) in some circumstances on some Intel processors. The sub-group will discuss this further. Andy suggested that a Fortran parser might be used to calculate/estimate the number of floating point operations. We expect to know within a month whether this issue is going to be a serious obstacle.
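
To illustrate the kind of static estimate Andy has in mind, a deliberately crude sketch follows. It is not a Fortran parser: it counts the binary arithmetic operators on the right-hand side of a single-line assignment using regular expressions. The function name is hypothetical, and the counting rules (each of +, -, * and / counts as one operation, ** counts as one operation, integer arithmetic inside array indices is not distinguished from floating point arithmetic) are assumptions made for this example.

```python
import re

# Numeric literals such as 2, 0.5, 1.e-3 or 2.d0, not preceded by an
# identifier character. They are masked so that the sign inside an
# exponent (e.g. the "-" in 1.0e-3) is not counted as an operation.
_LITERAL = re.compile(r"(?<![\w.])(?:\d+\.?\d*|\.\d+)(?:[eEdD][+-]?\d+)?")

# A binary operator is one that directly follows an operand, i.e. an
# identifier character or a closing bracket. "**" is tried first so
# that a power counts once rather than as two multiplications.
_BINARY_OP = re.compile(r"(?<=[\w)])(?:\*\*|[-+*/])")


def count_arithmetic_ops(statement):
    """Crude count of binary arithmetic operators in one assignment.

    Only the right-hand side (text after the first "=") is examined.
    Unary signs are ignored. Index arithmetic such as a(ji+1) on the
    right-hand side IS counted, so this over-estimates in such cases.
    """
    text = "".join(statement.split())      # drop all whitespace
    _, sep, rhs = text.partition("=")
    expr = rhs if sep else text
    expr = _LITERAL.sub("0", expr)         # mask numeric literals
    return len(_BINARY_OP.findall(expr))
```

For the statement zwx(ji,jj) = 0.5_wp * ( a + b ) - c / d this gives a count of 4. A real parser would be needed to handle continuation lines, intrinsic functions and loop trip counts.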

Lucien suggested that changing the frequency of processors can help to distinguish memory bound and computation bound codes.
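
Lucien's suggestion can be illustrated with a small calculation. The sketch below assumes two timings of the same run at two clock frequencies and the simple model that a purely computation bound code speeds up in proportion to the frequency, while a purely memory bound code does not speed up at all. The function name and the resulting 0 to 1 score are assumptions made for this example, not an agreed metric of the group.

```python
def frequency_sensitivity(t_low, t_high, f_low, f_high):
    """Score how strongly the run time follows the clock frequency.

    t_low, t_high: run times measured at frequencies f_low < f_high.
    Returns about 1.0 if the run time scales as 1/frequency (computation
    bound under the assumed model) and about 0.0 if the run time does
    not change with frequency (memory bound under the assumed model).
    """
    if f_high <= f_low:
        raise ValueError("f_high must be greater than f_low")
    if t_low <= 0.0 or t_high <= 0.0:
        raise ValueError("run times must be positive")
    ideal_speedup = f_high / f_low       # speed-up if fully compute bound
    measured_speedup = t_low / t_high    # speed-up actually observed
    return (measured_speedup - 1.0) / (ideal_speedup - 1.0)
```

Intermediate values indicate a mixture of the two regimes; measurement noise can push the score slightly outside the range 0 to 1.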

4. Memory leaks at NEMO version 3.6 (Silvia, Tim)

The NEMO Systems Team is seeking expert advice on the memory leaks. The problem appears to be intermittent, to depend on the forcing frequency and to affect more than one compiler. Martin suggested an approach for generating more information about the leaks.

Action: Claire to arrange a discussion with Martin and the relevant NEMO Systems Team members.

5. Work/plans on OpenMP directives (Silvia + others?)

Silvia has implemented OpenMP directives in a development branch for the GYRE and ORCA2/LIM configurations and added some optimisations so that hybrid OpenMP and MPI is faster than pure MPI on Sandy Bridge processors. Mondher is testing the directives. The plan is to merge the OpenMP directives into the stable release at the next merge party.

6. Status of proposals for funding

The Copper proposal will not be funded.

ESiWACE is looking to assess the performance of a 1 km global NEMO demonstrator but does not have resources earmarked to set one up.

IS-ENES2 now includes a small item of joint work between STFC and CMCC on PSyKAl.

Jean-Marc Molines did a lot of work assessing and improving the computational performance of the Grenoble 1NM North Atlantic NEMO configuration. IO bottlenecks were a particular issue.

Action: Julien to circulate the reports on performance testing of the Grenoble 1NM North Atlantic NEMO configuration.

7. NEMO development strategy

A NEMO development strategy was published in 2014. It established points of consensus and points where more discussion is needed, and has guided the work of the NEMO Systems Team. The 2017 version of the strategy will be discussed at the NEMO developers’ meetings (a) at the end of October, (b) in mid Dec – early Jan and (c) in March.

Questions considered in the HPC chapter should include: suitable targets and expectations for IO performance; the impact of AGRIF on HPC performance.

The members of this group should review the HPC chapter and agree where there is consensus. The sub-group will lead the writing.

8. AOB

The next meeting should review progress on the message passing actions agreed in the early meetings of the group.

The sub-group should consider which actions on performance assessment could appear in the 2017 NEMO Systems Team plan.

9. Date for next meeting

Action: Mike to call the next meeting before the NEMO developers’ committee in mid Dec – early Jan. Mike will do a Doodle poll once that date is set.