Changes between Version 1 and Version 2 of WorkingGroups/HPC/Mins_sub_2018_06_21
- Timestamp: 2018-06-25T13:29:42+02:00 (5 years ago)
Legend:
- Unmodified
- Added
- Removed
- Modified
WorkingGroups/HPC/Mins_sub_2018_06_21
v1 6 / v2 6: == 1. Mixed-precision (BSC) ==
v1 7 / v2 7: (blank line)
v1 8 (old): Oriol presented the methodology implemented to test the impact of numerical precision in NEMO. An automatic tool to implement the Reduced Precision Emulator to a computational model has been presented, as well as the tests on groups of variables in order to identify which variables can use less precision without affecting the numerical accuracy. Tests have been performed on NEMO 3.6 GYRE1 and showed that only ~10% of the variables need double precision.
v2 8 (new): Oriol presented the methodology implemented to test the impact of numerical precision in NEMO. An automatic tool to implement the Reduced Precision Emulator to a computational model has been presented, as well as the tests on groups of variables in order to identify which variables can use less precision without affecting the numerical accuracy. Tests have been performed on NEMO 3.6 GYRE1 and showed that only ~10% of the variables need double precision (see attached slides)
v1 9 / v2 9: (blank line)
v1 10 / v2 10: Open questions and future plan: need to test the approach on high-resolution (eddy resolving scale) and to evaluate the error. Which is the tolerance to safely use reduced precision?
… …
v1 13 / v2 13: == 2. NEMO-DSL (STFC) ==
v1 14 / v2 14: (blank line)
v1 15 (old): Andy presented the light-DSL approach designed to apply DSL in NEMO. An interface to the PSyclone tool has been implemented in order to apply PSyclone transformations without impacting on the NEMO coding structure rules. The approach would allow to introduce the OpenACC parallelisation through a workflow based on two steps: processing the NEMO code to create an internal representation compliant to the PSyclone tool, manipulating the intermediate code to perform PSyclone transformations.
v2 15 (new): Andy presented the light-DSL approach designed to apply DSL in NEMO. An interface to the PSyclone tool has been implemented in order to apply PSyclone transformations without impacting on the NEMO coding structure rules. The approach would allow to introduce the OpenACC parallelisation through a workflow based on two steps: processing the NEMO code to create an internal representation compliant to the PSyclone tool, manipulating the intermediate code to perform PSyclone transformations (see attached slides)
v1 16 / v2 16: (blank line)
v1 17 / v2 17: Open questions and future plan: considering the integration of optimisations for CPUs
… …
v1 20 / v2 20: == 3. Benchmark setup (CERFACS-CNRS) ==
v1 21 / v2 21: (blank line)
v1 22 (old): Eric presented the NEMO benchmark implemented to identify the main bottlenecks to the scalability. Some results on the BENCH-1 on Intel Broadwell system at Meteo-France have been shown comparing the reference simulation (without MPI collectives) with the pseudo double size halo and no-communication experiments. The trend of the time spent waiting for communications and for computation load imbalance has been analysed.
v2 22 (new): Eric presented the NEMO benchmark implemented to identify the main bottlenecks to the scalability. Some results on the BENCH-1 on Intel Broadwell system at Meteo-France have been shown comparing the reference simulation (without MPI collectives) with the pseudo double size halo and no-communication experiments. The trend of the time spent waiting for communications and for computation load imbalance has been analysed (see attached slides)
v1 23 / v2 23: (blank line)
v1 24 / v2 24: Open questions and future plan: extension of the tests to high-resolution configurations