Changes between Version 3 and Version 4 of Working Groups/HPC/Mins_2017_03_20


Timestamp: 2017-10-03T16:51:42+02:00
Author: mocavero

  • Working Groups/HPC/Mins_2017_03_20

    v3 v4  
    17 17 '''Action''': Silvia to analyze the code modifications between the two versions.
    18 18
    19    == 2.   Investigations of single core performance (Martin, Silvia, Tim) ==
       19 == 2.   Investigations of single core performance ==
    20 20
    21 21 Martin provides an update on the activity: after the integration of the perf_regions tool within NEMO, Tim wrote a script to extract the measured metrics. We now have timing and cache performance information for each routine. Although the cache performance and the measured bandwidth do not yet match, some preliminary results could be presented at the meeting in Barcelona.
     
    24 24
    25 25
    26    == 3.   Updates at the NEMO merge party and to the NEMO trunk (Silvia) ==
       26 == 3.   Updates at the NEMO merge party and to the NEMO trunk ==
    27 27
    28 28 The hybrid parallel version has not been integrated into the trunk because some ST developers have expressed concerns about the code complexity, particularly given the limited performance gain provided by the OpenMP approach. The current OpenMP parallelization is fine-grain. Alternative parallelization approaches (e.g. coarse-grain, tiling) and their impact on code performance and readability will be tested with continuous feedback from the ST.
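To make the distinction between the approaches concrete, here is a minimal sketch in C with OpenMP directives. It is not NEMO code (NEMO is written in Fortran): the pointwise update, the function names `scale_fine_grain` and `scale_tiled`, and the `tile` parameter are all assumptions chosen for illustration. The first routine puts a directive on a single loop nest, which is the fine-grain style; the second splits the grid into blocks and hands whole blocks to threads, which is one possible form of tiling.

```c
#include <assert.h>

/* Hypothetical pointwise update on an nj x ni grid stored row by row:
   out = a * in. Names and sizes are illustrative, not taken from NEMO. */

/* Fine-grain: a single loop nest gets its own parallel loop directive. */
void scale_fine_grain(int nj, int ni, double a, const double *in, double *out)
{
    #pragma omp parallel for
    for (int j = 0; j < nj; ++j)
        for (int i = 0; i < ni; ++i)
            out[j * ni + i] = a * in[j * ni + i];
}

/* Tiling: the grid is split into tile x tile blocks and whole blocks
   are distributed among threads. Edge blocks are clipped to the grid. */
void scale_tiled(int nj, int ni, int tile, double a, const double *in, double *out)
{
    assert(tile > 0);
    #pragma omp parallel for collapse(2)
    for (int jt = 0; jt < nj; jt += tile) {
        for (int it = 0; it < ni; it += tile) {
            int jend = (jt + tile < nj) ? jt + tile : nj;
            int iend = (it + tile < ni) ? it + tile : ni;
            for (int j = jt; j < jend; ++j)
                for (int i = it; i < iend; ++i)
                    out[j * ni + i] = a * in[j * ni + i];
        }
    }
}
```

Both routines compute the same result; they differ only in how the iteration space is divided, which is where the readability/performance trade-off mentioned above shows up. Compiled without OpenMP support, the directives are ignored and the code runs serially.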
     
    36 36
    37 37
    38    == 4.   PSyclone/NEMO update (Andy) ==
       38 == 4.   PSyclone/NEMO update ==
    39 39
    40 40 Since the PSyclone approach could be rather invasive for the NEMO code, a new approach (based on the DSL concept) has been tested within the IS-ENES2 project by STFC and is available in a GitHub repository (DSL project). It is based on the development of a separate kernel for each loop over the grid points in the advection kernel implemented by CMCC, so that OpenMP, OpenACC, or cache tiling can be applied at kernel level to provide performance portability.
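The idea of separating a per-grid-point kernel from the loop that drives it can be sketched as follows. This is an illustration in C under stated assumptions, not the STFC or CMCC code: the 1D upwind flux formula, the function names `upwind_flux_kernel`, `flux_driver_serial` and `flux_driver_openmp`, and the array layout are all hypothetical. The kernel knows nothing about loops or parallelism; each driver applies a different schedule around the same kernel.

```c
/* Hypothetical kernel for one grid point: first-order upwind flux.
   It contains no loop and no parallel directive. */
static inline double upwind_flux_kernel(double u, double t_left, double t_right)
{
    return (u >= 0.0) ? u * t_left : u * t_right;
}

/* Driver with a plain serial schedule over the n - 1 cell faces. */
void flux_driver_serial(int n, const double *u, const double *t, double *flux)
{
    for (int i = 0; i < n - 1; ++i)
        flux[i] = upwind_flux_kernel(u[i], t[i], t[i + 1]);
}

/* Driver with an OpenMP schedule. Only this routine changes when the
   parallelization strategy changes; the kernel stays untouched. */
void flux_driver_openmp(int n, const double *u, const double *t, double *flux)
{
    #pragma omp parallel for
    for (int i = 0; i < n - 1; ++i)
        flux[i] = upwind_flux_kernel(u[i], t[i], t[i + 1]);
}
```

An OpenACC or cache-tiled driver could be added in the same way, leaving the science code in the kernel unchanged, which is the sense in which this structure aims at performance portability.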
     
    48 48
    49 49
    50    == 5.   Issues to be discussed in Barcelona (all) ==
       50 == 5.   Issues to be discussed in Barcelona ==
    51 51
    52 52 There are three talks scheduled for the HPC session of the meeting in Barcelona: a first talk on the HPC-WG activities, focusing on the main results and the discussion of the readability/performance trade-off; a second talk on BGC HPC issues; and a third talk on single-precision work, proposed by BSC.