New URL for NEMO forge!   http://forge.nemo-ocean.eu

Since March 2022, along with the NEMO 4.2 release, code development has moved to a self-hosted GitLab.
This forge is now archived and remains online for historical reference.
2021WP/HPC-02_Daley_Tiling (diff) – NEMO

Changes between Version 17 and Version 18 of 2021WP/HPC-02_Daley_Tiling


Timestamp:
2021-06-04T16:25:49+02:00
Author:
hadcv
  • 2021WP/HPC-02_Daley_Tiling

    v17 v18  
    3535[http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/branches/2021/dev_r14393_HPC-03_Mele_Comm_Cleanup?rev=14776 dev_r14393_HPC-03_Mele_Comm_Cleanup@14776] has been merged into the branch. 
    3636 * [http://forge.ipsl.jussieu.fr/nemo/changeset?sfp_email=&sfph_mail=&reponame=&new=14805%40NEMO%2Fbranches%2F2021%2Fdev_r14273_HPC-02_Daley_Tiling/src/OCE&old=14776%40NEMO%2Fbranches%2F2021%2Fdev_r14393_HPC-03_Mele_Comm_Cleanup%2Fsrc%2FOCE Difference@14805 vs dev_r14393_HPC-03_Mele_Comm_Cleanup] 
     37 
     38The following bug fixes were applied to the trunk post-merge: 
     39 * [14840]- Add `ln_tile` to ORCA2_ICE_PISCES/namelist_cfg  
     40 * [14845]- Fix diagnostics preventing ORCA2_ICE_PISCES running with `nn_hls = 2` and tiling 
     41 * [14857]- Fixes in MY_SRC for `nn_hls = 2`/tiling and traadv_fct.F90 for `nn_hls = 1` 
     42 * [14882]- Fix diagnostics preventing ORCA2_ICE_PISCES running with `nn_hls = 2` and tiling (pieces missing from r14845) 
     43 * [14903]- Fix bug with A1Di/A1Dj/A2D macros, update standard tiling namelists 
    3744 
    3845==== Changes to tiling framework 
     
    296303 * `nn_hls = 2` (`USING_EXTRA_HALO="yes"`) and `ln_tile = .true.` (using default 10i x 10j tile sizes) 
    297304 
    298 and are compared with results from the [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14820 trunk@14820] with `nn_hls = 1`. 
    299  
    300 The Intel compiler (ifort 18.0.5 20180823) is used with XIOS ([http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/trunk?rev=2131 r2131 of the trunk]) in detached mode. 
     305and are compared with results from the [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14820 trunk@14820] with `nn_hls = 1` and the same settings for `NOT_USING_QCO`/`USING_ICEBERGS`. 
     306 
     307The Intel compiler (ifort 18.0.5 20180823, `XC40_METO_IFORT` arch file) is used with XIOS ([http://forge.ipsl.jussieu.fr/ioserver/browser/XIOS/trunk?rev=2131 r2131 of the trunk]) in detached mode. 
    301308 
    302309All tests (including SWG) pass, but it should be noted that the `USING_EXTRA_HALO` option is only used by ORCA2_ICE_PISCES.  
     
    339346=== SETTE (post merge) 
    340347 
     348The SETTE tests have been repeated with the [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14922 trunk@14922] in order to include bug fixes that allow all SETTE tests to be run with `nn_hls = 2` and tiling. 
     349 
     350The tests are the same as detailed above except: 
     351 
     352 * The [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14922 trunk@14922] is used (but still compared with results from the [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14820 trunk@14820]) 
     353 * [http://forge.ipsl.jussieu.fr/nemo/browser/utils/CI/sette?rev=14844 SETTE@14844] is used 
     354 * Additional tests with `key_loop_fusion` have been performed 
     355 * `nn_hls = 2` is set directly in namelist_ref, instead of via `USING_EXTRA_HALO`, in order to run all SETTE tests with the extended haloes (and tiling) 
     356 * The default tile size in namelist_ref is 99999i x 10j (to ensure there is always only 1 tile in i) 
     357 * Icebergs are not activated 
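
The effect of the 99999i x 10j default can be illustrated with a small sketch. This is not NEMO code: it assumes that a tile length larger than the domain is clamped to the domain length and that the tile count is a ceiling division. The function name `tile_counts` and the 180 x 148 domain size are hypothetical, chosen only for illustration.

```python
import math


def tile_counts(ni, nj, tile_i=99999, tile_j=10):
    """Return (tiles in i, tiles in j) for an ni x nj interior domain.

    Hypothetical helper: assumes a tile is never longer than the domain.
    """
    # Clamp the requested tile lengths to the domain size.
    len_i = min(tile_i, ni)
    len_j = min(tile_j, nj)
    # Ceiling division: a partial tile at the edge still counts as a tile.
    return math.ceil(ni / len_i), math.ceil(nj / len_j)


# Hypothetical 180 x 148 domain with the 99999i x 10j default.
print(tile_counts(180, 148))
# Same domain with 10i x 10j tiles, for comparison.
print(tile_counts(180, 148, tile_i=10, tile_j=10))
```

Under these assumptions, any i-length at least as large as the domain gives a single tile in i, so the domain is split in j only.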
     358 
     359All SETTE tests pass and give the same results as the [http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/trunk?rev=14820 trunk@14820], except AGRIF_DEMO which differs after 17 timesteps for all `nn_hls = 2` tests. 
     360This is thought to be because one of the AGRIF domains in this configuration is not large enough for `nn_hls = 2`. 
     361 
     362==== Regular checks 
     363 
     364All checks are the same as before, but the run time and memory changes are significant in some cases.  
     365Only increases in time or memory larger than 10% that are present in both REPRO experiments of a configuration are reported here: 
     366 
     367  * QCO, `nn_hls == 1` 
     368    * No significant changes 
     369  * QCO, `nn_hls == 2` 
     370    * GYRE_PISCES: time + 13-18%, memory + 13-18% 
     371  * QCO, `nn_hls == 2` and `ln_tile = .true.` 
     372    * AMM12: memory + 18% 
     373    * WED025: memory + 17% 
     374  * QCO, loop fusion and `nn_hls == 2` 
     375    * AMM12: time + 20% 
     376  * QCO, loop fusion, `nn_hls == 2` and `ln_tile = .true.` 
     377    * AGRIF_DEMO: time + 11-15% 
     378    * AMM12: memory + 17-20% 
     379    * WED025: memory + 19% 
     380 
     381  * non-QCO, `nn_hls == 1` 
     382    * No significant changes 
     383  * non-QCO, `nn_hls == 2` 
     384    * No significant changes 
     385  * non-QCO, `nn_hls == 2` and `ln_tile = .true.` 
     386    * AGRIF_DEMO: memory + 13% 
     387    * AMM12: memory + 18-20% 
     388    * GYRE_PISCES: time + 11-24% 
     389    * ORCA2_ICE_OBS: memory + 12-16% 
     390    * WED025: memory + 15-16% 
     391  * non-QCO, loop fusion and `nn_hls == 2` 
     392    * ORCA2_ICE_OBS: time + 11-17% 
     393  * non-QCO, loop fusion, `nn_hls == 2` and `ln_tile = .true.` 
     394    * AGRIF_DEMO: memory + 11-12% 
     395    * AMM12: memory + 21-23% 
     396    * WED025: memory + 17-19% 
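
The reporting criterion used for the list above (an increase larger than 10% in both REPRO experiments) can be sketched as follows. This is a minimal illustration, not the SETTE implementation: the function name `significant_increase`, the assumption that both experiments are compared against a single reference value, and the run times shown are all hypothetical.

```python
def significant_increase(baseline, experiments, threshold=0.10):
    """Return (min, max) relative increase if every experiment exceeds
    the threshold, otherwise None.

    Hypothetical helper: `baseline` is the reference value and
    `experiments` holds the values from the REPRO experiments.
    """
    increases = [(value - baseline) / baseline for value in experiments]
    # Report only if the increase is present in all experiments.
    if all(inc > threshold for inc in increases):
        return min(increases), max(increases)
    return None


# Made-up run times in seconds: one reference run, two REPRO runs.
print(significant_increase(100.0, [113.0, 118.0]))  # both exceed 10%
print(significant_increase(100.0, [113.0, 104.0]))  # only one exceeds 10%
```

Requiring the increase in both experiments filters out one-off fluctuations, which is why a range such as "13-18%" appears for some entries.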
     397 
     398The time increases do not seem consistent enough to indicate a systematic issue.  
     399However, there is evidence to suggest that tiling increases the memory cost of AGRIF_DEMO (11-13%), AMM12 (17-23%) and WED025 (15-19%). 
     400This is partly due to the use of `nn_hls = 2`, which increases the domain size, but in AMM12 & WED025 this is only responsible for up to 7% of the increased memory cost. 
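
To see why a wider halo matters less for larger subdomains, the following rough sketch estimates the growth of a single 2D array when the halo width goes from 1 to 2. It assumes that a local array spans the interior plus a halo of width `nn_hls` on each side. The function name `halo_memory_increase` and the subdomain sizes are hypothetical, and real memory use depends on many other factors.

```python
def halo_memory_increase(ni, nj, hls_old=1, hls_new=2):
    """Relative growth in the number of points of a 2D array when the
    halo width changes from hls_old to hls_new.

    Hypothetical model: array size = (ni + 2*hls) * (nj + 2*hls).
    """
    old = (ni + 2 * hls_old) * (nj + 2 * hls_old)
    new = (ni + 2 * hls_new) * (nj + 2 * hls_new)
    return (new - old) / old


# Made-up square interior subdomain sizes.
for n in (25, 50, 100):
    print(n, f"{halo_memory_increase(n, n):.1%}")
```

In this model the relative cost of the extra halo row shrinks as the subdomain grows, so the halo contribution depends strongly on the domain decomposition used.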
    341401 
    342402=== Development testing