New URL for NEMO forge!   http://forge.nemo-ocean.eu

Since March 2022, along with the NEMO 4.2 release, code development has moved to a self-hosted GitLab.
This forge is now archived and remains online for historical reference.
2020WP/KERNEL-06_techene_better_e3_management (diff) – NEMO

Changes between Version 11 and Version 12 of 2020WP/KERNEL-06_techene_better_e3_management


Timestamp:
2020-07-20T13:14:14+02:00
Author:
techene
  • 2020WP/KERNEL-06_techene_better_e3_management

    v11 v12  
    1 = Name and subject of the action 
     1= Star coordinate faster implementation 
     2 
     3Change how the vertical scale factors are handled in NEMO in order to reduce parallel processing time for the z-star coordinate. This modification is activated through a cpp key: key_qco. 
    24 
    35Last edition: '''[[Wikinfo(changed_ts)]]''' by '''[[Wikinfo(changed_by)]]''' 
     
    1618||=Dependencies || If any                                                || 
    1719||=Branch       || source:/NEMO/branches/2020/dev_r12377_KERNEL-06_techene_e3 || 
     20||=Branch       || source:/NEMO/branches/2020/dev_r13324_KERNEL-06_techene_e3_version2 || 
    1821||=Previewer(s) || Madec, Chanut, Masson                                 || 
    1922||=Reviewer(s)  || Madec                                                 || 
     
    2225=== Description 
    2326 
     27The current version of NEMO stores the scale factors e3[P] in memory: the e3[P] computation at P-points interpolates the e3t 4D table at P = {u-, v-, w-, f-, uw-, vw-} points, which means 7 4D tables stored in memory. The idea is to compute the scale factors e3[P](ji,jj,jk,Ktl) on the fly from r3[P] = ssh[P] / h_0 and e3[P]_0 instead of storing them. This should improve run time in parallel runs. Indeed, processors have at least two memory levels: fast memory and slow RAM. In parallel runs, the processing time is no longer limited by computation time but by memory access time, which is why we try to minimise memory buffering. 
     28The Asselin filter is handled by recomputing r3[P] directly from the filtered ssh. 
     29z-tilde is handled through e3[P]_0, which may vary with time in the z-tilde case. 
    2430 
    25 The current e3[P] at P-point computation uses interpolation of the r3t 4D table at P = {u-, v-, w-, f-, uw-, vw-} points. This means 7 4D tables stored in memory.  
    26 The proposed optimisation consists in computing e3[P](ji,jj,jk,Ktl) on the fly using the r3[P] = ssh[P] / h_0 and the e3[P]_0. r3[P] is a 2D table, then this means only 4 2D tables stored in memory. 
    27 z-tilde management is done through e3[P]_0, which may vary with time in the z-tilde case. 
    28 Asselin filter management is done recomputing r3[P] directly with the filtered ssh. 
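The on-the-fly computation described above can be sketched as follows. This is a minimal pure-Python illustration, not NEMO code: the function names `r3_from_ssh` and `e3_on_the_fly`, the nested-list layout and the sample values are all hypothetical, and the model itself works on Fortran arrays at each type of grid point.

```python
def r3_from_ssh(ssh, h_0):
    """Return the 2D ratio r3 = ssh / h_0 for every water column."""
    return [[s / h for s, h in zip(ssh_row, h_row)]
            for ssh_row, h_row in zip(ssh, h_0)]


def e3_on_the_fly(e3_0, r3, ji, jj, jk):
    """Rebuild one scale factor from its reference value and the 2D r3."""
    return e3_0[jk][jj][ji] * (1.0 + r3[jj][ji])


# Hypothetical 2 x 2 domain with 3 levels of 10 m each (h_0 = 30 m).
h_0 = [[30.0, 30.0], [30.0, 30.0]]
ssh = [[0.3, 0.0], [-0.3, 0.6]]
e3_0 = [[[10.0, 10.0], [10.0, 10.0]] for _ in range(3)]

r3 = r3_from_ssh(ssh, h_0)  # only this 2D table changes with time

# Summing the rebuilt e3 over a column recovers h_0 + ssh.
column = sum(e3_on_the_fly(e3_0, r3, 0, 0, jk) for jk in range(3))
```

Only `r3` has to be refreshed when ssh changes; `e3_0` stays fixed in the z-star case, so no time-varying 3D thickness array needs to be kept.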
    2931 
    3032 
     
    3840}}} 
    3941 
    40 Eventually, all the dom_vvl_interpol calls are removed; each time e3 is used, a substitute replaces e3 by e3_0 (1 + ssh / h_0). For backward compatibility, a cpp key selects between the new version and the old version. We will duplicate modules such as step and domvvl into stepLF and domQE (QE stands for Quasi Eulerian) and create a substitute module.  
     42'''KERNEL-06's version 1 implementation : /NEMO/branches/2020/dev_r12377_KERNEL-06_techene_e3'''  
    4143 
    42 Integrated in mid merge trunk.  
    4344 
    44 List the Fortran modules and subroutines to be created. 
    45 substitute.F90 
     45NEMO at revision 12377 computes the scale factor at T-points with a leap-frog integration or a filter, and then the scale factors at U-, V-, W-, UW-, VW- and F-points by interpolation from T-points through the domvvl.F90 module: 
     46- at initialization or restart, at all points 
     47- at time N+1, after the sea surface time splitting integration and before the momentum integration, at T-U-W-points 
     48- at time N, after the sea surface Asselin filtering, at T-U-W-points 
     49- at the end of the time step, after the index switch, F- and W-UW-VW-points are updated accordingly 
     50Because NEMO needs to take continuity issues into account, these modifications are implemented under a cpp key, key_qco, for "quasi eulerian coordinate". When key_qco is not activated, NEMO should be exactly the same as the trunk.  
     51 
     52NEMO intermediate version 1 computes the scale factors from the sea surface interpolation (2D field) instead, but the whole structure of the code remains. Note that to validate "NEMO intermediate version 1" we change the code line by line and compare the results of the GYRE configuration with TOP de-activated. Differences in the results appear when the W-point scale factor interpolation from the T-point scale factor is replaced by the sea surface scaling, since the bottom level is not treated in the same way. Indeed, for the GYRE configuration e3w_0 are not the half sum of e3u_0, so the way it is implemented in the reference version is not convenient...  
     53[Step 1] 
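The W-point discrepancy noted above can be illustrated with a small numerical sketch. It assumes, for illustration only, that the reference approach adds a half-sum of the neighbouring T-point thickness anomalies to e3w_0 (the actual domvvl.F90 interpolation may differ, in particular near the bottom), and compares it with the sea surface scaling e3w_0 (1 + r3). All names and values below are hypothetical.

```python
def e3w_interpolated(e3w_0, e3t_0_above, e3t_0_below, r3):
    """Assumed reference approach: e3w_0 plus the half-sum of T-point anomalies."""
    anomaly_above = e3t_0_above * r3  # e3t - e3t_0 in the upper T-cell
    anomaly_below = e3t_0_below * r3  # e3t - e3t_0 in the lower T-cell
    return e3w_0 + 0.5 * (anomaly_above + anomaly_below)


def e3w_scaled(e3w_0, r3):
    """Sea surface scaling: e3w_0 * (1 + r3)."""
    return e3w_0 * (1.0 + r3)


r3 = 0.01  # ssh / h_0 for this column
e3t_above, e3t_below = 10.0, 20.0

# When e3w_0 is the half-sum of the T-point reference values, both agree.
consistent = 0.5 * (e3t_above + e3t_below)
gap_consistent = (e3w_interpolated(consistent, e3t_above, e3t_below, r3)
                  - e3w_scaled(consistent, r3))

# When it is not (14 m instead of 15 m here), the two approaches differ.
gap_inconsistent = (e3w_interpolated(14.0, e3t_above, e3t_below, r3)
                    - e3w_scaled(14.0, r3))
```

Under these assumptions the two formulas coincide only when the reference W-point thickness is consistent with its T-point neighbours, which is one way to read the differences reported for GYRE.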
     54 
     55NEMO intermediate version 2 computes the scale factors from the sea surface interpolation (2D field) instead. The initialisation computes the ratios of sea surface height to h_0, called r3 coefficients (which are 2D). These r3 coefficients are updated after each sea surface modification (after time splitting and Asselin filtering) and interpolated at U-V-F-points using new routines similar to the domvvl routines. An extra substitute routine replaces each e3 by its expression (e3P_0 ( 1 + r3P ) * maskP). Sea surface filtering is moved before the Asselin filtering of velocity (u,v) and tracers. This version 2 should give exactly the same results as version 1, and it does! 
     56[Steps 2 & 3] 
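The r3 update and the substitution used in version 2 can be sketched on a single grid row. The plain two-point average used to bring ssh to U-points is an assumption made for illustration (the actual routines may, for instance, weight by cell area), and the names `r3_at_u` and `e3_substitute` are hypothetical.

```python
def r3_at_u(ssh_t, h_0_u):
    """r3 at U-points on one grid row: averaged T-point ssh over h_0 at U."""
    return [0.5 * (ssh_t[ji] + ssh_t[ji + 1]) / h_0_u[ji]
            for ji in range(len(ssh_t) - 1)]


def e3_substitute(e3_0, r3, mask):
    """The substitution quoted above: e3P_0 * (1 + r3P) * maskP."""
    return e3_0 * (1.0 + r3) * mask


ssh_t = [0.2, 0.4, 0.0]   # ssh at three T-points
h_0_u = [20.0, 20.0]      # reference water depth at the two U-points
umask = [1.0, 0.0]        # the second U-point is land

# r3 is recomputed after each ssh modification (time splitting, filtering).
r3u = r3_at_u(ssh_t, h_0_u)
e3u = [e3_substitute(10.0, r, m) for r, m in zip(r3u, umask)]
```

Because the expression is evaluated wherever e3 appears, nothing but the 2D `r3u` has to be refreshed when ssh is filtered.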
     57 
     58NEMO intermediate version 3 cleans up the code by adding the substitution and removing the e3 computation throughout the OCE code. It also takes care of line lengths, which should be shorter than 136 characters; some are missing...  
     59[Steps 4 & 5] 
     60 
     61In order to take into account the new index/loop management, NEMO intermediate version 4 consists in merging the results with trunk 12698; the resulting revision is 12724. Note that in this new trunk revision Jerome changed the way the Asselin filter is handled (traatf and dynatf), so intermediate version 4 needs to adapt accordingly. 
     62[Step 6] 
     63 
     64NEMO intermediate version 5 implements a clean way to deal with key_qco and also removes gde* and h* from memory. It removes e3 from the whole code; to deal with TOP, one has to play with the pointer to the sea surface height and change where it is computed in step.  
     65[Step 7] 
     66 
     67Run SETTE and deliver the version for the mid-merge party! Some silly memory allocation bugs were found, along with a not-so-silly bug in the implicit mode for the SPITZ12 configuration.  
     68[Steps 8, 9 & 11] 
     69 
     70 
     71'''KERNEL-06's version 2 implementation : /NEMO/branches/2020/dev_r13328_KERNEL-06_techene_e3_version2'''  
     72 
     73 
    4674 
    4775''...'' 
     
    6290}}} 
    6391 
     92 
     93 
     94 
     95Eventually, all the dom_vvl_interpol calls are removed; each time e3 is used, a substitute replaces e3 by e3_0 (1 + ssh / h_0). For backward compatibility, a cpp key selects between the new version and the old version. We will duplicate modules such as step and domvvl into stepLF and domQE (QE stands for Quasi Eulerian) and create a substitute module.  
     96 
     97Integrated in mid merge trunk.  
     98 
     99List the Fortran modules and subroutines to be created. 
     100substitute.F90 
     101 
    64102Step 1 : Check the error for e3t, e3w between the current way to compute e3 at T-, W-point and the proposed way to compute e3 at T-, W-point. 
    65103- prints added with no change in the results 
     
    73111- use a SUBSTITUTE wherever e3 is called 
    74112- make some changes in step and domQE to have the whole thing consistent 
    75 jpjm1 
     113 
    76114''...'' 
    77115