Star coordinate faster implementation

Change the way vertical scale factors are handled in NEMO in order to save parallel processing time with the z-star coordinate. This modification is activated through a cpp key: key_qco.

Last edition: 11/02/20 13:25:16 by techene

The PI is responsible for closely following the progress of the action, and especially for contacting the NEMO project manager if the delay on preview (or review) is longer than the 2 weeks expected.

  1. Summary
  2. Preview
  3. Review

Summary

Action: optimisation of the vertical scale factor e3 computation
PI(S): Techene, Madec
Digest: compute e3 on the fly from e3_0(:,:,:,Ktl) * ( 1 + ssh(:,:,Ktl) / h_0(:,:) * mask(:,:,:) ) instead of storing e3t/u/v/w/f…
Dependencies: If any
Branch: source:/NEMO/branches/2020/dev_r12377_KERNEL-06_techene_e3
Branch: source:/NEMO/branches/2020/dev_r13327_KERNEL-06_2_techene_e3
Previewer(s): Madec, Chanut, Masson
Reviewer(s): Madec
Ticket: #2385 #2527 #2555 #2485

Description

In the z* vertical configuration, NEMO r12377 uses memory to store and update the vertical scale factors e3[P], where P = {t-, u-, v-, w-, f-, uw-, vw-} points, at the "before", "now" and "after" time steps. This means memory storage for 6 x 4D + 1 x 3D tables, plus memory access and CPU time for updating the 3D scale factors. The code modification consists in computing the scale factor e3[P] on the fly, each time it is needed, with the formula e3[P](Kt) = e3[P]_0 * (1 + r3[P](Kt) * mask[P]), where r3[P](Kt) (= ssh[P](Kt) / h_0) is a 2D table. At P = {u-, v-, f-} points, r3[P] is computed from the updated ssh, following the ssh updates along a time step. This change is only applied when key_qco is activated.
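The formula above can be illustrated with a minimal sketch. It is written in plain Python rather than NEMO Fortran, and it is only an illustration of the arithmetic, not the branch's implementation. The function name e3_on_the_fly, the nested-list layout [k][j][i] and the numerical values are all assumptions made for this example.

```python
def e3_on_the_fly(e3_0, r3, mask):
    """Return e3[k][j][i] = e3_0[k][j][i] * (1 + r3[j][i] * mask[k][j][i]).

    e3_0 and mask are 3D nested lists indexed [k][j][i];
    r3 is a 2D nested list indexed [j][i].
    """
    return [
        [
            [e3_0[k][j][i] * (1.0 + r3[j][i] * mask[k][j][i])
             for i in range(len(r3[j]))]
            for j in range(len(r3))
        ]
        for k in range(len(e3_0))
    ]


# Hypothetical 1 x 1 horizontal domain with three wet levels (values are made up).
e3_0 = [[[10.0]], [[20.0]], [[70.0]]]   # reference scale factors, sum = h_0
mask = [[[1.0]], [[1.0]], [[1.0]]]      # 1 on wet points, 0 on land
h_0 = [[100.0]]                         # reference water height
ssh = [[0.5]]                           # sea surface height

# The only time-varying quantity is the 2D ratio r3 = ssh / h_0.
r3 = [[ssh[j][i] / h_0[j][i] for i in range(1)] for j in range(1)]

e3 = e3_on_the_fly(e3_0, r3, mask)
total_height = sum(e3[k][0][0] for k in range(3))
```

In this fully wet column, the stretched scale factors sum to h_0 + ssh, which is the property the z* coordinate is expected to preserve.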

Because we reduce the number of tables accessed in memory, we have a better chance of staying in fast RAM. Because we no longer compute 3D interpolations but 2D ones instead, the algorithm complexity is smaller and less CPU time is used. Together, these make the computation about 10% faster whatever the domain size (tested from 10x10 to 100x100 points per computation node) when communications are cut.

This branch also comes with improvements from KERNEL-07, such as the symmetric diffusion tensor (#2527) implemented in dynldf_lap_blp and controlled by the nn_dynldf_typ namelist parameter. It contains a dynvor correction for using ln_dynvor_msk and a proper fix for using ENS and ENE with partial steps, described in #2555. Finally, a new shallow water test case has been added (#2485).

Implementation

Describe flow chart of the changes in the code.
List the Fortran modules and subroutines to be created/edited/deleted.
Detailed list of new variables to be defined (including namelists),
give for each the chosen name and description wrt coding rules.

KERNEL-06's version 1 [pre mid-merge 2020] : /NEMO/branches/2020/dev_r12377_KERNEL-06_techene_e3

Changing scale factors affects a large number of routines in OCE.

The strategy to limit the code transformation:

  • New variables are in dom_oce.F90, and then in dommsk.F90 and domain.F90
  • We use a function that replaces each e3. by its full expression (e3.0*(1+r3.*mask.)) when key_qco is activated; for that we add a specific include of a substitute file (domzgr_substitute.h90)
  • Where we used to update the scale factors with respect to ssh, we now update the ratios r3. = ssh./h.0 (domqco.F90)
  • We also changed the Robert-Asselin filtering routines, with optimisation (traatf_qco.F90 and dynatf_qco.F90) and a proper specific update of the filtered ratios r3.f instead of e3.

This strategy has also been applied to other variables varying with ssh./h.0 such as h. or gde.
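As a rough sketch of the "update the ratio, not the scale factor" idea, the plain-Python example below updates 2D ratios from ssh and derives a water height from them. It is not the domqco.F90 code. The helper names (ratio, ssh_t_to_u, water_height), the unweighted two-point average used to bring ssh to u-points, and all numerical values are assumptions made for illustration; the actual interpolation in NEMO may well be weighted differently.

```python
def ratio(ssh, h_0):
    """2D ratio r3 = ssh / h_0, set to 0 where h_0 is 0 (land)."""
    return [[s / h if h > 0.0 else 0.0 for s, h in zip(srow, hrow)]
            for srow, hrow in zip(ssh, h_0)]


def ssh_t_to_u(ssh_t):
    """Assumed unweighted average of ssh between neighbouring t-points along i."""
    return [[0.5 * (row[i] + row[i + 1]) for i in range(len(row) - 1)]
            for row in ssh_t]


def water_height(h_0, r3):
    """Water height h = h_0 * (1 + r3), i.e. h_0 + ssh on wet points."""
    return [[h * (1.0 + r) for h, r in zip(hrow, rrow)]
            for hrow, rrow in zip(h_0, r3)]


# Hypothetical single row of three t-points; the last one is land.
ssh_t = [[0.2, 0.4, 0.0]]
ht_0 = [[100.0, 50.0, 0.0]]
hu_0 = [[50.0, 0.0]]          # made-up reference heights at the two u-points

r3t = ratio(ssh_t, ht_0)                 # only 2D work after each ssh update
r3u = ratio(ssh_t_to_u(ssh_t), hu_0)     # 2D interpolation, no 3D array touched
ht = water_height(ht_0, r3t)             # same substitution idea applied to h.
```

The point of the sketch is that every quantity updated after a change of ssh is two-dimensional; the 3D reference arrays are never rewritten.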

When key_qco is not activated, NEMO should produce exactly the same results as the trunk, and it passes SETTE. Version r13167 has been delivered for the mid-merge party!

Modifications of the code have been implemented step by step. This highlighted some interesting features:

  • the e3. expression involves tables of distinct dimensions, so an e3.(:,:,:) call fails; it may be necessary to introduce temporary variables (same for the water height expression)
  • "e3. =" is no longer possible
  • the e3t/u/v/f modifications did not introduce any difference in the results; the e3w modification does, because the vvl and qco approaches do not take the bottom level into account in the same way
  • in GYRE, e3w_0 is not the half sum of e3u_0, so the way it is implemented in the reference version is not convenient
  • the e3. substitution makes some lines longer than 136 characters, which may be a problem for compilers (most have been checked but not all)
  • ssh filtering has been moved up in order to provide the filtered r3P to the TOP Asselin filtering
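Two of the features listed above (no more "e3. =" assignment, and the occasional need for a temporary variable) can be mimicked by analogy in plain Python. This is only an analogy: the branch relies on a preprocessor substitution, not on objects. The class name Column, the temporary name ze3t and the values are hypothetical.

```python
class Column:
    """Toy water column whose scale factors are derived, never stored."""

    def __init__(self, e3t_0, tmask, r3t=0.0):
        self.e3t_0 = e3t_0    # reference scale factors (one value per level)
        self.tmask = tmask    # 1 on wet levels, 0 below the bottom
        self.r3t = r3t        # ratio ssh / h_0, the only updated quantity

    @property
    def e3t(self):
        # Recomputed at every access: e3t_0 * (1 + r3t * tmask)
        return [e0 * (1.0 + self.r3t * m)
                for e0, m in zip(self.e3t_0, self.tmask)]


col = Column(e3t_0=[10.0, 20.0, 70.0], tmask=[1.0, 1.0, 0.0])
col.r3t = 0.01            # updating the ratio is what replaces "e3. ="

# A temporary copy is taken when a plain array is required.
ze3t = col.e3t

# Direct assignment to the derived quantity is rejected (no setter defined).
try:
    col.e3t = [1.0, 1.0, 1.0]
    assignment_ok = True
except AttributeError:
    assignment_ok = False
```

Here the masked bottom level keeps its reference thickness, while the wet levels are stretched by the factor (1 + r3t).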

KERNEL-06's version 2 [post mid-merge 2020] : dev_r13327_KERNEL-06_2_techene_e3

Changing scale factors also affects external packages such as OFF, NST, SWE… that need to be updated accordingly:

  • update routines of NST
  • revise and clean SWE and transfer improvements for dynldf_lap_blp.F90 and dynvor.F90
  • pressure gradient for ISF: TO DO!

When key_qco is activated, NEMO passes the suitable SETTE tests, i.e. custom GYRE_PISCES, ORCA2_ICE_PISCES, SPITZ12, AMM12, AGRIF_DEMO and VORTEX. ISOMIP cannot run yet.

The pseudo merge with NEMO/branches/2020/dev_r12527_Gurvan_ShallowWater consists mainly in modifying dynldf_lap_blp to add the symmetric tensor operator, cleaning dynvor, modifying diawri for F-point outputs, and adding a test case with a specific RK3 step.

Documentation updates

Using previous parts, define the main changes to be done in the NEMO literature (manuals, guide, web pages, …).

  • Need for documentation on qco changes!
  • Need for documentation on the symmetric tensor for the viscosity operator.
  • Need for documentation on e3f computation in ENE and ENS in zps? + dynspg_ts!

Preview

Since the preview step must be completed before the PI starts the coding, the previewers' answers are expected within two weeks after the PI has sent the request to the previewer(s).
Then an iterative process should take place between the PI and the previewer(s) in order to reach a consensus.

Possible bottlenecks:

  • the methodology
  • the flowchart and list of routines to be changed
  • the new list of variables wrt coding rules
  • the summary of updates in literature

Once an agreement has been reached, the preview is ended and the PI can start the development in their branch.

Review

A successful review is needed to schedule the merge of this development into the future NEMO release during next Merge Party (usually in November).

Assessments:

  • Is the proposed methodology now implemented?
  • Are the code changes in agreement with the flowchart defined at preview step?
  • Are the code changes in agreement with list of routines and variables as proposed at preview step?
    If not, are the discrepancies acceptable?
  • Is the in-line documentation accurate and sufficient?
  • Do the code changes comply with NEMO coding standards?
  • Is the development documented with sufficient details for others to understand the impact of the change?
  • Is the project literature (manual, guide, web, …) now updated or completed following the proposed summary in preview section?

Finding:

Is the review fully successful? If not, please indicate what is still missing.


Once review is successful, the development must be scheduled for merge during next Merge Party Meeting.

Last modified on 2020-11-02T13:25:16+01:00