Opened 7 months ago

Last modified 2 weeks ago

#2366 new Task

HPC-08_epico_Extra_Halo

Reported by: epico Owned by: epico
Priority: low Milestone: IMMERSE 2020
Component: DOM Version: trunk
Severity: minor Keywords: HPC, IMMERSE, 2020WP
Cc: smasson, francesca, hadcv, epico Review: pending
MP ready?: no
Progress: The activity started in 2019 and will be completed in 2020. The management of extra-halo will be completed with the correct management of input/output files

Description (last modified by epico)

The task concerns the development of extra-halo management, which aims to use different halo sizes in different kernels of the code.

Workplan action

Wikipage: wiki:2020WP/HPC-08_epico_Extra_Halo

Commit History (52)

Changeset Author Time ChangeLog
13327 acc 2020-07-20T17:33:01+02:00

Final trunk change to obtain tracer.stat consistency and independence from nn_hls. The trunk is still restartable and reproducible with SETTE (ln_icebergs = F) AND run.stat and tracer.stat files match for both nn_hls = 1 and nn_hls = 2. This final piece was found by Seb last week but was missed from the commit. #2366

13305 acc 2020-07-14T19:12:25+02:00

Trunk changes required to avoid issues with the outer halo in ORCA2_ICE_PISCES, REPRO_8_4 tests with nn_hls=2. These changes ensure that tmask and output from sbc_blk are set correctly in the outer halo. Failure to set valid values in the outer halo can generate NaNs which lead to OOB errors in the XRGB lookup table used for the TRC optics. See #2366 for details. With these changes all variants of the ORCA2_ICE_PISCES SETTE test will complete. There are still differences between the 1 and 2 halo width runs but running with: no land suppression; partial land suppression or full land suppression does not alter either set of results. Likewise setting ln_nnogather either true or false does not alter results. Differences in run.stat start after 140 timesteps and differences in tracer.stat start after 60 timesteps between the different halo width sets. Equivalent tests with ln_icebergs = F show no differences in run.stat but halo-width dependent differences in tracer.stat persist (now after 64 timesteps).

13291 smasson 2020-07-10T14:58:55+02:00

trunk: bugfix in bestpartition, see #2366

13290 smasson 2020-07-10T12:29:01+02:00

trunk: mpp_nfd_generic: fix sp/dp compilation issue, following Rachid's advice, see #2366

13286 smasson 2020-07-09T17:48:29+02:00

trunk: merge extra halos branch in trunk, see #2366

13275 smasson 2020-07-08T19:36:41+02:00

Extra_Halo: final bugfix on bestpartition when nn_hls > 1, see #2366

13269 smasson 2020-07-08T16:27:53+02:00

Extra_Halo: bugfix on bestpartition when nn_hls > 1, see #2366

13256 smasson 2020-07-07T09:11:15+02:00

Extra_Halo: bugfix in mppini introduced in [12993], see #2366

13252 smasson 2020-07-06T10:23:31+02:00

Extra_Halo: work with ln_nnogather = F, see #2366

13251 smasson 2020-07-05T16:59:00+02:00

Extra_Halo: bugfix following merge with trunk@13218, see #2366

13248 francesca 2020-07-03T20:46:53+02:00

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13237, see #2366

13247 francesca 2020-07-03T19:15:31+02:00

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13227, see #2366

13238 smasson 2020-07-03T12:04:50+02:00

Extra_Halo: cosmetic modifications, see #2366

13236 smasson 2020-07-03T10:54:32+02:00

dev_r12558_HPC-08_epico_Extra_Halo: fix merge with trunk@13218, see #2366

13235 francesca 2020-07-03T09:18:12+02:00

dev_r12558_HPC-08_epico_Extra_Halo: namelist typo, see #2366

13232 smasson 2020-07-02T18:38:03+02:00

dev_r12558_HPC-08_epico_Extra_Halo: final-finish merge with trunk@13218, see #2366

13231 smasson 2020-07-02T18:03:12+02:00

dev_r12558_HPC-08_epico_Extra_Halo: re-finish merge with trunk@13218, see #2366

13230 smasson 2020-07-02T17:50:26+02:00

dev_r12558_HPC-08_epico_Extra_Halo: finish merge with trunk@13218, see #2366

13229 francesca 2020-07-02T17:33:41+02:00

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13218, see #2366

13186 smasson 2020-07-01T09:18:17+02:00

Extra_Halo: merge with trunk@13136, see #2366

13176 smasson 2020-06-29T18:02:13+02:00

Extra_Halo: rewrite prtctl, suppress nn_print, see #2366

13174 smasson 2020-06-29T17:28:55+02:00

Extra_Halo: works if jpni = 1, allows nn_hls >2, remove island in BENCH, see #2366

13138 smasson 2020-06-22T11:13:03+02:00

Extra_Halo: minor bugfixes and cleaning, see #2366

13130 smasson 2020-06-19T08:18:11+02:00

Extra_Halo: suppress halos from outputs and coupling, see #2366

13124 smasson 2020-06-17T16:46:58+02:00

Extra_Halo: merge with trunk@13115, see #2366

13123 smasson 2020-06-17T16:24:21+02:00

Extra_Halo: deactivate longitude and latitude check in AGRIF, see #2366

13122 smasson 2020-06-17T14:29:42+02:00

r12931_sette_ticket2366: update input for SAS, see #2366

13120 smasson 2020-06-17T12:54:38+02:00

Extra_Halo: update svn:externals to use r12931_sette_ticket2366, see #2366

13119 smasson 2020-06-17T12:50:09+02:00

r12931_sette_ticket2366: update input tarfiles, see #2366

13118 smasson 2020-06-17T12:45:13+02:00

create sette branch for #2366

13065 smasson 2020-06-08T18:11:57+02:00

Extra_Halo: toward AGRIF compatibility, see #2366

13015 smasson 2020-06-03T10:50:47+02:00

Extra_Halo: merge with trunk@13012, see #2366

12993 smasson 2020-05-29T17:13:41+02:00

Extra_Halo: works when removing land subdomain, cleaning/rewriting of mpp_nfd_generic.h90, see #2366

12992 francesca 2020-05-29T16:25:02+02:00

Extra_Halo: BENCH test case with halo 2 - verified version - ticket #2366

12989 francesca 2020-05-29T11:31:16+02:00

Extra_Halo: developments for running BENCH test case with halo 2 - ticket #2366

12980 smasson 2020-05-27T15:51:41+02:00

Extra_Halo: merge with trunk@12965, see #2366

12978 smasson 2020-05-27T14:15:10+02:00

Extra_Halo: minor bugfix, see #2366

12960 smasson 2020-05-22T09:05:34+02:00

Extra_Halo: additional bugfixes and developments, see #2366

12939 smasson 2020-05-15T19:41:01+02:00

Extra_Halo: update with trunk@12933, see #2366

12866 smasson 2020-05-05T08:18:05+02:00

Extra_Halo: using input files without halos, see #2366

12815 smasson 2020-04-25T11:00:22+02:00

Extra_Halo: minor bugfix following [12807], see #2366

12810 francesca 2020-04-24T17:09:39+02:00

POINTER removal and replacing of traadv_mus.F90 file with original version - ticket #2366

12807 smasson 2020-04-23T15:14:45+02:00

Extra_Halo: input file only over inner domain + new variables names, see #2366

12760 smasson 2020-04-17T08:53:28+02:00

Extra_Halo: update do_loop_substitute for nn_hls=2, see #2366

12745 smasson 2020-04-14T08:14:07+02:00

Extra_Halo: iom cleaning and fix ICB restartability, see #2366

12739 smasson 2020-04-11T15:50:50+02:00

Extra_Halo: missing 1 file in [12738], see #2366

12738 smasson 2020-04-11T15:38:38+02:00

Extra_Halo: iom cleaning/update to work only with unknown, global or local (without halos) domains, see #2366

12719 francesca 2020-04-08T17:45:31+02:00

extra-halo management with positive arrays indices - ticket #2366

12601 francesca 2020-03-25T12:51:17+01:00

Add extra-halo support (jperio 5,6) - ticket #2366

12586 francesca 2020-03-23T13:14:40+01:00

Add extra-halo support (jperio 3,4) - ticket #2366

12560 francesca 2020-03-16T18:26:01+01:00

Rename HPC-08 branch - ticket #2366

12559 francesca 2020-03-16T15:30:19+01:00

Create HPC-08 branch - ticket #2366

Attachments (1)

DoMacro_rename.sh (2.4 KB) - added by acc 3 months ago.
Corrected script to replace DO loop macros with Italo's macro function versions.


Change History (83)

comment:1 Changed 7 months ago by epico

  • Component changed from TOP to DOM
  • Description modified (diff)
  • Keywords HPC IMMERSE 2020WP added
  • Progress modified (diff)

comment:2 Changed 5 months ago by francesca

In 12559:

Create HPC-08 branch - ticket #2366

comment:3 Changed 5 months ago by francesca

In 12560:

Rename HPC-08 branch - ticket #2366

comment:4 Changed 5 months ago by francesca

In 12586:

Add extra-halo support (jperio 3,4) - ticket #2366

comment:5 Changed 4 months ago by francesca

In 12601:

Add extra-halo support (jperio 5,6) - ticket #2366

comment:6 Changed 4 months ago by francesca

In 12719:

extra-halo management with positive arrays indices - ticket #2366

comment:7 Changed 4 months ago by smasson

sette tests for this version on X64_IRENE:

  • AGRIF does not compile in this version
  • All tests without AGRIF are working
  • tests with ORCA2 give results different from the trunk
Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12736  ( last change @ r12719 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12719 (last changed revision)

       on X64_IRENE arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  12719
WGYRE_PISCES_ST              tracer.stat restartability  passed :  12719
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  12719
WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  12719
WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  12719
WAMM12_ST                    run.stat    restartability  passed :  12719
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  12719
WAGRIF_DEMO_ST               directory                  MISSING :  12719
WSPITZ12_ST                  run.stat    restartability  passed :  12719
WISOMIP_ST                   run.stat    restartability  passed :  12719
WOVERFLOW_ST                 run.stat    restartability  passed :  12719
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  12719
WVORTEX_ST                   directory                  MISSING :  12719
WICE_AGRIF_ST                directory                  MISSING :  12719

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  12719
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  12719
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  12719
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  12719
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  12719
WAMM12_ST                    run.stat    reproducibility passed :  12719
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  12719
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  12719
WAGRIF_DEMO_ST               directory                  MISSING :  12719
WSPITZ12_ST                  run.stat    reproducibility passed :  12719
WISOMIP_ST                   run.stat    reproducibility passed :  12719
WVORTEX_ST                   directory                  MISSING :  12719
WICE_AGRIF_ST                directory                  MISSING :  12719

   !----agrif check----!
ls: cannot access /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION/WAGRIF_DEMO_NOAGRIF_ST/X64_IRENE/12719/: No such file or directory
WAGRIF_DEMO_NOAGRIF_ST      WAGRIF_DEMO_ST               incomplete test

   !----result comparison check----!

check result differences between :
VALID directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 12719
and
REFERENCE directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/trunk/NEMO_VALIDATION at rev 12615

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  9  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are identical
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are DIFFERENT (results are different after  1  time steps)
WAGRIF_DEMO_ST               VALID     directory at 12719 is MISSING
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST                   VALID     directory at 12719 is MISSING
WICE_AGRIF_ST                VALID     directory at 12719 is MISSING
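The "results are different after N time steps" lines in the report come from comparing two run.stat (or tracer.stat) files record by record. A minimal sketch of such a comparison is given below. It is not the SETTE script itself: the helper name first_difference and the sample records are hypothetical, and it assumes one record per time step.

```python
from itertools import zip_longest


def first_difference(lines_a, lines_b):
    """Return the 1-based index of the first record that differs, or None.

    Assumption: each input is a sequence of text lines holding one
    record per time step. Whitespace differences are ignored.
    """
    for step, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        # zip_longest pads the shorter input with None, so a truncated
        # run is reported as a difference instead of being ignored.
        if a is None or b is None or a.split() != b.split():
            return step
    return None


# Hypothetical records, not real run.stat content.
valid = ["it: 1 ssh2: 0.10", "it: 2 ssh2: 0.20", "it: 3 ssh2: 0.30"]
reference = ["it: 1 ssh2: 0.10", "it: 2 ssh2: 0.20", "it: 3 ssh2: 0.31"]
print(first_difference(valid, reference))  # 3
```

Whether the report counts the last identical step or the first differing one is not stated in this ticket, so the index convention above is only one possible choice.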

comment:8 Changed 4 months ago by smasson

In 12738:

Extra_Halo: iom cleaning/update to work only with unknown, global or local (without halos) domains, see #2366

comment:9 Changed 4 months ago by smasson

In 12739:

Extra_Halo: missing 1 file in [12738], see #2366

comment:10 Changed 4 months ago by smasson

In 12745:

Extra_Halo: iom cleaning and fix ICB restartability, see #2366

comment:11 Changed 4 months ago by smasson

[12745] gives exactly the same results as [12719]

Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12745  ( last change @ r12745 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12745 (last changed revision)

       on X64_IRENE arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  12745
WGYRE_PISCES_ST              tracer.stat restartability  passed :  12745
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  12745
WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  12745
WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  12745
WAMM12_ST                    run.stat    restartability  passed :  12745
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  12745
WAGRIF_DEMO_ST               directory                  MISSING :  12745
WSPITZ12_ST                  run.stat    restartability  passed :  12745
WISOMIP_ST                   run.stat    restartability  passed :  12745
WOVERFLOW_ST                 run.stat    restartability  passed :  12745
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  12745
WVORTEX_ST                   directory                  MISSING :  12745
WICE_AGRIF_ST                directory                  MISSING :  12745

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  12745
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  12745
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  12745
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  12745
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  12745
WAMM12_ST                    run.stat    reproducibility passed :  12745
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  12745
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  12745
WAGRIF_DEMO_ST               directory                  MISSING :  12745
WSPITZ12_ST                  run.stat    reproducibility passed :  12745
WISOMIP_ST                   run.stat    reproducibility passed :  12745
WVORTEX_ST                   directory                  MISSING :  12745
WICE_AGRIF_ST                directory                  MISSING :  12745

   !----agrif check----!
ls: cannot access /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION/WAGRIF_DEMO_NOAGRIF_ST/X64_IRENE/12745/: No such file or directory
WAGRIF_DEMO_NOAGRIF_ST      WAGRIF_DEMO_ST               incomplete test

   !----result comparison check----!

check result differences between :
VALID directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 12745
and
REFERENCE directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 12719

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are identical
WORCA2_ICE_PISCES_ST  tracer.stat files are identical
WORCA2_OFF_PISCES_ST  tracer.stat files are identical
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are identical
WAGRIF_DEMO_ST               REFERENCE directory at 12719 is MISSING
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST                   REFERENCE directory at 12719 is MISSING
WICE_AGRIF_ST                REFERENCE directory at 12719 is MISSING

comment:12 Changed 4 months ago by smasson

In 12760:

Extra_Halo: update do_loop_substitute for nn_hls=2, see #2366

comment:13 Changed 4 months ago by smasson

The last commit [12760] is not working with the tracers in ORCA2 configurations (so with PISCES). Ocean dynamics is OK.
I don't know why… this is under investigation…

Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12762  ( last change @ r12760 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r12760 (last changed revision)

       on X64_IRENE arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  12760
WGYRE_PISCES_ST              tracer.stat restartability  passed :  12760
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  12760
WORCA2_ICE_PISCES_ST         tracer.stat    restartability  FAILED :  12760  (results are different after   time steps)
WORCA2_OFF_PISCES_ST         tracer.stat    restartability  FAILED :  12760  (results are different after   time steps)
WAMM12_ST                    run.stat    restartability  passed :  12760
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  12760
WAGRIF_DEMO_ST               directory                  MISSING :  12760
WSPITZ12_ST                  run.stat    restartability  passed :  12760
WISOMIP_ST                   run.stat    restartability  passed :  12760
WOVERFLOW_ST                 run.stat    restartability  passed :  12760
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  12760
WVORTEX_ST                   directory                  MISSING :  12760
WICE_AGRIF_ST                directory                  MISSING :  12760

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  12760
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  12760
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  12760
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility FAILED :  12760  (results are different after   time steps)
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility FAILED :  12760  (results are different after   time steps)
WAMM12_ST                    run.stat    reproducibility passed :  12760
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  12760
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  12760
WAGRIF_DEMO_ST               directory                  MISSING :  12760
WSPITZ12_ST                  run.stat    reproducibility passed :  12760
WISOMIP_ST                   run.stat    reproducibility passed :  12760
WVORTEX_ST                   directory                  MISSING :  12760
WICE_AGRIF_ST                directory                  MISSING :  12760

   !----agrif check----!
ls: cannot access /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION/WAGRIF_DEMO_NOAGRIF_ST/X64_IRENE/12760/: No such file or directory
WAGRIF_DEMO_NOAGRIF_ST      WAGRIF_DEMO_ST               incomplete test

   !----result comparison check----!

check result differences between :
VALID directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 12760
and
REFERENCE directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 12719

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are identical
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are identical
WAGRIF_DEMO_ST               REFERENCE directory at 12719 is MISSING
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST                   REFERENCE directory at 12719 is MISSING
WICE_AGRIF_ST                REFERENCE directory at 12719 is MISSING

comment:14 Changed 4 months ago by smasson

In 12807:

Extra_Halo: input file only over inner domain + new variables names, see #2366

comment:15 Changed 3 months ago by smasson

[12807] includes quite a number of changes…

  • replace all nlci by jpi, same for j
  • replace all nldi by Nis0, same for j. This defines the starting position of the inner/effective domain in the MPI sub-domain.
  • replace all nlei by Nie0, same for j. This defines the ending position of the inner/effective domain in the MPI sub-domain.
  • introduce Ni_0 = Nie0 - Nis0 + 1, same for j and 1 and 2
  • introduce Ni0glo and Nj0glo: the global domain size without the halos, i.e. the size we target for all input/output files.
  • Ni0glo and Nj0glo are read in the domcfg files or in the namelist. jpiglo and jpjglo (names to be changed) are then computed as
    jpiglo = Ni0glo + 2 * nn_hls
    jpjglo = Nj0glo + 2 * nn_hls
    
  • input files are now defined only over the inner/effective domain. This is working, at least, for the domcfg file. The search for the best domain decomposition has been adapted accordingly.
  • minimum size of the domain has been corrected according to nn_hls. Note that with nn_hls = 2, it is 8x8 for the smallest possible domain.
  • in case of a closed boundary at the south/west, we force tmask of the first line/column of the inner domain to be set to 0 (dommsk).
  • thanks to this new definition of the closed boundaries of the inner domain, we suppress some tricks that were used for F points in lbc_lnk
  • we also suppress some tricks that were done in inner/effective domain definition in the absence of a neighbor (mppini). All MPI sub-domains now have halos of the same size that are excluded from the inner domain.
  • add kfill optional argument in iom_get, for example, to be used when reading scale factors or depth: kfill = jpfillcopy
  • replace domngb.F90 by domutl.F90. Move dom_uniq from dommsk.F90 to domutl.F90 so that it can be used in domwri and dommsk. + minor bugfix in dom_uniq
  • modify BENCH namelists in order to be able to compare it with the trunk.
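The index relations listed above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the sizes are made up, and inner_bounds assumes that every sub-domain carries nn_hls halo points on each side, as the list describes for [12807].

```python
def global_size_with_halos(n0glo, nn_hls):
    """Global size including halos, e.g. jpiglo = Ni0glo + 2 * nn_hls."""
    return n0glo + 2 * nn_hls


def inner_bounds(jpi, nn_hls):
    """1-based (start, end) of the inner domain in a sub-domain of size jpi.

    Assumption: halos of width nn_hls on both sides, excluded from the
    inner domain (the analogue of Nis0 and Nie0).
    """
    return nn_hls + 1, jpi - nn_hls


def inner_size(nis0, nie0):
    """Number of inner-domain points, e.g. Ni_0 = Nie0 - Nis0 + 1."""
    return nie0 - nis0 + 1


# Made-up sizes, for illustration only.
for nn_hls in (1, 2):
    jpiglo = global_size_with_halos(180, nn_hls)
    nis0, nie0 = inner_bounds(10, nn_hls)
    print(nn_hls, jpiglo, (nis0, nie0), inner_size(nis0, nie0))
```

With these made-up numbers, the global size grows by two points per unit of nn_hls, while the inner size of a fixed-size sub-domain shrinks by the same amount.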

Note that the new names are not completely fixed and could be modified in the near future…

This version gives exactly the same results (run.stat files are identical) as trunk@12794 with:

  • BENCH: jperio = 4 or jperio = 6. Note that if you compare BENCH from the trunk with an input size defined as 100x100, in this branch its size definition should exclude halos but include an extra 0 band for the southern boundary. So it should be 98x99
  • BENCH using ORCA2 input files without halos: ORCA_R2_zps_domcfg.nc. To use it, simply modify namcfg in namelist_cfg of BENCH.
    !-----------------------------------------------------------------------
    &namcfg        !   parameters of the configuration                      (default: use namusr_def in namelist_cfg)
    !-----------------------------------------------------------------------
       ln_read_cfg = .true.    !  (=T) read the domain configuration file
          cn_domcfg = "ORCA_R2_zps_domcfg"    ! domain configuration filename
    /
    

comment:16 Changed 3 months ago by francesca

In 12810:

POINTER removal and replacing of traadv_mus.F90 file with original version - ticket #2366

comment:17 Changed 3 months ago by smasson

In 12815:

Extra_Halo: minor bugfix following [12807], see #2366

comment:18 Changed 3 months ago by acc

Attached a perl script which should make replacing the do loop macros with Italo's macro function versions easier. The script takes any number of source files as command line input and edits them in place, i.e. only use it on files without uncommitted changes (just in case). I was too hasty, however, and attached the wrong version; don't use this just yet, it needs a bit more tweaking.

Changed 3 months ago by acc

Corrected script to replace DO loop macros with Italo's macro function versions.

comment:19 Changed 3 months ago by acc

Example of DoMacro_rename.sh script in action:

cp DYN/dynldf_iso.F90 TESTFILES/dynldf_iso.F90
./DoMacro_rename.sh TESTFILES/dynldf_iso.F90
Working on file TESTFILES/dynldf_iso.F90

sdiff -s -w 80  DYN/dynldf_iso.F90 TESTFILES/dynldf_iso.F90
         DO_3D_00_00( 1, jpk )	      |	         DO_3D( 0, 0, 0, 0, 1, jpk )
            DO_2D_00_01		      |	            DO_2D( 0, 0, 0, 1 )
            DO_2D_00_01		      |	            DO_2D( 0, 0, 0, 1 )
         DO_2D_10_10		      |	         DO_2D( 1, 0, 1, 0 )
         DO_2D_00_10		      |	         DO_2D( 0, 0, 1, 0 )
            DO_2D_01_10		      |	            DO_2D( 0, 1, 1, 0 )
            DO_2D_01_10		      |	            DO_2D( 0, 1, 1, 0 )
         DO_2D_00_00		      |	         DO_2D( 0, 0, 0, 0 )

The script should find and replace all occurrences of the original DO loop macros. It won't recognise or treat any of the 'lnxt' versions that had been introduced for the extended haloes.
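The renaming shown in the sdiff output can be sketched with two regular expressions. This is not the attached DoMacro_rename.sh; it is a Python illustration whose patterns are inferred only from the examples above, and rename_do_macros is a hypothetical name.

```python
import re

# DO_2D_ab_cd -> DO_2D( a, b, c, d )
_DO_2D = re.compile(r"\bDO_2D_(\d)(\d)_(\d)(\d)\b")
# DO_3D_ab_cd( k1, k2 ) -> DO_3D( a, b, c, d, k1, k2 )
# Limitation: the argument list must not contain nested parentheses.
_DO_3D = re.compile(r"\bDO_3D_(\d)(\d)_(\d)(\d)\(\s*(.*?)\s*\)")


def rename_do_macros(line):
    """Rewrite old-style DO loop macros on one source line."""
    line = _DO_3D.sub(r"DO_3D( \1, \2, \3, \4, \5 )", line)
    line = _DO_2D.sub(r"DO_2D( \1, \2, \3, \4 )", line)
    return line


for old in ("         DO_3D_00_00( 1, jpk )", "            DO_2D_01_10"):
    print(rename_do_macros(old))
```

Lines that contain neither pattern are returned unchanged, so any other macro variant is left alone.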

comment:20 Changed 3 months ago by smasson

In 12866:

Extra_Halo: using input files without halos, see #2366

comment:21 Changed 3 months ago by smasson

  • Cc smasson added

comment:22 Changed 3 months ago by smasson

[12866] still has several problems that must be solved.

  • we tested only GYRE_PISCES, ORCA2_ICE_PISCES, AMM12, SPITZ12. They pass the sette tests but results of ORCA2_ICE_PISCES and SPITZ12 are different from the trunk@12790. We have to find out why…
  • ORCA2_OFF_PISCES is not working
  • I did not test AGRIF configurations (but I would like to do it once we update the branch with the trunk which contains the AGRIF bugfix)
  • test cases were not tested

[12866] requires new input files.

Here are the links to the new input files I created. These files may be modified in the future. They are not yet the official new set of input files.

  • ORCA2_ICE_v4.x.tar.gz

https://owncloud.locean-ipsl.upmc.fr/index.php/s/rIYPabNR7m74J0T

  • ORCA2_OFF_v4.x.tar.gz

https://owncloud.locean-ipsl.upmc.fr/index.php/s/xAfQQNjxMYGwF3q

  • AMM12_v4.x.tar.gz:

https://owncloud.locean-ipsl.upmc.fr/index.php/s/JzodBZa7VeczUrT

  • SPITZ12_v4.x.tar.gz

https://owncloud.locean-ipsl.upmc.fr/index.php/s/ETx0UyGiyiomyU3

To use these files with sette, you must untar them (tar xzf xxx.tar.gz) and modify the following files:

-bash-4.2$ svn diff sette/input_*
Index: sette/input_AMM12.cfg
===================================================================
--- sette/input_AMM12.cfg       (revision 12866)
+++ sette/input_AMM12.cfg       (working copy)
@@ -1 +1 @@
-AMM12_v4.0.tar AMM12_v4.0
+AMM12_v4.0.tar AMM12_v4.x
Index: sette/input_ORCA2_ICE_PISCES.cfg
===================================================================
--- sette/input_ORCA2_ICE_PISCES.cfg    (revision 12866)
+++ sette/input_ORCA2_ICE_PISCES.cfg    (working copy)
@@ -1 +1 @@
-ORCA2_ICE_v4.0.tar  ORCA2_ICE_v4.0
+ORCA2_ICE_v4.0.tar  ORCA2_ICE_v4.x
Index: sette/input_ORCA2_OFF_PISCES.cfg
===================================================================
--- sette/input_ORCA2_OFF_PISCES.cfg    (revision 12866)
+++ sette/input_ORCA2_OFF_PISCES.cfg    (working copy)
@@ -1 +1 @@
-ORCA2_OFF_v4.0.tar ORCA2_OFF_v4.0
+ORCA2_OFF_v4.0.tar ORCA2_OFF_v4.x
Index: sette/input_SPITZ12.cfg
===================================================================
--- sette/input_SPITZ12.cfg     (revision 12866)
+++ sette/input_SPITZ12.cfg     (working copy)
@@ -1 +1 @@
-SPITZ12_v4.0.tar SPITZ12_v4.0
+SPITZ12_v4.0.tar SPITZ12_v4.x
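The four edits in the diff above follow one pattern: the second field of each sette/input_*.cfg line switches from the _v4.0 directory to the _v4.x one, while the first field is left alone. A small sketch of that rewrite follows; switch_input_dir is a hypothetical helper and not part of sette.

```python
def switch_input_dir(cfg_line, old_suffix="_v4.0", new_suffix="_v4.x"):
    """Rewrite the second field of a 'tarfile directory' line.

    Lines that do not have exactly two fields, or whose second field
    does not end with old_suffix, are returned unchanged. Fields are
    re-joined with a single space.
    """
    fields = cfg_line.split()
    if len(fields) != 2 or not fields[1].endswith(old_suffix):
        return cfg_line
    fields[1] = fields[1][: -len(old_suffix)] + new_suffix
    return " ".join(fields)


print(switch_input_dir("AMM12_v4.0.tar AMM12_v4.0"))
```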

comment:23 Changed 3 months ago by smasson

In 12939:

Extra_Halo: update with trunk@12933, see #2366

comment:24 Changed 3 months ago by smasson

In 12960:

Extra_Halo: additional bugfixes and developments, see #2366

comment:25 Changed 3 months ago by smasson

[12960] passes the sette tests for all configurations without AGRIF: GYRE_PISCES, ORCA2_ICE_PISCES, ORCA2_OFF_PISCES, AMM12, ORCA2_SAS_ICE, SPITZ12, ISOMIP, OVERFLOW, LOCK_EXCHANGE.

  • Results are identical to trunk@12925 only for: GYRE_PISCES, AMM12, ORCA2_SAS_ICE.
  • Results of ORCA2_ICE_PISCES are different because of ln_use_calving = .true. I don't have the final explanation yet.
  • Results of SPITZ12 are different because of the metric terms in the flux form formulation of the momentum advection. The problem comes from the variables di_e2v_2e1e2f and dj_e1u_2e1e2f, which use scale factor values in land on jpi and jpj. Because of this we must extend the initial input condition by 1 point along i and j, so that the last column and the last row are land → going back to the original input files.
  • Results of ORCA2_ICE_PISCES for tracers are different even if we manage to get the same results for the dynamics. I am currently investigating this problem.
  • I don't know yet why results for ISOMIP are different

comment:26 Changed 2 months ago by francesca

In 12989:

Extra_Halo: developments for running BENCH test case with halo 2 - ticket #2366

comment:27 Changed 2 months ago by francesca

In 12992:

Extra_Halo: BENCH test case with halo 2 - verified version - ticket #2366

comment:28 Changed 2 months ago by smasson

In 12993:

Extra_Halo: works when removing land subdomain, cleaning/rewriting of mpp_nfd_generic.h90, see #2366

comment:29 Changed 2 months ago by smasson

  • Cc francesca hadcv added

In [12993], the "l_north_nogather" part of mpp_nfd_generic.h90 has been rewritten and is, hopefully, clearer and more efficient (proper loop order, proper treatment of suppressed land subdomains…).
I still don't really get how ztabr works… maybe we could do something to reduce its size…

Next step should be a complete review of src/OCE/LBC/lbc_nfd* routines…

  • I suspect we can slightly optimize the no-gather case setup (include or not halos when looking at neighbors, really adapt to the subtle differences between jperio=4 or 6, play with jpi definition?), make it easier to read and maybe more flexible (if we want to have jpi very different from one subdomain to the other). Once the no-gather case is really clean with more comments, we could maybe remove the ln_nnogather parameter (always set to .true.).
  • We should use only inner-domain values to fill the halos (and not assume that the e-w communication has already been done). This is cleaner and offers more flexibility in lbc_lnk.
  • I suspect some errors on the points located around the north pole but we don't see them as they are masked (this could explain the small island I was forced to add in BENCH)…
  • In several parts of lbc_nfd*, there are implicit loops (like ARRAY_IN(jpi-ii+1,jpj,:,:,jf) ) which create a wrong loop order once compiled. I think I may have introduced this error…

comment:30 Changed 2 months ago by francesca

  • Cc epico added

comment:31 Changed 2 months ago by smasson

In 13015:

Extra_Halo: merge with trunk@13012, see #2366

comment:32 Changed 2 months ago by smasson

In 13065:

Extra_Halo: toward AGRIF compatibility, see #2366

comment:33 Changed 2 months ago by smasson

From this version, we have been forced to change our strategy regarding the closed boundaries at the eastern and northern sides.
Because of these 2 lines in dynvor:

   di_e2v_2e1e2f(ji,jj) = ( e2v(ji+1,jj  ) - e2v(ji,jj) )  * 0.5 * r1_e1e2f(ji,jj)
   dj_e1u_2e1e2f(ji,jj) = ( e1u(ji  ,jj+1) - e1u(ji,jj) )  * 0.5 * r1_e1e2f(ji,jj)

We need to keep a land point on the eastern and northern sides of the input domain.
⇒ We keep a land point in the input domain as soon as the boundary is closed (not periodic).
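
To make the constraint concrete, here is a small Python sketch of the first dynvor line, reduced to one j row (an illustration only, not the NEMO implementation; the list-based helper is hypothetical). The stencil reads e2v at ji+1, so a row of n input columns yields only n-1 values: the last column needs one more point to the east, which is presumably why a land point is kept on closed boundaries.

```python
def di_e2v_2e1e2f(e2v_row, r1_e1e2f_row):
    """Forward difference along i for one j row, mirroring the quoted dynvor line."""
    n = len(e2v_row)
    # e2v is read at ji+1, so the easternmost column cannot be computed
    return [(e2v_row[ji + 1] - e2v_row[ji]) * 0.5 * r1_e1e2f_row[ji]
            for ji in range(n - 1)]


e2v = [1.0, 2.0, 4.0, 7.0]          # illustrative values only
r1_e1e2f = [1.0, 1.0, 1.0, 1.0]
out = di_e2v_2e1e2f(e2v, r1_e1e2f)  # 3 values for 4 input columns
```

The same argument applies along j for dj_e1u_2e1e2f, which reads e1u at jj+1.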

As a consequence, only the input files related to the ORCA grid need to be changed:
ORCA2_ICE_v4.x.tar.gz
ORCA2_OFF_v4.x.tar.gz
SAS_v4.x.tar
AGRIF_DEMO_v4.x.tar

Only AGRIF_DEMO_ST is not yet working with this new version of the code.
All other configurations pass the sette tests.
Results are identical with trunk@12925 except for ORCA2_ICE_PISCES (because of ln_use_calving), ORCA2_OFF_PISCES (?), AGRIF_DEMO (that is not working) and ICE_AGRIF (?).
These last points remain to be fixed…

Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13064  ( last change @ r13015 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13015+ (last changed revision)

       on X64_JEANZAY arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  13015+
WGYRE_PISCES_ST              tracer.stat restartability  passed :  13015+
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  13015+
WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  13015+
WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  13015+
WAMM12_ST                    run.stat    restartability  passed :  13015+
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  13015+
WAGRIF_DEMO_ST               ocean.output               MISSING :  13015+
WAGRIF_DEMO_ST               incomplete test
WSPITZ12_ST                  run.stat    restartability  passed :  13015+
WISOMIP_ST                   run.stat    restartability  passed :  13015+
WOVERFLOW_ST                 run.stat    restartability  passed :  13015+
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  13015+
WVORTEX_ST                   run.stat    restartability  passed :  13015+
WICE_AGRIF_ST                run.stat    restartability  passed :  13015+

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  13015+
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  13015+
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  13015+
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  13015+
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  13015+
WAMM12_ST                    run.stat    reproducibility passed :  13015+
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  13015+
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  13015+
WAGRIF_DEMO_ST               ocean.output               MISSING :  13015+
WAGRIF_DEMO_ST               incomplete test
WSPITZ12_ST                  run.stat    reproducibility passed :  13015+
WISOMIP_ST                   run.stat    reproducibility passed :  13015+
WVORTEX_ST                   run.stat    reproducibility passed :  13015+
WICE_AGRIF_ST                run.stat    reproducibility passed :  13015+

   !----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat    unchanged  -    passed :  13015+ 13015+

   !----result comparison check----!

check result differences between :
VALID directory : /gpfsscratch/rech/fqx/reee217/dev_r12558_HPC-08_epico_Extra_Halo/NEMO_VALIDATION at rev 13015+
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/trunk/NEMO_VALIDATION at rev 12925

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  42  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are identical
WAGRIF_DEMO_ST        incomplete test
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST            run.stat    files are identical
WICE_AGRIF_ST         run.stat    files are DIFFERENT (results are different after  2  time steps)
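
For readers who want to reproduce a "results are different after N time steps" figure by hand, the following Python sketch counts how many leading lines of two stat files agree. It assumes one line per time step and is not how sette_rpt.sh itself is implemented; `identical_steps` is a hypothetical helper.

```python
from itertools import zip_longest


def identical_steps(lines_a, lines_b):
    """Count the leading lines that agree; None means the inputs are identical."""
    for n, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        if a != b:
            return n
    return None


# Illustrative content only, not real run.stat lines
ref = ["it 1 0.50", "it 2 0.51", "it 3 0.52"]
new = ["it 1 0.50", "it 2 0.51", "it 3 0.53"]
n_same = identical_steps(ref, new)
```

With real files, the two arguments would be the lists returned by `readlines()` on each run.stat or tracer.stat.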


comment:34 Changed 2 months ago by smasson

I forgot to mention that nn_hls is now a namelist parameter in the nammpp block

comment:35 Changed 7 weeks ago by smasson

In 13118:

create sette branch for #2366

comment:36 Changed 7 weeks ago by smasson

In 13119:

r12931_sette_ticket2366: update input tarfiles, see #2366

comment:37 Changed 7 weeks ago by smasson

In 13120:

Extra_Halo: update svn:externals to use r12931_sette_ticket2366, see #2366

comment:38 Changed 7 weeks ago by smasson

In 13122:

r12931_sette_ticket2366: update input for SAS, see #2366

comment:39 Changed 7 weeks ago by smasson

In 13123:

Extra_Halo: deactivate longitude and latitude check in AGRIF, see #2366

comment:40 Changed 7 weeks ago by smasson

Need to deactivate lon/lat check to pass sette tests for AGRIF_DEMO… :-(
Same story for the trunk… :-( :-(
Clearly 2_ORCA_R05_zps_domcfg_agrif.nc and 3_ORCA_R017_zps_domcfg_agrif.nc were not properly built and do not match their mother grid…
⇒ at some point we should redefine these grids and reactivate/add the check of glamt, gphit but also e1t and e2t in agrif_user.F…

[13123] passes all sette tests and gives the same results as trunk@12925, except for ORCA2_ICE_PISCES because of ln_use_calving = .true. and for ORCA2_ICE_PISCES tracers, which are different even if we manage to get the same results for the dynamics…

comment:42 Changed 7 weeks ago by smasson

In 13124:

Extra_Halo: merge with trunk@13115, see #2366

comment:43 Changed 7 weeks ago by smasson

with nn_hls = 1 :
[13124] passes all sette tests.
Results are identical to trunk@12925, except for

  • ORCA2_ICE_PISCES because of ln_use_calving = .true.
  • ORCA2_ICE_PISCES tracers that are different even if we manage to get the same results for the dynamics…
Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13126  ( last change @ r13124 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13124 (last changed revision)

       on X64_IRENE arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  13124
WGYRE_PISCES_ST              tracer.stat restartability  passed :  13124
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  13124
WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  13124
WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  13124
WAMM12_ST                    run.stat    restartability  passed :  13124
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  13124
WAGRIF_DEMO_ST               run.stat    restartability  passed :  13124
WSPITZ12_ST                  run.stat    restartability  passed :  13124
WISOMIP_ST                   run.stat    restartability  passed :  13124
WOVERFLOW_ST                 run.stat    restartability  passed :  13124
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  13124
WVORTEX_ST                   run.stat    restartability  passed :  13124
WICE_AGRIF_ST                run.stat    restartability  passed :  13124

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  13124
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  13124
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  13124
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  13124
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  13124
WAMM12_ST                    run.stat    reproducibility passed :  13124
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  13124
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  13124
WAGRIF_DEMO_ST               run.stat    reproducibility passed :  13124
WSPITZ12_ST                  run.stat    reproducibility passed :  13124
WISOMIP_ST                   run.stat    reproducibility passed :  13124
WVORTEX_ST                   run.stat    reproducibility passed :  13124
WICE_AGRIF_ST                run.stat    reproducibility passed :  13124

   !----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat    unchanged  -    passed :  13124 13124

   !----result comparison check----!

check result differences between :
VALID directory : /ccc/scratch/cont005/ra0542/massons/valid_dev_r12558_HPC-08_epico_Extra_Halo/sette at rev 13124
and
REFERENCE directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/trunk/NEMO_VALIDATION at rev 12925

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  201  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are identical
WAGRIF_DEMO_ST        run.stat    files are identical
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST            run.stat    files are identical
WICE_AGRIF_ST         run.stat    files are identical
Last edited 7 weeks ago by smasson

comment:44 Changed 7 weeks ago by smasson

In 13130:

Extra_Halo: suppress halos from outputs and coupling, see #2366

comment:45 Changed 6 weeks ago by smasson

In 13138:

Extra_Halo: minor bugfixes and cleaning, see #2366

comment:46 Changed 5 weeks ago by smasson

In 13174:

Extra_Halo: works if jpni = 1, allows nn_hls >2, remove island in BENCH, see #2366

comment:47 Changed 5 weeks ago by smasson

In 13176:

Extra_Halo: rewrite prtctl, suppress nn_print, see #2366

comment:48 Changed 5 weeks ago by smasson

In 13186:

Extra_Halo: merge with trunk@13136, see #2366

comment:49 Changed 5 weeks ago by francesca

In 13229:

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13218, see #2366

comment:50 Changed 5 weeks ago by smasson

In 13230:

dev_r12558_HPC-08_epico_Extra_Halo: finish merge with trunk@13218, see #2366

comment:51 Changed 5 weeks ago by smasson

In 13231:

dev_r12558_HPC-08_epico_Extra_Halo: re-finish merge with trunk@13218, see #2366

comment:52 Changed 5 weeks ago by smasson

In 13232:

dev_r12558_HPC-08_epico_Extra_Halo: final-finish merge with trunk@13218, see #2366

comment:53 Changed 5 weeks ago by francesca

In 13235:

dev_r12558_HPC-08_epico_Extra_Halo: namelist typo, see #2366

comment:54 Changed 5 weeks ago by smasson

In 13236:

dev_r12558_HPC-08_epico_Extra_Halo: fix merge with trunk@13218, see #2366

comment:55 Changed 5 weeks ago by smasson

In 13238:

Extra_Halo: cosmetic modifications, see #2366

comment:56 Changed 5 weeks ago by francesca

In 13247:

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13227, see #2366

comment:57 Changed 5 weeks ago by francesca

In 13248:

dev_r12558_HPC-08_epico_Extra_Halo: merge with trunk@13237, see #2366

comment:58 Changed 5 weeks ago by smasson

In 13251:

Extra_Halo: bugfix following merge with trunk@13218, see #2366

comment:59 Changed 5 weeks ago by smasson

[13251] passes all sette tests. Same results as trunk@13218 except for ORCA2_ICE_PISCES because of icebergs. We also have differences in tracer.stat for ORCA2_OFF_PISCES.

[reee217@jean-zay3: sette]$ ./sette_rpt.sh

Current code is : NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13251  ( last change @ r13251 )

SETTE validation report generated for :

       NEMO/branches/2020/dev_r12558_HPC-08_epico_Extra_Halo @ r13251 (last changed revision)

       on X64_JEANZAY arch file


!!---------------1st pass------------------!!

   !----restart----!
WGYRE_PISCES_ST              run.stat    restartability  passed :  13251
WGYRE_PISCES_ST              tracer.stat restartability  passed :  13251
WORCA2_ICE_PISCES_ST         run.stat    restartability  passed :  13251
WORCA2_ICE_PISCES_ST         tracer.stat restartability  passed :  13251
WORCA2_OFF_PISCES_ST         tracer.stat restartability  passed :  13251
WAMM12_ST                    run.stat    restartability  passed :  13251
WORCA2_SAS_ICE_ST            run.stat    restartability  passed :  13251
WAGRIF_DEMO_ST               run.stat    restartability  passed :  13251
WSPITZ12_ST                  run.stat    restartability  passed :  13251
WISOMIP_ST                   run.stat    restartability  passed :  13251
WOVERFLOW_ST                 run.stat    restartability  passed :  13251
WLOCK_EXCHANGE_ST            run.stat    restartability  passed :  13251
WVORTEX_ST                   run.stat    restartability  passed :  13251
WICE_AGRIF_ST                run.stat    restartability  passed :  13251

   !----repro----!
WGYRE_PISCES_ST              run.stat    reproducibility passed :  13251
WGYRE_PISCES_ST              tracer.stat reproducibility passed :  13251
WORCA2_ICE_PISCES_ST         run.stat    reproducibility passed :  13251
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility passed :  13251
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility passed :  13251
WAMM12_ST                    run.stat    reproducibility passed :  13251
WORCA2_SAS_ICE_ST            run.stat    reproducibility passed :  13251
WORCA2_ICE_OBS_ST            run.stat    reproducibility passed :  13251
WAGRIF_DEMO_ST               run.stat    reproducibility passed :  13251
WSPITZ12_ST                  run.stat    reproducibility passed :  13251
WISOMIP_ST                   run.stat    reproducibility passed :  13251
WVORTEX_ST                   run.stat    reproducibility passed :  13251
WICE_AGRIF_ST                run.stat    reproducibility passed :  13251

   !----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat    unchanged  -    passed :  13251 13251

   !----result comparison check----!

check result differences between :
VALID directory : /gpfsscratch/rech/fqx/reee217/extra_new/NEMO_VALIDATION at rev 13251
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/trunk/NEMO_VALIDATION at rev 13218

WGYRE_PISCES_ST       run.stat    files are identical
WGYRE_PISCES_ST       tracer.stat files are identical
WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  42  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WAMM12_ST             run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WORCA2_SAS_ICE_ST     run.stat    files are identical
WAGRIF_DEMO_ST        run.stat    files are identical
WSPITZ12_ST           run.stat    files are identical
WISOMIP_ST            run.stat    files are identical
WVORTEX_ST            run.stat    files are identical
WICE_AGRIF_ST         run.stat    files are identical

comment:60 Changed 4 weeks ago by smasson

In 13252:

Extra_Halo: work with ln_nnogather = F, see #2366

comment:61 Changed 4 weeks ago by smasson

In 13256:

Extra_Halo: bugfix in mppini introduced in [12993], see #2366

comment:62 Changed 4 weeks ago by smasson

In 13269:

Extra_Halo: bugfix on bestpartition when nn_hls > 1, see #2366

comment:63 Changed 4 weeks ago by smasson

In 13275:

Extra_Halo: final bugfix on bestpartition when nn_hls > 1, see #2366

comment:64 Changed 4 weeks ago by smasson

Here is where we are with [13275]:

with nn_hls = 1

all sette tests ok. Differences with trunk@13218

WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  201  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)

with nn_hls = 2

sette test fails for

   !----restart----!
WAGRIF_DEMO_ST               run.stat    restartability  FAILED :  13275  (results are different after  19     time steps)
WOVERFLOW_ST                 ocean.output               MISSING :  13275
WOVERFLOW_ST                 incomplete test
WLOCK_EXCHANGE_ST            ocean.output               MISSING :  13275
WLOCK_EXCHANGE_ST            incomplete test

   !----repro----!
WORCA2_ICE_PISCES_ST         tracer.stat reproducibility FAILED :  13275  (results are different after   time steps)
WORCA2_OFF_PISCES_ST         tracer.stat reproducibility FAILED :  13275  (results are different after   time steps)
WAGRIF_DEMO_ST               run.stat    reproducibility FAILED :  13275  (results are different after  4      time steps)

results are different from nn_hls = 1 for

WORCA2_ICE_PISCES_ST  run.stat    files are DIFFERENT (results are different after  16  time steps)
WORCA2_ICE_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WORCA2_OFF_PISCES_ST  tracer.stat files are DIFFERENT (results are different after   time steps)
WAGRIF_DEMO_ST        run.stat    files are DIFFERENT (results are different after  17  time steps)

without icebergs in ORCA2_ICE_PISCES

jpni x jpnj   jpnij   nn_hls   nogather   test
8 x 4         32      1        T          ok run.stat same as LONG
8 x 4         32      1        F          ok run.stat same as LONG
8 x 4         32      2        T          error (Segmentation fault) at time.step 95 in p4zopt_mp_p4z_opt
8 x 4         32      2        F          error (Segmentation fault) at time.step 95 in p4zopt_mp_p4z_opt
4 x 8         32      2        T          ok run.stat same as LONG
4 x 8         32      2        F          ok run.stat same as LONG
8 x 4         31      1        T          ok run.stat same as LONG
8 x 4         31      1        F          ok run.stat same as LONG
8 x 4         31      2        T          ok run.stat same as LONG
8 x 4         31      2        F          ok run.stat same as LONG


Last edited 4 weeks ago by smasson

comment:65 Changed 4 weeks ago by rblod

Not related to the reproducibility with the trunk: the following syntax in mpp_nfd_generic.h90:

ztabb(ji,ij1,jk,jl) = HUGE(0.wp)
znorthloc(ji,jj,jk,jl,jf) = HUGE(0.)


doesn't work (for instance with gfortran). Because of the merge with single precision, it generates, for the single-precision routine, a conversion between R4 and R8 which is not possible since the value is actually huge…
A fix could be to define, at the head of the routine in the precision-related macros, something like:

#    define ERRVAL(x)   HUGE(x/**/_sp)

(same for double precision) and use

ztabb(ji,ij1,jk,jl) = ERRVAL(0.)

comment:66 Changed 4 weeks ago by smasson

Great, can you commit it?

comment:67 Changed 4 weeks ago by rblod

This configuration is unlikely to happen except for debugging: when compiling with MPI and using only one CPU, best_partition fails (not sure, but I think isz0 is 0). It could be cleaner to bypass those calls in the case mppsize=1.

comment:68 Changed 4 weeks ago by smasson

It should not be the case… It was working before…

comment:69 Changed 4 weeks ago by smasson

In 13286:

trunk: merge extra halos branch in trunk, see #2366

comment:70 Changed 4 weeks ago by smasson

In 13290:

trunk: mpp_nfd_generic: fix sp/dp compilation issue, following Rachid's advice, see #2366

comment:71 Changed 4 weeks ago by smasson

In 13291:

trunk: bugfix in bestpartition, see #2366

comment:72 Changed 4 weeks ago by smasson

bestpartition working again even with 1 core

comment:73 Changed 4 weeks ago by smasson

In 13292:

update sette with sette/r12931_sette_ticket2366

comment:74 Changed 3 weeks ago by acc

[13295] in the trunk replaced the do-loop macros. SETTE results are identical to those with the previous revision but there are still issues with some cases. One such case is ORCA2_ICE_PISCES, REPRO_8_4 with nn_hls=2, 30 processors and ln_icebergs=F. tracer.stat gives NaNs after 4 time steps but run.stat is ok. There are no NaNs in any restart files (physics and trc) after 8 time steps. The NaNs are coming from the volume used to weight the tracers (cvol) because e3t(jpi,:,:,Kmm) contains NaNs (but only this column and only in procs 11, 25, 26, 29, 28, 30). Investigations continue but I've noticed that tmask is zero on the outer halo; is this intentional? It is because of:

!
      tmask(:,:,:) = 0._wp
      DO_2D( 1, 1, 1, 1 )
         iktop = k_top(ji,jj)
         ikbot = k_bot(ji,jj)
         IF( iktop /= 0 ) THEN       ! water in the column
            tmask(ji,jj,iktop:ikbot) = 1._wp
         ENDIF
      END_2D

in dommsk.F90; should this now be:

      tmask(:,:,:) = 0._wp
      DO_2D( nn_hls, nn_hls, nn_hls, nn_hls )
         iktop = k_top(ji,jj)
         ikbot = k_bot(ji,jj)
         IF( iktop /= 0 ) THEN       ! water in the column
            tmask(ji,jj,iktop:ikbot) = 1._wp
         ENDIF
      END_2D

?
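
The difference in coverage between the two loop bounds can be sketched in Python (0-based indices, nested lists instead of Fortran arrays). This assumes that DO_2D( n, n, n, n ) spans the inner domain extended by n halo points on each side; `build_tmask` is a hypothetical stand-in, not the dommsk.F90 routine.

```python
def build_tmask(k_top, k_bot, jpk, nn_hls, width):
    """Set tmask from k_top/k_bot on the inner domain extended by `width` halo points."""
    jpj, jpi = len(k_top), len(k_top[0])
    tmask = [[[0.0] * jpk for _ in range(jpi)] for _ in range(jpj)]
    for jj in range(nn_hls - width, jpj - nn_hls + width):
        for ji in range(nn_hls - width, jpi - nn_hls + width):
            iktop, ikbot = k_top[jj][ji], k_bot[jj][ji]
            if iktop != 0:                           # water in the column
                for jk in range(iktop - 1, ikbot):   # levels iktop..ikbot (1-based)
                    tmask[jj][ji][jk] = 1.0
    return tmask


jpi = jpj = 6
jpk = 3
k_top = [[1] * jpi for _ in range(jpj)]   # illustrative: water everywhere
k_bot = [[3] * jpi for _ in range(jpj)]
tmask_w1 = build_tmask(k_top, k_bot, jpk, nn_hls=2, width=1)    # like DO_2D( 1, 1, 1, 1 )
tmask_full = build_tmask(k_top, k_bot, jpk, nn_hls=2, width=2)  # like DO_2D( nn_hls, ... )
```

With nn_hls = 2 and a width of 1, the outermost ring of points keeps its initial zero even though k_top says there is water; with a width of nn_hls every point is set.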

comment:75 Changed 3 weeks ago by smasson

Yes I agree with you Andrew, it should be DO_2D( nn_hls, nn_hls, nn_hls, nn_hls ). tmask must be computed everywhere based on k_top, k_bot.

Did you keep the modifications in sum3x3?

comment:76 Changed 3 weeks ago by acc

Yes I kept the 3x3 modifications; that looks to be ok. The NaNs appear to be coming from the emp field. These corrupt e3t via ssh_nxt but only in that outer halo column and don't impact the physics. Can't be certain though since inserting print statements seems to change behaviour :(

comment:77 Changed 3 weeks ago by acc

I think I've found the source of NaNs in the 30 proc, ORCA2_ICE_PISCES, REPRO_8_4, nn_hls=2 case. The corruption to emp comes from the evaporation part and it is because the bulk formulae are not applied consistently to the outer halo. The issue is similar to the tmask situation mentioned above. sbcblk.F90 and sbcblk_phy.F90 contain a lot of DO_2D( 1, 1, 1, 1) loops that need to be DO_2D( nn_hls, nn_hls, nn_hls, nn_hls ) to retain the same coverage across the whole processor domain. Changing all of them eliminates the NaNs. It will need more analysis to work out which are critical.
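
One reason a NaN confined to a masked outer halo column can still reach a volume-weighted statistic is that multiplying by a zero mask does not remove it (0 × NaN is NaN). The Python sketch below illustrates that arithmetic only; the helper names and values are hypothetical, and whether tracer.stat is formed exactly this way is an assumption.

```python
import math


def weighted_sum(values, weights, mask):
    """Mask-multiplied weighted sum, as a volume-weighted statistic might be formed."""
    return sum(v * w * m for v, w, m in zip(values, weights, mask))


def weighted_sum_skip(values, weights, mask):
    """Same sum, but masked points are skipped instead of multiplied by zero."""
    return sum(v * w for v, w, m in zip(values, weights, mask) if m != 0.0)


nan = float("nan")
tracer = [1.0, 2.0, 3.0, 4.0]
e3t = [10.0, 10.0, 10.0, nan]     # NaN only in the outermost halo column
tmask = [1.0, 1.0, 1.0, 0.0]      # that column is masked

total = weighted_sum(tracer, e3t, tmask)            # NaN despite the zero mask
total_skip = weighted_sum_skip(tracer, e3t, tmask)  # finite
```

The skipping variant is shown only for contrast; the approach taken in this ticket is to compute valid values in the outer halo in the first place.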

comment:78 Changed 3 weeks ago by acc

In 13305:

Trunk changes required to avoid issues with the outer halo in ORCA2_ICE_PISCES,
REPRO_8_4 tests with nn_hls=2. These changes ensure that tmask and output
from sbc_blk are set correctly in the outer halo. Failure to set valid
values in the outer halo can generate NaNs which lead to OOB errors in the
XRGB lookup table used for the TRC optics. See #2366 for details. With these
changes all variants of the ORCA2_ICE_PISCES SETTE test will complete. There
are still differences between the 1 and 2 halo width runs but running with:
no land suppression; partial land suppression or full land suppression does
not alter either set of results. Likewise setting ln_nnogather either true
or false does not alter results. Differences in run.stat start after 140
timesteps and differences in tracer.stat start after 60 timesteps between
the different halo width sets. Equivalent tests with ln_icebergs = F show no
differences in run.stat but halo-width dependent differences in tracer.stat
persist (now after 64 timesteps).

comment:79 Changed 3 weeks ago by acc

A table of the results of tests carried out with variants of ORCA2_ICE_PISCES, REPRO_8_4, SETTE test leading to [13305]:

chksum ln_icebergs file nproc nn_hls  ln_nnogather
            |         |     | |        |
39530   WITH_BERGS_runstat_30_2_nnogat_F.stat-+
39530   WITH_BERGS_runstat_31_2_nnogat_F.stat |
39530   WITH_BERGS_runstat_32_2_nnogat_F.stat |
39530   WITH_BERGS_runstat_30_2_nnogat_T.stat |
39530   WITH_BERGS_runstat_31_2_nnogat_T.stat |
39530   WITH_BERGS_runstat_32_2_nnogat_T.stat |
                                               > differences after 140 timesteps
54427   WITH_BERGS_runstat_30_1_nnogat_F.stat |
54427   WITH_BERGS_runstat_31_1_nnogat_F.stat |
54427   WITH_BERGS_runstat_32_1_nnogat_F.stat |
54427   WITH_BERGS_runstat_30_1_nnogat_T.stat |
54427   WITH_BERGS_runstat_31_1_nnogat_T.stat |
54427   WITH_BERGS_runstat_32_1_nnogat_T.stat-+

16750   WITH_BERGS_tracer__30_1_nnogat_F.stat-+
16750   WITH_BERGS_tracer__31_1_nnogat_F.stat |
16750   WITH_BERGS_tracer__32_1_nnogat_F.stat |
16750   WITH_BERGS_tracer__30_1_nnogat_T.stat |
16750   WITH_BERGS_tracer__31_1_nnogat_T.stat |
16750   WITH_BERGS_tracer__32_1_nnogat_T.stat |
                                               > differences after 60 timesteps
48497   WITH_BERGS_tracer__30_2_nnogat_F.stat |
48497   WITH_BERGS_tracer__31_2_nnogat_F.stat |
48497   WITH_BERGS_tracer__32_2_nnogat_F.stat |
48497   WITH_BERGS_tracer__30_2_nnogat_T.stat |
48497   WITH_BERGS_tracer__31_2_nnogat_T.stat |
48497   WITH_BERGS_tracer__32_2_nnogat_T.stat-+

58089     NO_BERGS_runstat_30_1_nnogat_F.stat-+
58089     NO_BERGS_runstat_31_1_nnogat_F.stat |
58089     NO_BERGS_runstat_32_1_nnogat_F.stat |
58089     NO_BERGS_runstat_30_1_nnogat_T.stat |
58089     NO_BERGS_runstat_31_1_nnogat_T.stat |
58089     NO_BERGS_runstat_32_1_nnogat_T.stat |
                                               > no differences
58089     NO_BERGS_runstat_30_2_nnogat_F.stat |
58089     NO_BERGS_runstat_31_2_nnogat_F.stat |
58089     NO_BERGS_runstat_32_2_nnogat_F.stat |
58089     NO_BERGS_runstat_30_2_nnogat_T.stat |
58089     NO_BERGS_runstat_31_2_nnogat_T.stat |
58089     NO_BERGS_runstat_32_2_nnogat_T.stat-+

48116     NO_BERGS_tracer__30_1_nnogat_F.stat-+
48116     NO_BERGS_tracer__31_1_nnogat_F.stat |
48116     NO_BERGS_tracer__32_1_nnogat_F.stat |
48116     NO_BERGS_tracer__30_1_nnogat_T.stat |
48116     NO_BERGS_tracer__31_1_nnogat_T.stat |
48116     NO_BERGS_tracer__32_1_nnogat_T.stat |
                                               > still halo-dependent differences after 64 ts
07109     NO_BERGS_tracer__30_2_nnogat_F.stat |
07109     NO_BERGS_tracer__31_2_nnogat_F.stat |
07109     NO_BERGS_tracer__32_2_nnogat_F.stat |
07109     NO_BERGS_tracer__30_2_nnogat_T.stat |
07109     NO_BERGS_tracer__31_2_nnogat_T.stat |
07109     NO_BERGS_tracer__32_2_nnogat_T.stat-+
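
A grouping like the one in this table can be produced mechanically by hashing each output and collecting the names that share a checksum. The Python sketch below uses CRC32 as a stand-in (the tool behind the chksum column is not stated here), with hypothetical names and contents.

```python
from collections import defaultdict
import zlib


def group_by_checksum(named_contents):
    """Map checksum -> sorted names whose contents hash to it."""
    groups = defaultdict(list)
    for name, data in named_contents.items():
        groups[zlib.crc32(data)].append(name)
    return {k: sorted(v) for k, v in groups.items()}


# Hypothetical stand-ins for stat files, keyed by nproc and nn_hls
files = {
    "runstat_30_1": b"step 1 0.5\n",
    "runstat_32_1": b"step 1 0.5\n",
    "runstat_30_2": b"step 1 0.6\n",
}
groups = group_by_checksum(files)
```

Each resulting group then corresponds to one set of settings that produce bit-identical output.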

comment:80 Changed 2 weeks ago by acc

[13324] Trunk changes to achieve reproducibility of tracer.stat files in SETTE with nn_hls=2. There is still an untraced dependency on nn_hls in the tracer.stat values but REPRO_4_8 and REPRO_8_4 are now in agreement for each set of runs with each value of nn_hls (with ln_icebergs=.false.)

comment:81 Changed 2 weeks ago by acc

In 13327:

Final trunk change to obtain tracer.stat consistency and independence from nn_hls. The trunk is still restartable and reproducible with SETTE (ln_icebergs = F) AND run.stat and tracer.stat files match for both nn_hls = 1 and nn_hls = 2. This final piece was found by Seb last week but was missed from the commit. #2366

comment:82 Changed 2 weeks ago by acc

The logic in the previous commit is a little obscure because of the role of the mig and mjg arrays. These are no longer indices into a fixed size global array but refer to a global array whose size depends on nn_hls. It may be better to use the mig0 and mjg0 arrays (which always return (1,1) for the bottom left-hand corner of the input grid) but then precautions will be necessary against zero and negative indices that halo points can return. TBD.
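
The distinction can be sketched in Python, assuming the definitions mig(ji) = ji + nimpp - 1 and mig0(ji) = mig(ji) - nn_hls (1-based, as in Fortran). These formulas and the guard `in_input_grid` are assumptions made for illustration, not quoted from the code.

```python
def mig(ji, nimpp):
    """Global i-index in the halo-extended global array (assumed definition)."""
    return ji + nimpp - 1


def mig0(ji, nimpp, nn_hls):
    """Global i-index relative to the input grid (assumed definition)."""
    return mig(ji, nimpp) - nn_hls


def in_input_grid(i0, ni0glo):
    """Guard against the zero or negative indices that halo points return."""
    return 1 <= i0 <= ni0glo


# First inner point of the first subdomain (nimpp = 1) for several halo widths
first_inner = {h: (mig(h + 1, 1), mig0(h + 1, 1, h)) for h in (1, 2, 3)}
halo_index = mig0(1, 1, 2)        # outermost halo point when nn_hls = 2
```

Here mig of the first inner point moves with nn_hls while mig0 stays at 1, and the halo point maps to a non-positive index that must be filtered before it is used as an array index.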
