Opened 3 years ago
Closed 3 years ago
#2456 closed Bug (fixed)
model does not stop properly in stpctl
Reported by: | smasson | Owned by: | systeam |
---|---|---|---|
Priority: | low | Milestone: | |
Component: | MULTIPLE | Version: | v4.0.* |
Severity: | minor | Keywords: | |
Cc: |
Description
Context
Same as #2418 but for the r4.0-HEAD
Analysis
see #2418
Recommendation
In order to limit the modifications done in the r4.0-HEAD:
- I carried over only the necessary modifications to OCE/stpctl.F90 (without the cleaning and the optimisations).
- I made only the minimum changes in SAS/stpctl.F90, without adding a check of the min/max in this routine.
Commit History (4)
Changeset | Author | Time | ChangeLog |
---|---|---|---|
13137 | smasson | 2020-06-22T08:29:57+02:00 | r4.0-HEAD: fix maxval values on land subdomains for stpctl, see #2456 |
13116 | smasson | 2020-06-16T21:15:19+02:00 | |
13013 | smasson | 2020-06-03T10:33:06+02:00 | |
12859 | smasson | 2020-05-03T11:33:32+02:00 | |
Change History (11)
comment:1 Changed 3 years ago by smasson
comment:2 Changed 3 years ago by smasson
- Resolution set to fixed
- Status changed from new to closed
fixed in [12859]
[12859] passes all SETTE tests and gives the same results as r4.0-HEAD@12857
Current code is : NEMO/releases/r4.0/r4.0-HEAD @ r12859 ( last change @ r12859 )
SETTE validation report generated for :
NEMO/releases/r4.0/r4.0-HEAD @ r12859 (last changed revision)
on X64_JEANZAY arch file
!!---------------1st pass------------------!!
!----restart----!
WGYRE_PISCES_ST run.stat restartability passed : 12859
WGYRE_PISCES_ST tracer.stat restartability passed : 12859
WORCA2_ICE_PISCES_ST run.stat restartability passed : 12859
WORCA2_ICE_PISCES_ST tracer.stat restartability passed : 12859
WORCA2_OFF_PISCES_ST tracer.stat restartability passed : 12859
WAMM12_ST run.stat restartability passed : 12859
WORCA2_SAS_ICE_ST run.stat restartability passed : 12859
WAGRIF_DEMO_ST run.stat restartability passed : 12859
WSPITZ12_ST run.stat restartability passed : 12859
WISOMIP_ST run.stat restartability passed : 12859
WOVERFLOW_ST run.stat restartability passed : 12859
WLOCK_EXCHANGE_ST run.stat restartability passed : 12859
WVORTEX_ST run.stat restartability passed : 12859
WICE_AGRIF_ST run.stat restartability passed : 12859
!----repro----!
WGYRE_PISCES_ST run.stat reproducibility passed : 12859
WGYRE_PISCES_ST tracer.stat reproducibility passed : 12859
WORCA2_ICE_PISCES_ST run.stat reproducibility passed : 12859
WORCA2_ICE_PISCES_ST tracer.stat reproducibility passed : 12859
WORCA2_OFF_PISCES_ST tracer.stat reproducibility passed : 12859
WAMM12_ST run.stat reproducibility passed : 12859
WORCA2_SAS_ICE_ST run.stat reproducibility passed : 12859
WORCA2_ICE_OBS_ST run.stat reproducibility passed : 12859
WAGRIF_DEMO_ST run.stat reproducibility passed : 12859
WSPITZ12_ST run.stat reproducibility passed : 12859
WISOMIP_ST run.stat reproducibility passed : 12859
WVORTEX_ST run.stat reproducibility passed : 12859
WICE_AGRIF_ST run.stat reproducibility passed : 12859
!----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat unchanged - passed : 12859 12859
!----result comparison check----!
check result differences between :
VALID directory : /gpfsscratch/rech/fqx/reee217/r4.0-HEAD/NEMO_VALIDATION at rev 12859
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_VALIDATION/r4.0 at rev 12857
WGYRE_PISCES_ST run.stat files are identical
WGYRE_PISCES_ST tracer.stat files are identical
WORCA2_ICE_PISCES_ST run.stat files are identical
WORCA2_ICE_PISCES_ST tracer.stat files are identical
WORCA2_OFF_PISCES_ST tracer.stat files are identical
WAMM12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WORCA2_SAS_ICE_ST run.stat files are identical
WAGRIF_DEMO_ST run.stat files are identical
WSPITZ12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WVORTEX_ST run.stat files are identical
WICE_AGRIF_ST run.stat files are identical
!!---------------2nd pass------------------!!
!----restart----!
!----repro----!
!----agrif check----!
!----result comparison check----!
check result differences between :
VALID directory : /gpfsscratch/rech/fqx/reee217/r4.0-HEAD/NEMO_VALIDATION at rev 12859
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_VALIDATION/r4.0 at rev 12857
comment:3 Changed 3 years ago by smasson
- Resolution fixed deleted
- Status changed from closed to reopened
comment:4 Changed 3 years ago by smasson
In 13013:
comment:5 Changed 3 years ago by smasson
- Resolution set to fixed
- Status changed from reopened to closed
fixed in [13013]
[13013] passes all SETTE tests and gives the same results as r4.0-HEAD@12926
Current code is : NEMO/releases/r4.0/r4.0-HEAD @ r13013 ( last change @ r13013 )
SETTE validation report generated for :
NEMO/releases/r4.0/r4.0-HEAD @ r13013 (last changed revision)
on X64_JEANZAY arch file
!!---------------1st pass------------------!!
!----restart----!
WGYRE_PISCES_ST run.stat restartability passed : 13013
WGYRE_PISCES_ST tracer.stat restartability passed : 13013
WORCA2_ICE_PISCES_ST run.stat restartability passed : 13013
WORCA2_ICE_PISCES_ST tracer.stat restartability passed : 13013
WORCA2_OFF_PISCES_ST tracer.stat restartability passed : 13013
WAMM12_ST run.stat restartability passed : 13013
WORCA2_SAS_ICE_ST run.stat restartability passed : 13013
WAGRIF_DEMO_ST run.stat restartability passed : 13013
WSPITZ12_ST run.stat restartability passed : 13013
WISOMIP_ST run.stat restartability passed : 13013
WOVERFLOW_ST run.stat restartability passed : 13013
WLOCK_EXCHANGE_ST run.stat restartability passed : 13013
WVORTEX_ST run.stat restartability passed : 13013
WICE_AGRIF_ST run.stat restartability passed : 13013
!----repro----!
WGYRE_PISCES_ST run.stat reproducibility passed : 13013
WGYRE_PISCES_ST tracer.stat reproducibility passed : 13013
WORCA2_ICE_PISCES_ST run.stat reproducibility passed : 13013
WORCA2_ICE_PISCES_ST tracer.stat reproducibility passed : 13013
WORCA2_OFF_PISCES_ST tracer.stat reproducibility passed : 13013
WAMM12_ST run.stat reproducibility passed : 13013
WORCA2_SAS_ICE_ST run.stat reproducibility passed : 13013
WORCA2_ICE_OBS_ST run.stat reproducibility passed : 13013
WAGRIF_DEMO_ST run.stat reproducibility passed : 13013
WSPITZ12_ST run.stat reproducibility passed : 13013
WISOMIP_ST run.stat reproducibility passed : 13013
WVORTEX_ST run.stat reproducibility passed : 13013
WICE_AGRIF_ST run.stat reproducibility passed : 13013
!----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat unchanged - passed : 13013 13013
!----result comparison check----!
check result differences between :
VALID directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 13013
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 12926
WGYRE_PISCES_ST run.stat files are identical
WGYRE_PISCES_ST tracer.stat files are identical
WORCA2_ICE_PISCES_ST run.stat files are identical
WORCA2_ICE_PISCES_ST tracer.stat files are identical
WORCA2_OFF_PISCES_ST tracer.stat files are identical
WAMM12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WORCA2_SAS_ICE_ST run.stat files are identical
WAGRIF_DEMO_ST run.stat files are identical
WSPITZ12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WVORTEX_ST run.stat files are identical
WICE_AGRIF_ST run.stat files are identical
comment:6 Changed 3 years ago by smasson
- Resolution fixed deleted
- Status changed from closed to reopened
Same as #2418: in stpctl, if
- nstop > 0 when entering the routine,
- and we don't do collective communication,
- and no other errors are found in the tests on min/max values,
=> we won't call ctl_stop and, once stpctl exits, some processes will have nstop > 0 while others won't.
This creates an MPI deadlock.
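The mismatch described above can be illustrated with a toy emulation in Python (no MPI involved). Each list entry stands for the local nstop of one process. The function names and the example values are hypothetical, and the max-reduction shown is only one common way to make all processes agree. This ticket does not state that [13116] does exactly this.

```python
# Toy emulation (no MPI): each list entry is one process's local nstop.

def decide_locally(nstop_per_rank):
    """Each rank decides to stop from its own nstop only (the problematic case)."""
    return [nstop > 0 for nstop in nstop_per_rank]


def decide_collectively(nstop_per_rank):
    """Every rank uses the global maximum, as a max-reduction across ranks would give."""
    global_nstop = max(nstop_per_rank)  # stands in for an MPI max all-reduce
    return [global_nstop > 0 for _ in nstop_per_rank]


# Hypothetical situation: only rank 2 entered the routine with an error flagged.
nstop_per_rank = [0, 0, 1, 0]

local = decide_locally(nstop_per_rank)
collective = decide_collectively(nstop_per_rank)

# With local decisions the ranks disagree: the ranks that continue would wait
# in their next collective call for a rank that has already stopped.
ranks_disagree = len(set(local)) > 1

# After the reduction every rank takes the same decision.
ranks_agree_after_reduction = len(set(collective)) == 1
```

In this emulation `local` is mixed (one rank stops, three continue), while `collective` is uniform, which is the property needed to avoid the deadlock.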
comment:7 Changed 3 years ago by smasson
In 13116:
comment:8 Changed 3 years ago by smasson
- Resolution set to fixed
- Status changed from reopened to closed
[13116] passes all SETTE tests and gives the same results as r4.0-HEAD@13095
Current code is : NEMO/releases/r4.0/r4.0-HEAD @ r13116 ( last change @ r13116 )
SETTE validation report generated for :
NEMO/releases/r4.0/r4.0-HEAD @ r13116 (last changed revision)
on X64_IRENE arch file
!!---------------1st pass------------------!!
!----restart----!
WGYRE_PISCES_ST run.stat restartability passed : 13116
WGYRE_PISCES_ST tracer.stat restartability passed : 13116
WORCA2_ICE_PISCES_ST run.stat restartability passed : 13116
WORCA2_ICE_PISCES_ST tracer.stat restartability passed : 13116
WORCA2_OFF_PISCES_ST tracer.stat restartability passed : 13116
WAMM12_ST run.stat restartability passed : 13116
WORCA2_SAS_ICE_ST run.stat restartability passed : 13116
WAGRIF_DEMO_ST run.stat restartability passed : 13116
WSPITZ12_ST run.stat restartability passed : 13116
WISOMIP_ST run.stat restartability passed : 13116
WOVERFLOW_ST run.stat restartability passed : 13116
WLOCK_EXCHANGE_ST run.stat restartability passed : 13116
WVORTEX_ST run.stat restartability passed : 13116
WICE_AGRIF_ST run.stat restartability passed : 13116
!----repro----!
WGYRE_PISCES_ST run.stat reproducibility passed : 13116
WGYRE_PISCES_ST tracer.stat reproducibility passed : 13116
WORCA2_ICE_PISCES_ST run.stat reproducibility passed : 13116
WORCA2_ICE_PISCES_ST tracer.stat reproducibility passed : 13116
WORCA2_OFF_PISCES_ST tracer.stat reproducibility passed : 13116
WAMM12_ST run.stat reproducibility passed : 13116
WORCA2_SAS_ICE_ST run.stat reproducibility passed : 13116
WORCA2_ICE_OBS_ST run.stat reproducibility passed : 13116
WAGRIF_DEMO_ST run.stat reproducibility passed : 13116
WSPITZ12_ST run.stat reproducibility passed : 13116
WISOMIP_ST run.stat reproducibility passed : 13116
WVORTEX_ST run.stat reproducibility passed : 13116
WICE_AGRIF_ST run.stat reproducibility passed : 13116
!----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat unchanged - passed : 13116 13116
!----result comparison check----!
check result differences between :
VALID directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 13116
and
REFERENCE directory : /ccc/work/cont005/ra0542/massons/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 13095
WGYRE_PISCES_ST run.stat files are identical
WGYRE_PISCES_ST tracer.stat files are identical
WORCA2_ICE_PISCES_ST run.stat files are identical
WORCA2_ICE_PISCES_ST tracer.stat files are identical
WORCA2_OFF_PISCES_ST tracer.stat files are identical
WAMM12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WORCA2_SAS_ICE_ST run.stat files are identical
WAGRIF_DEMO_ST run.stat files are identical
WSPITZ12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WVORTEX_ST run.stat files are identical
WICE_AGRIF_ST run.stat files are identical
comment:9 Changed 3 years ago by smasson
- Resolution fixed deleted
- Status changed from closed to reopened
Same as #2418:
MAXVAL with a mask can return -HUGE on land-only processors (where every mask element is false).
When sn_cfctl%l_runstat = F, these values make the infinity test evaluate to true:
ABS( zmax(1) + zmax(2) + zmax(3) ) > HUGE(1._wp)
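The false alarm can be reproduced with a small Python emulation of the Fortran behaviour. Here `sys.float_info.max` plays the role of HUGE(1._wp) for double precision, and `masked_maxval` is a hypothetical helper that mimics MAXVAL with an all-false mask. The guard at the end is only one possible remedy, shown as an assumption. The ticket says only that [13137] fixes the maxval values on land subdomains, not how.

```python
import sys

# Stand-in for Fortran's HUGE(1._wp) in double precision.
HUGE = sys.float_info.max


def masked_maxval(values, mask):
    """Emulate MAXVAL(array, mask=...): return -HUGE when no element is selected."""
    selected = [v for v, m in zip(values, mask) if m]
    return max(selected) if selected else -HUGE


# Hypothetical land-only subdomain: every mask entry is False.
land_values = [0.0, 0.0, 0.0]
land_mask = [False, False, False]

# Three masked maxima, as in the test quoted above.
zmax = [masked_maxval(land_values, land_mask) for _ in range(3)]

# Summing three -HUGE values overflows to -infinity ...
total = zmax[0] + zmax[1] + zmax[2]

# ... so the infinity test fires although no field actually blew up.
false_alarm = abs(total) > HUGE

# One possible guard (an assumption, not necessarily what [13137] does):
# treat -HUGE as "no ocean point here" and replace it with 0.
zmax_guarded = [0.0 if z == -HUGE else z for z in zmax]
alarm_after_guard = abs(sum(zmax_guarded)) > HUGE
```

With the guard in place the test no longer triggers on a subdomain that contains no ocean points.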
comment:10 Changed 3 years ago by smasson
In 13137:
comment:11 Changed 3 years ago by smasson
- Resolution set to fixed
- Status changed from reopened to closed
[13137] fixes the problem. It passes all SETTE tests and gives the same results as [13095]
SETTE validation report generated for :
@ r13137 (last changed revision)
on X64_JEANZAY arch file
!!---------------1st pass------------------!!
!----restart----!
WGYRE_PISCES_ST run.stat restartability passed : 13137
WGYRE_PISCES_ST tracer.stat restartability passed : 13137
WORCA2_ICE_PISCES_ST run.stat restartability passed : 13137
WORCA2_ICE_PISCES_ST tracer.stat restartability passed : 13137
WORCA2_OFF_PISCES_ST tracer.stat restartability passed : 13137
WAMM12_ST run.stat restartability passed : 13137
WORCA2_SAS_ICE_ST run.stat restartability passed : 13137
WAGRIF_DEMO_ST run.stat restartability passed : 13137
WSPITZ12_ST run.stat restartability passed : 13137
WISOMIP_ST run.stat restartability passed : 13137
WOVERFLOW_ST run.stat restartability passed : 13137
WLOCK_EXCHANGE_ST run.stat restartability passed : 13137
WVORTEX_ST run.stat restartability passed : 13137
WICE_AGRIF_ST run.stat restartability passed : 13137
!----repro----!
WGYRE_PISCES_ST run.stat reproducibility passed : 13137
WGYRE_PISCES_ST tracer.stat reproducibility passed : 13137
WORCA2_ICE_PISCES_ST run.stat reproducibility passed : 13137
WORCA2_ICE_PISCES_ST tracer.stat reproducibility passed : 13137
WORCA2_OFF_PISCES_ST tracer.stat reproducibility passed : 13137
WAMM12_ST run.stat reproducibility passed : 13137
WORCA2_SAS_ICE_ST run.stat reproducibility passed : 13137
WORCA2_ICE_OBS_ST run.stat reproducibility passed : 13137
WAGRIF_DEMO_ST run.stat reproducibility passed : 13137
WSPITZ12_ST run.stat reproducibility passed : 13137
WISOMIP_ST run.stat reproducibility passed : 13137
WVORTEX_ST run.stat reproducibility passed : 13137
WICE_AGRIF_ST run.stat reproducibility passed : 13137
!----agrif check----!
ORCA2 AGRIF vs ORCA2 NOAGRIF run.stat unchanged - passed : 13137 13137
!----result comparison check----!
check result differences between :
VALID directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 13137
and
REFERENCE directory : /gpfswork/rech/fqx/reee217/NEMO_ALL_VALIDATIONS/r4.0-HEAD/NEMO_VALIDATION at rev 13095
WGYRE_PISCES_ST run.stat files are identical
WGYRE_PISCES_ST tracer.stat files are identical
WORCA2_ICE_PISCES_ST run.stat files are identical
WORCA2_ICE_PISCES_ST tracer.stat files are identical
WORCA2_OFF_PISCES_ST tracer.stat files are identical
WAMM12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WORCA2_SAS_ICE_ST run.stat files are identical
WAGRIF_DEMO_ST run.stat files are identical
WSPITZ12_ST run.stat files are identical
WISOMIP_ST run.stat files are identical
WVORTEX_ST run.stat files are identical
WICE_AGRIF_ST run.stat files are identical
In 12859: