Opened 5 months ago

Closed 4 months ago

Last modified 4 months ago

#2381 closed Bug (fixed)

Faulty option 1 of the freshwater budget adjustment mechanism (`nn_fwb = 1`, module `sbcfwb`) when both `key_mpp_mpi` and `key_mpi2` are defined

Reported by: smueller
Owned by: systeam
Priority: low
Milestone:
Component: SBC
Version: trunk
Severity: minor
Keywords: fwb, lib_mpp
Cc:

Description

Context

After changing the freshwater budget correction to option 1 in a variant of the ORCA2_ICE_PISCES reference configuration (OCE without icebergs, ICE unchanged, TOP excluded), model outputs from runs which differ in the restart-file output frequency (different values of nn_stock) started to diverge following the first output of a restart file. A test of restartability, however, was found to be successful.

Analysis

An inspection of the code related to restart-file output indicated that subroutine mpp_delay_rcv of module lib_mpp (called indirectly by subroutine rst_write of module restart via subroutine iom_delay_rst of module iom) could indirectly affect the model forcing during the output of restart files: it enforces the otherwise delayed evaluation of the freshwater budget in an MPI context (key_mpp_mpi was defined when the problem occurred) and could thus reveal an error elsewhere. Further analysis revealed an incorrect implementation of the fallback option for MPI-2 compatibility (key_mpi2 was defined when the problem occurred). An incorrect argument list in the substitute mpi_allreduce calls leads to the storage of inappropriate request-id values in variable ndelayid and, as a result, to subroutine mpp_delay_rcv not being called in the subsequent execution of the delayed operation. In the case of mpp_delay_sum, this omission results in the received value not being propagated to the output variable of the subroutine, even if the value has been received in the meantime. The exceptions are the first time step and the time steps that output restart files, during which subroutine mpp_delay_rcv is called irrespective of the request-id value; as a result, the freshwater budget remains uncontrolled between these time steps.
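The failure mode described above can be illustrated with a toy model that uses no MPI at all. This is only a sketch under stated assumptions, not the code of lib_mpp: the program name, the variables zrecv, zkept and llfix, and the stray request-id value 0 are hypothetical, and the only behaviour taken from this ticket is that the receive step runs when the request id is positive.

```fortran
PROGRAM delay_sketch
   ! Toy model (no MPI) of a delayed global operation.
   ! All names except ndelayid are hypothetical and chosen for illustration.
   IMPLICIT NONE
   INTEGER, PARAMETER :: wp = KIND(1.0d0)
   INTEGER  :: ndelayid        ! request id; a positive value triggers the receive step
   REAL(wp) :: zrecv, zkept    ! receive buffer and value handed back to the caller
   REAL(wp) :: zout
   INTEGER  :: jt
   LOGICAL  :: llfix

   llfix = .FALSE.             ! set to .TRUE. to mimic the corrected behaviour
   ndelayid = -1 ; zrecv = 0._wp ; zkept = 0._wp
   DO jt = 1, 4
      CALL delay_sum( REAL(jt, wp), zout )
      PRINT *, 'time step', jt, ': delayed value returned =', zout
   END DO

CONTAINS

   SUBROUTINE delay_sum( pin, pout )
      REAL(wp), INTENT(in   ) :: pin
      REAL(wp), INTENT(  out) :: pout
      ! stands in for the conditional call of mpp_delay_rcv
      IF( ndelayid > 0 )   zkept = zrecv
      pout  = zkept            ! value made available by the previous call
      zrecv = pin              ! stands in for the blocking mpi_allreduce
      IF( llfix ) THEN
         ndelayid = 1          ! positive value: receive step runs at the next call
      ELSE
         ndelayid = 0          ! assumed stray, non-positive value left by the faulty call
      ENDIF
   END SUBROUTINE delay_sum

END PROGRAM delay_sketch
```

With llfix = .FALSE. the value returned by delay_sum never changes from its initial value, which mirrors the uncontrolled freshwater budget; with llfix = .TRUE. each call returns the value submitted at the previous call.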

Fix

The argument list of the mpi_allreduce calls that substitute mpi_iallreduce calls when key_mpi2 is defined should be corrected, and, following such mpi_allreduce calls, ndelayid(idvar) could be assigned a positive value to enable the expected mpp_delay_rcv calls at each time step:

  • src/OCE/LBC/lib_mpp.F90

    @@ -401,7 +401,8 @@
           ! send y_in into todelay(idvar)%y1d with a non-blocking communication
     # if defined key_mpi2
           IF( ln_timing ) CALL tic_tac( .TRUE., ld_global = .TRUE.)
    -      CALL  mpi_allreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ndelayid(idvar), ierr )
    +      CALL  mpi_allreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ierr )
    +      ndelayid(idvar) = 1
           IF( ln_timing ) CALL tic_tac(.FALSE., ld_global = .TRUE.)
     # else
           CALL mpi_iallreduce( y_in(:), todelay(idvar)%y1d(:), isz, MPI_DOUBLE_COMPLEX, mpi_sumdd, ilocalcomm, ndelayid(idvar), ierr )
    @@ -468,7 +469,8 @@
           ! send p_in into todelay(idvar)%z1d with a non-blocking communication
     # if defined key_mpi2
           IF( ln_timing ) CALL tic_tac( .TRUE., ld_global = .TRUE.)
    -      CALL  mpi_allreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_DOUBLE_PRECISION, mpi_max, ilocalcomm, ndelayid(idvar), ierr )
    +      CALL  mpi_allreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_DOUBLE_PRECISION, mpi_max, ilocalcomm, ierr )
    +      ndelayid(idvar) = 1
           IF( ln_timing ) CALL tic_tac(.FALSE., ld_global = .TRUE.)
     # else
           CALL mpi_iallreduce( p_in(:), todelay(idvar)%z1d(:), isz, MPI_DOUBLE_PRECISION, mpi_max, ilocalcomm, ndelayid(idvar), ierr )

Commit History (2)

Changeset: 12518
Author: smueller
Time: 2020-03-05T17:01:00+01:00
ChangeLog:

Correction of the fallback option for MPI-2 compatibility in delayed global MPP operations (ticket #2382)

This changeset is the merging of changeset [12512] (see ticket #2381) into the release-4.0-HEAD version of NEMO.

Changeset: 12512
Author: smueller
Time: 2020-03-05T13:17:12+01:00
ChangeLog:

Correction of the fallback option for MPI-2 compatibility in delayed global MPP operations (ticket #2381)

Change History (3)

comment:1 Changed 4 months ago by smueller

In 12512:

Correction of the fallback option for MPI-2 compatibility in delayed global MPP operations (ticket #2381)

comment:2 Changed 4 months ago by smueller

  • Resolution set to fixed
  • Status changed from new to closed

A variant of the ORCA2_ICE_PISCES restartability test with defined key_mpi2, nn_fwb = 1 in both LONG and SHORT, nn_stock = 496 in LONG, nn_stock = 248 in SHORT, and disabled PISCES component fails just before and succeeds after application of [12512].

Further, source:/NEMO/trunk@12512 passes SETTE.

Last edited 4 months ago by smueller

comment:3 Changed 4 months ago by smueller

In 12518:

Correction of the fallback option for MPI-2 compatibility in delayed global MPP operations (ticket #2382)

This changeset is the merging of changeset [12512] (see ticket #2381) into the release-4.0-HEAD version of NEMO.
