
#1057 closed Bug (fixed)

Opened 11 years ago · Closed 11 years ago · Last modified 10 years ago

Bug in mppini_2.h90 which can result in communication deadlock with some partitioning (mainly evident at high processor counts)

Reported by: acc
Owned by: acc
Priority: low
Component: OCE
Version: v3.4

Description

There appears to be a small error in mppini_2.h90 which results in the wrong northern neighbour being identified for the northernmost row of processors. This is a slightly redundant calculation anyway, because the north-fold communications are dealt with separately and do not rely on the identified northern neighbour (nono). However, the northern neighbour is used to set the nbondj value, which determines whether a region communicates just to the north, both north and south, just to the south, or neither way. At very high processor counts it is possible to end up with regions on the jpnj-1 row which send to the north but whose northern neighbour has been assigned an nbondj value of 2 (neither way). This results in deadlock at the first lbc_lnk call (usually in iom_get, called by hgr_read), with the jpnj-1 row processor waiting for a message that is never sent.
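
To make the indexing concrete, here is a minimal, self-contained sketch (not taken from mppini_2.h90; the jpni and jpnj values are illustrative) of how an area number jarea maps to (column, row) coordinates and how nbondj would be set away from the north fold, using the convention implied by the ticket (2 = neither way; the land-elimination code quoted below implies -1 = north only, 0 = both, 1 = south only):

      ! Illustrative sketch only: regular decomposition without a north fold.
      PROGRAM area_indexing_sketch
         IMPLICIT NONE
         INTEGER, PARAMETER :: jpni = 4, jpnj = 3   ! hypothetical partitioning
         INTEGER :: jarea, ii, ij, inbondj
         DO jarea = 1, jpni*jpnj
            ii = 1 + MOD(jarea-1,jpni)              ! column index, 1..jpni
            ij = 1 +    (jarea-1)/jpni              ! row index,    1..jpnj
            IF( ij == 1 ) THEN
               inbondj = -1      ! bottom row: communicates north only
            ELSE IF( ij == jpnj ) THEN
               inbondj = 1       ! top row (no fold): communicates south only
            ELSE
               inbondj = 0       ! interior row: communicates both ways
            END IF
            WRITE(*,'(A,I3,A,2I3,A,I3)') ' rank', jarea-1,   &
               '  (ii,ij) =', ii, ij, '  nbondj =', inbondj
         END DO
      END PROGRAM area_indexing_sketch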

The error (TBC) appears to be in this block of code:

         ipolj(ii,ij) = 0
         IF( jperio == 3 .OR. jperio == 4 ) THEN
            ijm1 = jpni*(jpnj-1)
            imil = ijm1+(jpni+1)/2
            IF( jarea > ijm1 ) ipolj(ii,ij) = 3
            IF( MOD(jpni,2) == 1 .AND. jarea == imil ) ipolj(ii,ij) = 4
            IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1
         ENDIF

which applies a north-fold condition to identify the northern neighbour. I believe the error is that the iono values should be MPI process numbers, not the narea values as calculated. The iono array is referenced later during the elimination of land-only regions:

      DO jarea = 1, jpni*jpnj
         iproc = jarea-1
         ii = 1 + MOD(jarea-1,jpni)          ! column index of area jarea
         ij = 1 +    (jarea-1)/jpni          ! row index of area jarea
         ! land-only area whose northern neighbour lies in the valid rank range
         IF( ipproc(ii,ij) == -1 .AND. iono(ii,ij) >= 0   &
            .AND. iono(ii,ij) <= jpni*jpnj-1 ) THEN
            iino = 1 + MOD(iono(ii,ij),jpni)   ! neighbour column
            ijno = 1 +    (iono(ii,ij))/jpni   ! neighbour row
            IF( ibondj(iino,ijno) == 1 ) ibondj(iino,ijno) = 2    ! south only -> neither
            IF( ibondj(iino,ijno) == 0 ) ibondj(iino,ijno) = -1   ! both -> north only
         ENDIF
      END DO

and the mis-identification can lead to the problem described. The occurrence is rare (e.g. 1 process out of 9014 resulting from a 110x120 partitioning of ORCA_R12) but catastrophic and difficult to trace.
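
The decode arithmetic makes the suspected off-by-one concrete. The following worked example (hypothetical jpni = 4, jpnj = 3, so areas run 1..12 and MPI ranks 0..11; not part of the ticket) decodes the same northern neighbour twice: once as a true MPI rank, once as a narea value (rank + 1). The latter lands one column too far east and, for the last column of a row, wraps onto a row that does not exist. Note that comment:3 below later concludes the fold formula was in fact already producing rank values; the example only illustrates why the two numberings must not be mixed.

      PROGRAM iono_decode_check
         IMPLICIT NONE
         INTEGER, PARAMETER :: jpni = 4              ! illustrative column count
         INTEGER :: iono, iino, ijno
         iono = 11                          ! rank 11 = area 12 = (ii,ij) = (4,3)
         iino = 1 + MOD(iono,jpni)          ! -> 4  (correct column)
         ijno = 1 +    (iono)/jpni          ! -> 3  (correct row)
         PRINT *, 'decoded as rank :', iino, ijno
         iono = 12                          ! same area expressed as narea = rank+1
         iino = 1 + MOD(iono,jpni)          ! -> 1  (wrong column)
         ijno = 1 +    (iono)/jpni          ! -> 4  (row off the grid)
         PRINT *, 'decoded as narea:', iino, ijno
      END PROGRAM iono_decode_check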

Fortunately, if this diagnosis is correct, the solution is trivial: simply replace

           IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1

with

           IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1 - 1

Tests of this hypothesis are currently queued.

Commit History (2)

Changeset 3819 by acc, 2013-02-21T11:31:10+01:00
Branch dev_v3_4_STABLE_2012. #1057. Correct mppini_2.h90 logic concerning the northern neighbour across the north-fold.

Changeset 3818 by acc, 2013-02-21T11:30:14+01:00
Branch dev_MERGE_2012. #1057. Correct mppini_2.h90 logic concerning the northern neighbour across the north-fold.

Change History (3)

comment:1 Changed 11 years ago by acc

The proposed solution has partially resolved the issue, but a different pair of north-fold neighbours is now deadlocking. I think there is also a fault in the logic of the second code block, namely:

      DO jarea = 1, jpni*jpnj
         iproc = jarea-1
         ii = 1 + MOD(jarea-1,jpni)
         ij = 1 +    (jarea-1)/jpni
         IF( ipproc(ii,ij) == -1 .AND. iono(ii,ij) >= 0   &
            .AND. iono(ii,ij) <= jpni*jpnj-1 ) THEN
            iino = 1 + MOD(iono(ii,ij),jpni)
            ijno = 1 +    (iono(ii,ij))/jpni
            IF( ibondj(iino,ijno) == 1 ) ibondj(iino,ijno)=2
            IF( ibondj(iino,ijno) == 0 ) ibondj(iino,ijno) = -1
         ENDIF
      END DO

I read this as: "If you are a land-only area with an active north-neighbour, then disable the southward communication on the north-neighbour." However, on the jpnj row the north-neighbour communicates via its north interface, across the north-fold. For this row it is the northward communication that needs to be disabled. Tests are currently queued to try this solution:

      DO jarea = 1, jpni*jpnj
         iproc = jarea-1
         ii = 1 + MOD(jarea-1,jpni)
         ij = 1 +    (jarea-1)/jpni
         IF( ipproc(ii,ij) == -1 .AND. iono(ii,ij) >= 0   &
            .AND. iono(ii,ij) <= jpni*jpnj-1 ) THEN
            iino = 1 + MOD(iono(ii,ij),jpni)
            ijno = 1 +    (iono(ii,ij))/jpni
            ! Need to reverse the logical direction of communication
            ! for northern neighbours of northern row processors (north-fold)
            ! i.e. need to check that the northern neighbour only communicates
            ! to the SOUTH (or not at all) if this area is land-only
            idir = 1
            IF( ij == jpnj .AND. ijno == jpnj ) idir = -1
            IF( ibondj(iino,ijno) == idir ) ibondj(iino,ijno) = 2
            IF( ibondj(iino,ijno) == 0    ) ibondj(iino,ijno) = -idir
         ENDIF
      END DO

where idir is a local integer.
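
As a sanity check on this direction reversal, the following hypothetical standalone fragment (the jpnj value and program name are illustrative, not part of the submitted patch) exercises both cases: an interior northern neighbour of a land-only area has its southward communication disabled, while a top-row neighbour reached across the north-fold has its northward communication disabled instead:

      PROGRAM idir_check
         IMPLICIT NONE
         INTEGER, PARAMETER :: jpnj = 3   ! illustrative number of rows
         INTEGER :: ij, ijno, idir, ibnd
         ! Case 1: interior row, neighbour genuinely to the north
         ij = 1 ; ijno = 2 ; ibnd = 0     ! neighbour initially communicates both ways
         idir = 1
         IF( ij == jpnj .AND. ijno == jpnj ) idir = -1
         IF( ibnd == idir ) ibnd = 2
         IF( ibnd == 0    ) ibnd = -idir
         PRINT *, 'interior  : ibondj ->', ibnd   ! -1 = north only (south disabled)
         ! Case 2: top row, "northern" neighbour reached across the fold
         ij = jpnj ; ijno = jpnj ; ibnd = 0
         idir = 1
         IF( ij == jpnj .AND. ijno == jpnj ) idir = -1
         IF( ibnd == idir ) ibnd = 2
         IF( ibnd == 0    ) ibnd = -idir
         PRINT *, 'north-fold: ibondj ->', ibnd   !  1 = south only (north disabled)
      END PROGRAM idir_check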

comment:2 Changed 11 years ago by acc

  • Resolution set to fixed
  • Status changed from new to closed

Successfully tested with various processor counts up to 12,000 cores. No recurrence of the deadlocking problem. Changes submitted to both the dev_MERGE_2012 and dev_v3_4_STABLE_2012 branches. Closing ticket.

comment:3 Changed 10 years ago by acc

The problem of deadlocking has been revisited at version 3.6 (#1324). It looks as if the original conclusion regarding the iono values at the northern row was incorrect: the -1 offsets are not required, since the calculation was already returning the correct MPI rank values.
The second part of this solution, regarding the ibondj settings, is probably sufficient in itself to resolve the issue; this is the preliminary conclusion of #1324.
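
A short check is consistent with this revised conclusion. Assuming the north fold pairs top-row column ii with column jpni-ii+1 (an assumption of this sketch, not stated in the ticket), the mirrored area is ijm1+(jpni-ii+1) and its MPI rank is ijm1+jpni-ii; the original expression jpni*jpnj-jarea+ijm1, with jarea = ijm1+ii on the top row, reduces to exactly the same value, i.e. it already yields a rank:

      PROGRAM fold_rank_check
         IMPLICIT NONE
         INTEGER, PARAMETER :: jpni = 4, jpnj = 3   ! hypothetical partitioning
         INTEGER :: ii, jarea, ijm1, iformula, irank
         ijm1 = jpni*(jpnj-1)
         DO ii = 1, jpni
            jarea    = ijm1 + ii                    ! top-row area at column ii
            iformula = jpni*jpnj - jarea + ijm1     ! value stored in iono
            irank    = ijm1 + (jpni-ii+1) - 1       ! rank of the mirrored area
            PRINT *, 'ii =', ii, '  formula =', iformula, '  mirror rank =', irank
         END DO
      END PROGRAM fold_rank_check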
