Opened 6 years ago

Closed 6 years ago

#1324 closed Bug (fixed)

Bug in mppini_2.h90 which can result in communication deadlock with the 8 x 4 domain decomposition

Reported by: epico
Owned by: nemo
Priority: low
Milestone:
Component: OCE
Version: release-3.6
Severity:
Keywords:
Cc:

Description

This bug was already reported in ticket #1057. The fixes applied in that ticket do not resolve the problem with the 8x4 domain decomposition for the ORCA2 configuration when land-only processes are removed.
The domain layout is as follows:

 ---------------------------------------
| 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 
 ---------------------------------------
| 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
 ---------------------------------------
|  8 |  9 | 10 | 11 | 12 | 13 | 14 | 15 | 
 ---------------------------------------
|  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 | 
 ---------------------------------------

In this decomposition, domains 24 and 31 are land-only processes.
The problem is in the mpp_init2 routine, specifically in the definition of the iono values for the northern domains:

         IF( jperio == 3 .OR. jperio == 4 ) THEN
            ijm1 = jpni*(jpnj-1)
            imil = ijm1+(jpni+1)/2
            IF( jarea > ijm1 ) ipolj(ii,ij) = 3
            IF( MOD(jpni,2) == 1 .AND. jarea == imil ) ipolj(ii,ij) = 4
            IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1 - 1   ! MPI rank of northern neighbour
         ENDIF
         IF( jperio == 5 .OR. jperio == 6 ) THEN
            ijm1 = jpni*(jpnj-1)
            imil = ijm1+(jpni+1)/2
            IF( jarea > ijm1) ipolj(ii,ij) = 5
            IF( MOD(jpni,2) == 1 .AND. jarea == imil ) ipolj(ii,ij) = 6
            IF( ipolj(ii,ij) == 5) iono(ii,ij) = jpni*jpnj-jarea+ijm1 - 1    ! MPI rank of northern neighbour
         ENDIF

For the configuration considered here, this code gives the following values for iono:
iono (8,4) = 23
iono (1,4) = 30

The correct values for iono should be:
iono (8,4) = 24
iono (1,4) = 31
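
The arithmetic behind these values can be reproduced with a minimal sketch. It is written in Python rather than Fortran purely for quick checking and is not part of mppini_2.h90. The helper names (`jarea_of`, `iono_north`) are hypothetical, and the row-major, 1-based numbering jarea = ii + (ij-1)*jpni is an assumption, chosen because it is consistent with the values quoted above.

```python
def jarea_of(ii, ij, jpni):
    """1-based domain number for column ii, row ij (assumed row-major numbering)."""
    return ii + (ij - 1) * jpni


def iono_north(jarea, jpni, jpnj, subtract_one):
    """Evaluate the iono expression from the ticket for a northern-row domain."""
    ijm1 = jpni * (jpnj - 1)  # number of domains below the northern row
    if jarea <= ijm1:
        raise ValueError("jarea is not on the northern row")
    iono = jpni * jpnj - jarea + ijm1
    return iono - 1 if subtract_one else iono


jpni, jpnj = 8, 4
for ii in (8, 1):
    jarea = jarea_of(ii, jpnj, jpni)
    with_minus = iono_north(jarea, jpni, jpnj, subtract_one=True)
    without_minus = iono_north(jarea, jpni, jpnj, subtract_one=False)
    print(f"iono({ii},{jpnj}): with -1 = {with_minus}, without -1 = {without_minus}")
```

Under these assumptions the expression with the `- 1` term yields 23 and 30, and the expression without it yields 24 and 31, matching the two sets of values listed above.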

Commit History (1)

Changeset  Author  Time                       ChangeLog
4647       epico   2014-05-23T08:54:53+02:00  bug fix for the definition of the iono values. see ticket #1324

Change History (2)

comment:1 Changed 6 years ago by acc

I can no longer agree with my own conclusion under #1057 that the iono calculation was wrong. There is a confusing mix of narea values and MPI ranks here, but it looks as if the original iono calculation (i.e. without the -1's) was returning the MPI ranks. The second part of the fix detailed in #1057 is needed and is hopefully sufficient. It certainly works in the 8x4, jpnij=30 case that is in question here.
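
The point about area numbers versus MPI ranks can be illustrated with a small check, offered only as a sketch. It assumes that area numbers are 1-based while MPI ranks are 0-based, that the north fold pairs column ii with column jpni-ii+1, and it ignores any renumbering of ranks after land-only processes are removed. The function names are hypothetical and do not come from the NEMO source.

```python
def mirror_rank(ii, jpni, jpnj):
    """0-based rank of the northern-row domain in the mirrored column."""
    mirror_ii = jpni - ii + 1                     # column paired across the fold (assumption)
    mirror_jarea = jpni * (jpnj - 1) + mirror_ii  # 1-based area number of that domain
    return mirror_jarea - 1                       # area number minus one gives the rank


def iono_without_minus_one(ii, jpni, jpnj):
    """The iono expression without the -1 term, for column ii of the northern row."""
    ijm1 = jpni * (jpnj - 1)
    jarea = ijm1 + ii                             # 1-based area number on the northern row
    return jpni * jpnj - jarea + ijm1


jpni, jpnj = 8, 4
matches = all(
    iono_without_minus_one(ii, jpni, jpnj) == mirror_rank(ii, jpni, jpnj)
    for ii in range(1, jpni + 1)
)
print(matches)
```

Under these assumptions the expression without the -1 already equals the 0-based rank of the mirrored domain for every column of the northern row, which is consistent with the reading that the original calculation returned MPI ranks.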

comment:2 Changed 6 years ago by mocavero

  • Resolution set to fixed
  • Status changed from new to closed

I agree the original iono calculation was right. With the iono calculation changed back to the form without the -1's, the 8x4, jpnij=30 case works well, so we can consider the bug fixed.
