Custom Query (2547 matches)
Results (31 - 33 of 2547)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#1020 | fixed | New branch for the merge of NOC and Mercator 2012 developments | nemo | acc |
Description |
First stage of the 2012 merge party: Merge NOC and Mercator developments. Details to appear on: |
|||
#1040 | fixed | NaNs originating in sbcdcy (diurnal cycle) at some points in space and time | nemo | acc |
Description |
Affects runs with ln_dm2dc = .true. The algorithm in sbcdcy can fail in some circumstances. In particular, I have points at Arctic circle latitudes which return -Infinity values for qsr at the winter solstice despite valid inputs. At the point and time in question, dusk and dawn are almost coincident, which leads to infinite values of rscal.

This program emulates the problem using values written out from sbcdcy:

    program tests
      REAL(KIND=8) :: fintegral, pt1, pt2, paaa, pbbb, pccc, ztwopi, zinvtwopi, zconvrad
      REAL(KIND=8) :: raa, rbb, rcc, rdawn, rdusk
      REAL(KIND=8) :: gphit, zdecrad, zdsws
      INTEGER :: nday
      fintegral( pt1, pt2, paaa, pbbb, pccc ) =                         &
         &   paaa * pt2 + zinvtwopi * pbbb * SIN(pccc + ztwopi * pt2)   &
         & - paaa * pt1 - zinvtwopi * pbbb * SIN(pccc + ztwopi * pt1)
      ztwopi    = 2.d0 * 4.d0*ATAN(1.d0)
      zinvtwopi = 1.d0 / ztwopi
      zconvrad  = ztwopi / 360.d0
      !
      ! Point near the Arctic circle in December
      ! values written out from sbcdcy.F90 for this point
      !
      gphit   =  6.6500000000000000E+01
      zdecrad = -4.1015237421866746E-01
      zdsws   =  3.6500000000000000E+02
      nday    = 20
      !
      rdawn =  4.8254282104903723E-01
      rdusk =  4.8254282579222396E-01
      raa   = -3.6567685080958523E-01
      rbb   =  3.6567685080958529E-01
      rcc   = -3.0319059782014595E+00
      !
      write(6,*) 'RSCAL is ' , 1.d0/fintegral(rdawn, rdusk, raa, rbb, rcc) ! Infinity!!
    end program tests

I can avoid the problem by checking for very short days (sbcdcy.F90):

    ! 2.2 Compute the scaling function:
    ! S* = the inverse of the time integral of the diurnal cycle from dawn to dusk
    DO jj = 1, jpj
       DO ji = 1, jpi
          IF ( ABS(rab(ji,jj)) < 1._wp ) THEN ! day duration is less than 24h
             rscal(ji,jj) = 0.0d0
             IF ( rdawn(ji,jj) < rdusk(ji,jj) ) THEN ! day time in one part
                IF( (rdusk(ji,jj) - rdawn(ji,jj) ) .ge. 0.001_wp ) THEN ! bugfix
                   rscal(ji,jj) = fintegral(rdawn(ji,jj), rdusk(ji,jj), raa(ji,jj), rbb(ji,jj), rcc(ji,jj))
                   rscal(ji,jj) = 1._wp / rscal(ji,jj)
                ENDIF ! bugfix
             ELSE ! day time in two parts
                IF( (rdusk(ji,jj) + (1._wp - rdawn(ji,jj)) ) .ge. 0.001_wp ) THEN ! bugfix
                   rscal(ji,jj) = fintegral(0._wp, rdusk(ji,jj), raa(ji,jj), rbb(ji,jj), rcc(ji,jj)) &
                      &         + fintegral(rdawn(ji,jj), 1._wp, raa(ji,jj), rbb(ji,jj), rcc(ji,jj))
                   rscal(ji,jj) = 1. / rscal(ji,jj)
                ENDIF ! bugfix
             ENDIF
          ELSE
             IF ( raa(ji,jj) > rbb(ji,jj) ) THEN ! 24h day
                rscal(ji,jj) = fintegral(0._wp, 1._wp, raa(ji,jj), rbb(ji,jj), rcc(ji,jj))
                rscal(ji,jj) = 1._wp / rscal(ji,jj)
             ELSE ! No day
                rscal(ji,jj) = 0.0_wp
             ENDIF
          ENDIF
       END DO
    END DO

but this may not be the best solution? |
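For readers without a Fortran compiler to hand, the failure and the proposed guard can be reproduced with a few lines of Python. This is only a sketch, not NEMO code: `guarded_rscal` and `min_day` are hypothetical names, and it covers just the "day time in one part" branch. The numeric inputs are the ones quoted in the ticket. Because rdawn and rdusk appear to be fractions of a day (the two-part branch uses `1._wp - rdawn`), a threshold of 0.001 would correspond to roughly 86 seconds of daylight; that reading is an assumption.

```python
import math

TWO_PI = 2.0 * math.pi
INV_TWO_PI = 1.0 / TWO_PI


def fintegral(t1, t2, a, b, c):
    """Integral of a + b*cos(c + 2*pi*t) from t1 to t2 (same form as the statement function)."""
    return (a * t2 + INV_TWO_PI * b * math.sin(c + TWO_PI * t2)
            - a * t1 - INV_TWO_PI * b * math.sin(c + TWO_PI * t1))


def guarded_rscal(rdawn, rdusk, raa, rbb, rcc, min_day=1.0e-3):
    """Hypothetical helper: one-part daytime case with the short-day check from the ticket."""
    if rdusk - rdawn < min_day:
        return 0.0  # day too short: skip the inversion, as the proposed bugfix does
    return 1.0 / fintegral(rdawn, rdusk, raa, rbb, rcc)


# Values written out from sbcdcy.F90, as quoted in the ticket
rdawn = 4.8254282104903723e-01
rdusk = 4.8254282579222396e-01
raa = -3.6567685080958523e-01
rbb = 3.6567685080958529e-01
rcc = -3.0319059782014595e+00

print("day length (fraction of a day):", rdusk - rdawn)
print("integral from dawn to dusk    :", fintegral(rdawn, rdusk, raa, rbb, rcc))
print("guarded rscal                 :", guarded_rscal(rdawn, rdusk, raa, rbb, rcc))
```

The integral is the difference of terms that agree to many digits, so in double precision it collapses to zero or to rounding noise, and inverting it is meaningless either way. Note that Python raises ZeroDivisionError on an exact zero rather than returning Infinity as Fortran does.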
|||
#1057 | fixed | Bug in mppini_2.h90 which can result in communication deadlock with some partitioning (mainly evident at high processor counts) | acc | acc |
Description |
There appears to be a small error in mppini_2.h90 which results in the wrong northern neighbour being identified for the northernmost row of processors. This is a slightly redundant calculation anyway, because the north-fold communications are dealt with separately and do not rely on the identified northern neighbour (nono). However, the northern neighbour is used to set the nbondj value, which determines whether a region communicates just to the north, both north and south, just to the south, or neither way. At very high processor counts it is possible to end up with regions on the jpnj-1 row which send to the north but whose northern neighbour has been assigned a nbondj value of 2 (neither way). This results in deadlock at the first lbc_lnk call (usually in iom_get called by hgr_read), with the jpnj-1 row processor waiting for a message that is never sent.

The error (TBC) appears to be in this block of code:

    ipolj(ii,ij) = 0
    IF( jperio == 3 .OR. jperio == 4 ) THEN
       ijm1 = jpni*(jpnj-1)
       imil = ijm1+(jpni+1)/2
       IF( jarea > ijm1 ) ipolj(ii,ij) = 3
       IF( MOD(jpni,2) == 1 .AND. jarea == imil ) ipolj(ii,ij) = 4
       IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1
    ENDIF

which applies a north-fold condition to identify the northern neighbour. I believe the error is that the iono values should be MPI process numbers, not the narea values as calculated. The iono array is referenced later during the elimination of land-only regions:

    DO jarea = 1, jpni*jpnj
       iproc = jarea-1
       ii = 1 + MOD(jarea-1,jpni)
       ij = 1 + (jarea-1)/jpni
       IF( ipproc(ii,ij) == -1 .AND. iono(ii,ij) >= 0 &
          .AND. iono(ii,ij) <= jpni*jpnj-1 ) THEN
          iino = 1 + MOD(iono(ii,ij),jpni)
          ijno = 1 + (iono(ii,ij))/jpni
          IF( ibondj(iino,ijno) == 1 ) ibondj(iino,ijno)=2
          IF( ibondj(iino,ijno) == 0 ) ibondj(iino,ijno) = -1
       ENDIF

and the mis-identification can lead to the problem described. The occurrence is rare (e.g. 1 process out of 9014 resulting from a 110x120 partitioning of ORCA_R12) but catastrophic and difficult to trace. Fortunately, if this diagnosis is correct, the solution is trivial. Simply replace:

    IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1

with

    IF( ipolj(ii,ij) == 3 ) iono(ii,ij) = jpni*jpnj-jarea+ijm1 - 1

Tests of this hypothesis are currently queued. |
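While the tests are queued, both expressions can be tabulated for a small layout and decoded the same way the land-elimination loop decodes iono. The following is a Python sketch rather than NEMO code: `iono_top_row` and `decode` are hypothetical helpers, and the 4x3 layout is an arbitrary choice for illustration. It takes no position on which expression is correct; it only shows what each one yields for the northernmost row.

```python
def iono_top_row(jpni, jpnj, offset=0):
    """Evaluate jpni*jpnj - jarea + ijm1 (+ offset) for every area on the northernmost row."""
    ijm1 = jpni * (jpnj - 1)
    result = {}
    # jarea is the 1-based area number, as in the Fortran loops quoted in the ticket
    for jarea in range(ijm1 + 1, jpni * jpnj + 1):
        result[jarea] = jpni * jpnj - jarea + ijm1 + offset
    return result


def decode(iono, jpni):
    """Map iono to (iino, ijno) exactly as the land-elimination loop does."""
    return 1 + iono % jpni, 1 + iono // jpni


jpni, jpnj = 4, 3  # small illustrative layout (assumption, not from the ticket)
as_coded = iono_top_row(jpni, jpnj)
proposed = iono_top_row(jpni, jpnj, offset=-1)

print("jarea  as coded -> (iino, ijno)   with -1 -> (iino, ijno)")
for jarea in sorted(as_coded):
    a, p = as_coded[jarea], proposed[jarea]
    print(f"{jarea:5d}  {a:3d} -> {decode(a, jpni)}          {p:3d} -> {decode(p, jpni)}")
```

Comparing the decoded (iino, ijno) pairs against the north-fold pairing expected for the configuration gives a quick sanity check that is independent of a full model run.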