New URL for NEMO forge!   http://forge.nemo-ocean.eu

Since March 2022, along with the NEMO 4.2 release, code development has moved to a self-hosted GitLab.
This forge is now archived and remains online for reference.
#127 (Arrays go out of bounds) – NEMO

Opened 16 years ago

Closed 13 years ago

#127 closed Bug (fixed)

Arrays go out of bounds

Reported by: a.r.porter@… Owned by: nemo
Priority: normal Milestone:
Component: OCE Version: v2
Severity: Keywords:
Cc:

Description

I am running the latest version of NEMO as obtained from the project repository. The problems I describe below have been reproduced on an IBM SP5 machine (www.hpcx.ac.uk) and a Cray XT4 (www.hector.ac.uk) using the IBM, PGI and Pathscale compilers.

Setting the key_vectopt_loop CPP key causes arrays to go out of bounds in many locations in the code. Turning this off improves things but there are still two places where errors can occur for some processor grids with the ORCA2 configuration:

~ line 423 in file cla.F90 (tra_gibraltar routine).

and in all the 'mean flow' corrections in diafwb.F90. e.g. within the loop starting at line ~206 and all subsequent, similar loops.

I attach versions of the two files that I have added fixes to. I am not sure whether they treat the physics correctly though.

Commit History (2)

Changeset 2402 by cetlod, 2010-11-17T17:14:31+01:00

Update TOP component according to changes related to cross-land advection. see ticket #127

Changeset 2392 by gm, 2010-11-15T22:20:05+01:00

v3.3beta: Cross Land Advection (ticket #127) full rewriting + MPP bug corrections

Attachments (5)

cla.F90 (29.8 KB) - added by nemo_user 16 years ago.
Modified cla.F90 source file
diafwb.F90 (27.0 KB) - added by nemo_user 16 years ago.
Modified diafwb.F90 source file
diafwb_gm.F90 (19.1 KB) - added by gm 15 years ago.
bug correction for mpp from Gurvan
diafwb_gm_style.F90 (18.0 KB) - added by gm 15 years ago.
bug correction for mpp from Gurvan + style and minor optimisation
cla_gm.F90 (41.4 KB) - added by gm 15 years ago.
merge of cla.F90, tra_cla.F90 and dynspg_cla.F90 with mpp bug fix. NEVER compiled


Change History (9)

Changed 16 years ago by nemo_user

Modified cla.F90 source file

Changed 16 years ago by nemo_user

Modified diafwb.F90 source file

comment:1 in reply to description, changed 16 years ago by rblod

Replying to a.r.porter@stfc.ac.uk:


Hi,

Thanks a lot for reporting these bugs, with the corrections attached.

  • about the key_vectopt_loop key: the idea is to allow "controlled" out-of-bounds accesses in order to collapse the loops. The accesses should never go out of bounds with respect to the global size of the array (seen as a 1-D array), only with respect to its individual dimensions (see the sketch below).
  • about the fixes you sent for cla: could you tell me which domain decomposition you used when you hit this problem?
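To make the idea concrete, here is a minimal, self-contained sketch (illustrative only, not NEMO code) of the loop collapsing behind key_vectopt_loop: the (ji,jj) double loop is replaced by a single 1-D sweep, so an index can exceed the first dimension but never the total size of the array. Compilers run with array-bounds checking, as in the report above, will flag exactly this kind of access.

      PROGRAM collapse_demo
         IMPLICIT NONE
         INTEGER, PARAMETER :: jpi = 5, jpj = 4
         REAL    :: a(jpi,jpj), b(jpi,jpj)
         INTEGER :: ji, jj
         a = 1.0   ;   b = 0.0
         ! standard form: double loop, stencil restricted to the interior of each row
         DO jj = 1, jpj
            DO ji = 1, jpi-1
               b(ji,jj) = a(ji,jj) + a(ji+1,jj)
            END DO
         END DO
         ! collapsed form: one loop over the whole array seen as 1-D; when ji reaches a
         ! multiple of jpi, the references with first index ji or ji+1 exceed the first
         ! dimension but stay inside the array's total storage.  In NEMO such points are
         ! halo points, overwritten afterwards by the lateral boundary exchange.
         DO ji = 1, jpi*jpj - 1
            b(ji,1) = a(ji,1) + a(ji+1,1)
         END DO
         PRINT *, 'b(1,1) =', b(1,1)
      END PROGRAM collapse_demo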

Rachid

comment:2 Changed 15 years ago by gm

comment on diafwb.F90
There are in fact 2 errors in mpp.
First, as you mention, there is an out-of-bound access when the separation between processors falls exactly at the point where the transports are computed. The solution you proposed is OK, but a shorter way is simply to bound the DO loop index with a MIN (see below).
Second, in mpp the fluxes at the 4 straits are computed on each local domain and then, at the end of the routine, an mpp_sum is performed. For the result to be correct, the computation has to be done only on interior-domain grid points, otherwise the fluxes can be counted twice! Therefore a multiplication by tmask_i is required (see below).

         DO ji = mi0(ii0), mi1(ii1)
            DO jj = mj0(ij0), mj1(ij1)
               DO jk = 1, jpk 
                  zt = 0.5 * ( tn(ji,jj,jk) + tn(ji+1,jj,jk) )
                  zs = 0.5 * ( sn(ji,jj,jk) + sn(ji+1,jj,jk) )
                  zu = un(ji,jj,jk) * fse3t(ji,jj,jk) * e2u(ji,jj)
...
         DO jj = mj0(ij0), mj1(ij1)
            DO ji = mi0(ii0), MIN( mi1(ii1), jpim1 )      ! min used to avoid mpp out-of-bound
               DO jk = 1, jpk 
                  zt = 0.5 * ( tn(ji,jj,jk) + tn(ji+1,jj,jk) )
                  zs = 0.5 * ( sn(ji,jj,jk) + sn(ji+1,jj,jk) )
                  zu = un(ji,jj,jk) * fse3t(ji,jj,jk) * e2u(ji,jj) * tmask_i(ji,jj)   ! only if interior point
...
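As a side note, here is a minimal, self-contained sketch (illustrative only, not NEMO code) of why the interior mask is needed before the global sum: two neighbouring subdomains overlap in their halo region, so adding their unmasked partial sums counts the shared points twice; mpp_sum plays the role of combining the partial sums across processors.

      PROGRAM halo_sum_demo
         IMPLICIT NONE
         REAL :: global(6), left(4), right(4), mask_l(4), mask_r(4)
         global = 1.0                       ! 6 points of value 1 (true sum = 6)
         left   = global(1:4)               ! subdomain 1: points 1-4 (3 and 4 shared)
         right  = global(3:6)               ! subdomain 2: points 3-6 (3 and 4 shared)
         mask_l = (/ 1., 1., 1., 0. /)      ! interior mask: keep points 1-3 only
         mask_r = (/ 0., 1., 1., 1. /)      ! interior mask: keep points 4-6 only
         PRINT *, 'unmasked partial sums:', SUM(left) + SUM(right)                  ! 8. : shared points counted twice
         PRINT *, 'masked   partial sums:', SUM(left*mask_l) + SUM(right*mask_r)    ! 6. : correct
      END PROGRAM halo_sum_demo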

In addition, there is also an error in the salt content computation. When lk_vvl=false, the salt content within the "ssh layer" has to be taken into account. Instead of:

         DO jk = 1, jpkm1
            DO jj = 2, jpjm1
               DO ji = fs_2, fs_jpim1   ! vector opt.
                  zwei  = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
                  a_salb = a_salb + ( sb(ji,jj,jk) - zsm0 ) * zwei
               END DO
            END DO
         END DO
         IF( lk_mpp )   CALL mpp_sum( a_salb )      ! sum over the global domain
...
         DO jk = 1, jpkm1   
            DO jj = 2, jpjm1
               DO ji = fs_2, fs_jpim1   ! vector opt.
                  zwei  = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
                  a_saln = a_saln + ( sn(ji,jj,jk) - zsm0 ) * zwei
                  zvol  = zvol  + zwei
               END DO
            END DO
         END DO
         IF( lk_mpp )   CALL mpp_sum( a_saln )      ! sum over the global domain
         IF( lk_mpp )   CALL mpp_sum( zvol )      ! sum over the global domain

one should do:

         DO jk = 1, jpkm1
            DO jj = 2, jpjm1
               DO ji = fs_2, fs_jpim1   ! vector opt.
                  zwei  = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
                  a_salb = a_salb + ( sb(ji,jj,jk) - zsm0 ) * zwei
               END DO
            END DO
         END DO
         IF( .NOT. lk_vvl )   a_salb = a_salb + SUM( e1t(:,:) * e2t(:,:) * sshb(:,:) * tmask_i(:,:) * ( sb(:,:,1) - zsm0 ) )
         IF( lk_mpp )   CALL mpp_sum( a_salb )      ! sum over the global domain
...
         DO jk = 1, jpkm1   
            DO jj = 2, jpjm1
               DO ji = fs_2, fs_jpim1   ! vector opt.
                  zwei  = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
                  a_saln = a_saln + ( sn(ji,jj,jk) - zsm0 ) * zwei
                  zvol  = zvol  + zwei
               END DO
            END DO
         END DO
         IF( .NOT. lk_vvl ) THEN                    ! correct with salt content in the ssh 'layer'
            a_saln = a_saln + SUM(  e1t(:,:) * e2t(:,:) * tmask_i(:,:) * sshn(:,:) * ( sn(:,:,1) - zsm0 )  )
            zvol   = zvol   + SUM(  e1t(:,:) * e2t(:,:) * tmask_i(:,:) * sshn(:,:)                         )
         ENDIF
         IF( lk_mpp )   CALL mpp_sum( a_saln )      ! sum over the global domain
         IF( lk_mpp )   CALL mpp_sum( zvol )        ! sum over the global domain

The resulting module can be found in the attachment (diafwb_gm.F90).

I also attach a module which includes style and optimisation corrections (diafwb_gm_style.F90).

Both are given as examples; they have not been tested.

Gurvan

Changed 15 years ago by gm

bug correction for mpp from Gurvan

Changed 15 years ago by gm

bug correction for mpp from Gurvan + style and minor optimisation

comment:3 Changed 15 years ago by gm

comment on cla.F90

As you mention, there is an out-of-bound access when the separation between processors falls exactly at the point where the transports are computed. This occurs for the Gibraltar strait. Moreover, for the other straits the physics is not treated correctly when the cutting falls inside the strait area.

The solution I propose is to stop the model if the cutting could pose a problem for the computation, i.e. if the strait grid points are not all inside the interior of a single local domain, that is inside (2:jpim1, 2:jpjm1).

For example, in Gibraltar case, this can be done as follows:

in the header:

   INTEGER  ::   nbab, ngib, nhor   ! presence or not of required grid-point on local domain
   !                                ! for Bab-el-Mandeb, Gibraltar, and Hormuz straits

in the initialisation:

      ngib = 0
      IF(  ( mj0(101) >= 2 .AND. mj1(102) <= jpjm1 ) .AND.    &  !* (139:141,101:102) on the local processor
         & ( mi0(139) >= 2 .AND. mi1(141) <= jpim1 )       )    ngib = 1 
      !
      ! test if there is no local domain that includes all required grid-points
      ztemp = REAL( ngib )
      IF( lk_mpp )   CALL mpp_sum( ztemp )      ! sum with other processors value
      IF( ztemp == 0. ) THEN                    ! 3 points in i-direction, this may be a problem with some cutting
           CALL ctl_stop( ' cross land advection at Gibraltar does not work with your processor cutting: change it' )
      ENDIF

and call the .._gibraltar routine only if ngib=1

The error concerns not only cla.F90 but also div_cla.F90 and dynspg_cla.F90; the IF( ngib == 1 ) test should be used there too.
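For illustration only, here is a hedged sketch of how the guard could be applied when calling the strait routines (the flags follow the example above; the routine names and argument are assumptions, not the actual NEMO sources). The same test, with its nbab and nhor counterparts, would also protect the corresponding blocks in div_cla.F90 and dynspg_cla.F90:

      IF( ngib == 1 )   CALL cla_gibraltar     ( kt )   ! Gibraltar             (assumed routine name)
      IF( nbab == 1 )   CALL cla_bab_el_mandeb ( kt )   ! Bab-el-Mandeb         (assumed routine name)
      IF( nhor == 1 )   CALL cla_hormuz        ( kt )   ! Hormuz                (assumed routine name)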



By the way, while reading all the cla modules, I found a lot of duplication and some errors.
For example, in div_cla the summation of emp should be multiplied by tmask_i to avoid double counting of halo points. Also, with the change in the position of the computation of ssh and wn in step (see the present-day trunk), the initialisation should be done when calling div_cla, not tra_cla.
In addition, working with the velocity instead of the horizontal divergence makes the code unreadable.
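As a hedged illustration only of the div_cla point (zemp is a hypothetical local variable; this is not the actual div_cla code):

      ! before: halo points can be counted twice when the partial sums are combined
      zemp = SUM( emp(:,:) * e1t(:,:) * e2t(:,:) )
      ! after : the interior mask removes halo contributions before the global sum
      zemp = SUM( emp(:,:) * e1t(:,:) * e2t(:,:) * tmask_i(:,:) )
      IF( lk_mpp )   CALL mpp_sum( zemp )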

In the end, I rewrote everything, merging the 3 modules into a single one, cla.F90, which can be found in the attachment (cla_gm.F90 file).
It is given as an example only, since I did not even compile it.

Gurvan

Changed 15 years ago by gm

merge of cla.F90, tra_cla.F90 and dynspg_cla.F90 with mpp bug fix. NEVER compiled

comment:4 Changed 13 years ago by gm

  • Resolution set to fixed
  • Status changed from new to closed

Since v3.1, the diafwb module (now named sbcfwb) already takes into account all the problems identified in this ticket.

The cross land advection module (cla.F90) has been updated in v3.3beta, taking into account the above comments and the resulting new cla.F90 module.

In addition:

  • the call to cla_div has been moved into the divcur routine (as for the sbc_rnf_div routine)
  • likewise, the call to cla_traadv has been moved into traadv.F90 (==> the cla is now available with all advection schemes!)
  • the call to cla_dynspg stays in dynspg_flt ==> cla only works with the filtered free surface (not a big job to make it work in the other cases)
  • in cla_init a STOP has been added in the case lk_vvl=T. Here again, there is no conceptual problem in running cla with vvl, but this compatibility needs to be carefully checked (in particular, fse3t_0 should be used in this initialisation...)
  • all the cla routines are now in cla.F90, with suppression of the cla_div and cla_dynspg modules
  • n_cla (the old non-DOCTOR namelist name) has been changed into nn_cla everywhere (a sketch of the relocated calls and the renamed parameter follows below)
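For illustration, a hedged sketch of what the relocated, guarded calls and the renamed namelist parameter could look like (the guards, argument lists and the namcla group name are assumptions, not quotes from the v3.3beta sources):

      ! in divcur (sketch): cross-land advection divergence correction
      IF( nn_cla == 1 )   CALL cla_div   ( kt )
      ! in traadv (sketch): cla tracer update, now independent of the advection scheme chosen
      IF( nn_cla == 1 )   CALL cla_traadv( kt )

and, in the namelist (group name assumed):

   &namcla                 !   cross land advection
      nn_cla = 0           !   =0 no cross-land advection, =1 ORCA2 straits
   /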

The update in v3.3beta can be found in changeset:2392.

Due to the very few users of the cla (the option is not used in coupled mode, since it fixes the transport through the "cross land" straits), it does not seem necessary to correct v3.2 ===> v3.2 will remain unchanged.

Gurvan
