Opened 16 years ago
Closed 13 years ago
#127 closed Bug (fixed)
Arrays go out of bounds
Reported by: | a.r.porter@… | Owned by: | nemo
---|---|---|---
Priority: | normal | Milestone: |
Component: | OCE | Version: | v2
Severity: | | Keywords: |
Cc: | | |
Description
I am running the latest version of NEMO as obtained from the project repository. The problems I describe below have been reproduced on an IBM SP5 machine (www.hpcx.ac.uk) and a Cray XT4 (www.hector.ac.uk) using the IBM, PGI and Pathscale compilers.
Setting the key_vectopt_loop CPP key causes arrays to go out of bounds in many locations in the code. Turning this off improves things but there are still two places where errors can occur for some processor grids with the ORCA2 configuration:
around line 423 in file cla.F90 (the tra_gibraltar routine),
and in all the 'mean flow' corrections in diafwb.F90, e.g. within the loop starting around line 206 and all subsequent, similar loops.
I attach versions of the two files that I have added fixes to. I am not sure whether they treat the physics correctly though.
Commit History (2)
Changeset | Author | Time | ChangeLog |
---|---|---|---|
2402 | cetlod | 2010-11-17T17:14:31+01:00 | Update TOP component according to changes related to cross-land advection. see ticket #127 |
2392 | gm | 2010-11-15T22:20:05+01:00 | v3.3beta: Cross Land Advection (ticket #127) full rewriting + MPP bug corrections |
Attachments (5)
Change History (9)
Changed 16 years ago by nemo_user
comment:1 in reply to: ↑ description Changed 16 years ago by rblod
Replying to a.r.porter@stfc.ac.uk:
I am running the latest version of NEMO as obtained from the project repository. The problems I describe below have been reproduced on an IBM SP5 machine (www.hpcx.ac.uk) and a Cray XT4 (www.hector.ac.uk) using the IBM, PGI and Pathscale compilers.
Setting the key_vectopt_loop CPP key causes arrays to go out of bounds in many locations in the code. Turning this off improves things but there are still two places where errors can occur for some processor grids with the ORCA2 configuration:
~ line 423 in file cla.F90 (tra_gibraltar routine).
and in all the 'mean flow' corrections in diafwb.F90. e.g. within the loop starting at line ~206 and all subsequent, similar loops.
I attach versions of the two files that I have added fixes to. I am not sure whether they treat the physics correctly though.
Hi,
Thanks a lot for reporting these bugs, along with the attached corrections.
- About key_vectopt_loop: the idea is to allow "controlled" out-of-bounds accesses in order to collapse the loops. The accesses never go out of bounds with respect to the global size of the array (seen as 1D), only with respect to its individual dimensions.
- About the fixes you sent for cla: could you tell me which domain decomposition you used when you hit this problem?
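The "controlled" out-of-bound idea can be sketched numerically. The following Python/NumPy snippet is an illustration only (NEMO itself is Fortran, and the array shape here is invented): a stencil access leaves one dimension's bounds while staying inside the global 1D extent.

```python
import numpy as np

# 2D field stored contiguously (NumPy default is row-major; Fortran is
# column-major, but the idea is the same).
a = np.arange(12.0).reshape(3, 4)
flat = a.ravel()                 # the same memory viewed as a 1D array

# With the double loop collapsed to a single 1D loop, a stencil access
# like a[i, j+1] becomes flat[k+1].  At the end of a row, j+1 is out of
# bounds for that dimension, yet k+1 < a.size, so the read stays inside
# the allocation: a "controlled" out-of-bound.
i, j = 0, 3                      # last column of the first row
k = i * 4 + j                    # flat index of a[i, j]
assert k + 1 < a.size            # still inside the global 1D extent
assert flat[k + 1] == a[1, 0]    # wraps to the start of the next row
```

This is why the collapsed loops are safe with respect to the allocation but not with respect to the logical dimensions, as described above.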
Rachid
comment:2 Changed 15 years ago by gm
comment on diafwb.F90
There are in fact two errors in MPP mode.
First, as you mention, there is an out-of-bound access when the separation between processors falls exactly at the points where the transports are computed. The solution you proposed is fine, but a shorter way is simply to bound the inner DO loop with MIN (see below).
Second, in MPP the fluxes at the 4 straits are computed on each local domain, and a mpp_sum is then performed at the end of the routine. For the result to be correct, the computation has to be done only on interior-domain grid points, otherwise the fluxes can be counted twice! A multiplication by tmask_i is therefore required (see below).

Instead of:

```fortran
DO ji = mi0(ii0), mi1(ii1)
   DO jj = mj0(ij0), mj1(ij1)
      DO jk = 1, jpk
         zt = 0.5 * ( tn(ji,jj,jk) + tn(ji+1,jj,jk) )
         zs = 0.5 * ( sn(ji,jj,jk) + sn(ji+1,jj,jk) )
         zu = un(ji,jj,jk) * fse3t(ji,jj,jk) * e2u(ji,jj)
         ...
```

one should write:

```fortran
DO jj = mj0(ij0), mj1(ij1)
   DO ji = mi0(ii0), MIN( mi1(ii1), jpim1 )   ! MIN used to avoid mpp out-of-bound
      DO jk = 1, jpk
         zt = 0.5 * ( tn(ji,jj,jk) + tn(ji+1,jj,jk) )
         zs = 0.5 * ( sn(ji,jj,jk) + sn(ji+1,jj,jk) )
         zu = un(ji,jj,jk) * fse3t(ji,jj,jk) * e2u(ji,jj) * tmask_i(ji,jj)   ! only interior points
         ...
```
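The double-counting that tmask_i prevents can be demonstrated with a toy decomposition. This Python sketch illustrates the concept only; the 1D field and subdomain layout are invented for illustration:

```python
import numpy as np

# A 1D global field split into two subdomains with a one-point halo
# overlap on each side, mimicking an MPP decomposition.
field = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
left  = field[0:4]    # local domain: points 0..3 (point 3 is halo)
right = field[2:6]    # local domain: points 2..5 (point 2 is halo)

# Naive per-domain sums followed by a global reduction (the mpp_sum
# analogue) count the overlapping points twice.
naive = left.sum() + right.sum()
assert naive != field.sum()

# An interior mask (the tmask_i analogue) zeroes each domain's halo
# points, so every global point is counted exactly once.
mask_left  = np.array([1.0, 1.0, 1.0, 0.0])
mask_right = np.array([0.0, 1.0, 1.0, 1.0])
masked = (left * mask_left).sum() + (right * mask_right).sum()
assert masked == field.sum()
```

The same reasoning applies to the strait fluxes above: without the tmask_i factor, grid points shared by two local domains contribute twice to the mpp_sum.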
In addition, there is also an error in the salt content computation. When lk_vvl=false, the salt content within the "ssh layer" has to be taken into account. Instead of:
```fortran
DO jk = 1, jpkm1
   DO jj = 2, jpjm1
      DO ji = fs_2, fs_jpim1   ! vector opt.
         zwei   = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
         a_salb = a_salb + ( sb(ji,jj,jk) - zsm0 ) * zwei
      END DO
   END DO
END DO
IF( lk_mpp )   CALL mpp_sum( a_salb )      ! sum over the global domain
...
DO jk = 1, jpkm1
   DO jj = 2, jpjm1
      DO ji = fs_2, fs_jpim1   ! vector opt.
         zwei   = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
         a_saln = a_saln + ( sn(ji,jj,jk) - zsm0 ) * zwei
         zvol   = zvol   + zwei
      END DO
   END DO
END DO
IF( lk_mpp )   CALL mpp_sum( a_saln )      ! sum over the global domain
IF( lk_mpp )   CALL mpp_sum( zvol )        ! sum over the global domain
```
one should do:
```fortran
DO jk = 1, jpkm1
   DO jj = 2, jpjm1
      DO ji = fs_2, fs_jpim1   ! vector opt.
         zwei   = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
         a_salb = a_salb + ( sb(ji,jj,jk) - zsm0 ) * zwei
      END DO
   END DO
END DO
IF( .NOT. lk_vvl )   a_salb = a_salb + SUM( e1t(:,:) * e2t(:,:) * sshb(:,:) * tmask_i(:,:) * ( sb(:,:,1) - zsm0 ) )
IF( lk_mpp )   CALL mpp_sum( a_salb )      ! sum over the global domain
...
DO jk = 1, jpkm1
   DO jj = 2, jpjm1
      DO ji = fs_2, fs_jpim1   ! vector opt.
         zwei   = e1t(ji,jj) * e2t(ji,jj) * fse3t(ji,jj,jk) * tmask(ji,jj,jk) * tmask_i(ji,jj)
         a_saln = a_saln + ( sn(ji,jj,jk) - zsm0 ) * zwei
         zvol   = zvol   + zwei
      END DO
   END DO
END DO
IF( .NOT. lk_vvl ) THEN                    ! correct with the salt content of the ssh 'layer'
   a_saln = a_saln + SUM( e1t(:,:) * e2t(:,:) * tmask_i(:,:) * sshn(:,:) * ( sn(:,:,1) - zsm0 ) )
   zvol   = zvol   + SUM( e1t(:,:) * e2t(:,:) * tmask_i(:,:) * sshn(:,:) )
ENDIF
IF( lk_mpp )   CALL mpp_sum( a_saln )      ! sum over the global domain
IF( lk_mpp )   CALL mpp_sum( zvol )        ! sum over the global domain
```
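The need for the ssh correction can be checked on a toy water column. The following Python sketch is an illustration only (the thicknesses and salinities are invented; NEMO itself is Fortran): with a fixed vertical grid, the cell thicknesses miss the free-surface displacement, so the salt held in the ssh "layer" must be added explicitly.

```python
import numpy as np

# Toy single-column budget mimicking the lk_vvl = .false. case.
e3t  = np.array([10.0, 10.0, 10.0])   # fixed cell thicknesses [m]
sal  = np.array([35.1, 35.2, 35.4])   # salinity per cell
ssh  = 0.5                            # free-surface height [m]
zsm0 = 35.0                           # reference salinity, as in diafwb

# Anomalous salt content of the fixed cells only:
a_sal = ((sal - zsm0) * e3t).sum()

# Correction: the ssh "layer" carries the surface salinity sal[0]:
a_sal_corrected = a_sal + ssh * (sal[0] - zsm0)

# The corrected budget matches the true water column, whose top cell
# is really e3t[0] + ssh thick:
true_e3 = e3t.copy()
true_e3[0] += ssh
assert np.isclose(a_sal_corrected, ((sal - zsm0) * true_e3).sum())
```

Omitting the correction under-counts the salt content by exactly the ssh term, which is what the added IF( .NOT. lk_vvl ) lines restore.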
The resulting module can be found in attachment ( diafwb_gm.F90 )
I also attach a module that includes style and optimisation corrections (diafwb_gm_style.F90).
Both are given as examples; they have not been tested.
Gurvan
comment:3 Changed 15 years ago by gm
comment on cla.F90
As you mention, there is an out-of-bound access when the separation between processors falls exactly at the points where the transports are computed. This occurs for the Gibraltar strait. Moreover, for the other straits the physics is not treated correctly when the cutting falls within the strait area.
The solution I propose is to stop the model if the cutting could compromise the computation, i.e. if the strait grid-points are not all inside the interior of one local domain, that is inside (2:jpim1,2:jpjm1).
For example, in Gibraltar case, this can be done as follows:
in the header:
```fortran
INTEGER ::   nbab, ngib, nhor   ! presence or not of the required grid-points on the local domain
!                               ! for the Bab-el-Mandeb, Gibraltar, and Hormuz straits
```
in the initialisation:
```fortran
ngib = 0
IF( ( mj0(101) >= 2 .AND. mj1(102) <= jpjm1 ) .AND.  &   !* (139:141,101:102) on the local processor
  & ( mi0(139) >= 2 .AND. mi1(141) <= jpim1 ) )   ngib = 1
!                                       ! test if no local domain includes all the required grid-points
ztemp = REAL( ngib )
IF( lk_mpp )   CALL mpp_sum( ztemp )    ! sum with the other processors' values
IF( ztemp == 0. ) THEN                  ! 3 points in the i-direction: may be a problem with some cuttings
   CALL ctl_stop( ' cross land advection at Gibraltar does not work with your processor cutting: change it' )
ENDIF
```
and call the .._gibraltar routine only if ngib = 1.
The error concerns not only cla.F90 but also div_cla.F90 and dynspg_cla.F90; the IF( ngib == 1 ) guard should be used there too.
By the way, while reading all the cla modules, I found a lot of duplication and some errors.
For example, in div_cla, the summation of emp should be multiplied by tmask_i to avoid duplication of halo points. Also, with the change in the position of the computation of ssh and wn in step (see the present-day trunk), the initialisation should be done when calling div_cla, not tra_cla.
In addition, working with the velocity instead of the horizontal divergence makes the code unreadable.
In the end, I rewrote everything, merging the 3 modules into a single one, cla.F90, which can be found in attachment (cla_gm.F90 file).
It is given as an example only, since I did not even compile it.
Gurvan
Changed 15 years ago by gm
merge of cla.F90, tra_cla.F90 and dynspg_cla.F90 with mpp bug fix. NEVER compiled
comment:4 Changed 13 years ago by gm
- Resolution set to fixed
- Status changed from new to closed
Since v3.1, the diafwb module (now named sbcfwb) already takes into account all the problems identified in this ticket.
The cross land advection module (cla.F90) has been updated in v3.3beta, taking into account the above comments and the resulting new cla.F90 module.
In addition :
- the call to cla_div has been moved into the divcur routine (as for the sbc_rnf_div routine)
- idem for the call to cla_traadv, moved into traadv.F90 (==> the cla is now available with all advection schemes!)
- the call to cla_dynspg stays in dynspg_flt ==> cla only works with the filtered free surface (not much work would be needed to make it work in the other cases)
- in cla_init, a STOP has been added in case of lk_vvl=T. Here again, there is no conceptual problem in running cla with vvl, but this compatibility needs to be carefully checked (in particular, fse3t_0 should be used in this initialisation...)
- all the cla routines are now in cla.F90; the cla_div and cla_dynspg modules have been suppressed
- n_cla (old non-DOCTOR namelist name) has been changed into nn_cla everywhere
the update in v3.3beta can be found in the changeset:2392
Given the very few users of the cla option (not used in coupled mode, since it prescribes the transport through the "cross land" straits), it does not seem necessary to correct v3.2 ===> v3.2 will remain unchanged.
Gurvan
Modified cla.F90 source file