Opened 4 years ago

Last modified 2 years ago

#1675 assigned Bug

Out of bounds index in loops - key_vectopt_loop

Reported by: jcastill Owned by: gm
Priority: low Milestone:
Component: OCE Version: trunk
Severity: Keywords: CPP out_of_bounds
Cc: gm

Description (last modified by nicolasmartin)

Many warnings are found in NEMO@5518 when it is compiled with out of bounds warnings (for the ftn compiler on Cray the option is -R bp). This may lead to segmentation faults, and wrong science results.

Most of the warnings are related with the fact that the index of many loops depend on the variables fs_2 and fs_jpim1, which particular values depend of the key_vectopt_loop key.

One piece of code taken from the routine dom_vvl_sf_nxt will hopefully illustrate the problem clearly:

         ! 3 - Thickness diffusion term
         ! ----------------------------
         zwu(:,:) = 0.0_wp
         zwv(:,:) = 0.0_wp
         ! a - first derivative: diffusive fluxes
         DO jk = 1, jpkm1
            DO jj = 1, jpjm1
               DO ji = 1, fs_jpim1   ! vector opt.
                  un_td(ji,jj,jk) = rn_ahe3 * umask(ji,jj,jk) * re2u_e1u(ji,jj) &
                                  & * ( tilde_e3t_b(ji,jj,jk) - tilde_e3t_b(ji+1,jj  ,jk) )
                  vn_td(ji,jj,jk) = rn_ahe3 * vmask(ji,jj,jk) * re1v_e2v(ji,jj) &
                                  & * ( tilde_e3t_b(ji,jj,jk) - tilde_e3t_b(ji  ,jj+1,jk) )
                  zwu(ji,jj) = zwu(ji,jj) + un_td(ji,jj,jk)
                  zwv(ji,jj) = zwv(ji,jj) + vn_td(ji,jj,jk)
               END DO
            END DO
         END DO

When key_vectopt_loop is on, fs_jpim1=jp1, when it is off, fs_jpim1=jpim1=jp1-1. If the key is on, in this loop the variable tilde_e3t_b will go out of bounds, as the size of its first dimension is jp1.

If the key is off, there are still other out of bounds issues. In both cases, there is a problem in some routines in timing.F90 about dissociated pointers.

Commit History (0)

(No commits)

Attachments (2)

key.err (3.6 MB) - added by jcastill 4 years ago.
Error dump with key_vectopt_loop on
nokey.err (571.0 KB) - added by jcastill 4 years ago.
Error dump with key_vectopt_loop off

Change History (13)

Changed 4 years ago by jcastill

Error dump with key_vectopt_loop on

Changed 4 years ago by jcastill

Error dump with key_vectopt_loop off

comment:1 Changed 3 years ago by gm

The issue may not be as dramatic as you thought it is.

Some history:

The key_vectopt_loop has been introduced in the code years ago at a time all big computers were vector computer (mostly NEC). On vector computer you can be much faster if the inner-loop vecteur is very long (up to 20 time faster!). Nevertheless the inner loop is, most of the time, only a i-loop, i.e. not a large one compare to the vector register of the computer. Furthermore, the compiler is unable to unroll i- and j-loops to make the inner loop longer as the starting/ending index of the i-loop is very often 2 / jpi-1. In order to overcome this issue, the key_vectopt_loop CPP key has been introduced to replace the starting i-index (fs_2=1 when the key is defined) and/or the ending i-index (fs_jpim1=jpi when the key is defined), so that the compiler can unroll the i- and j-loops, and have longer inner loop vector, resulting in faster calculation.

This should not create “true” out-of-bound as such transformation is always done when j-loop start at j=2 and/or end at j=jpj-1. In those cases the value used in the calculation will remains inside the array considered ( tilde_e3t_b in your example).

With or without the key_vectopt_loop, the results are identical to the last digits (at least this has been checked when introducing key_vectopt_loop years ago…).

Note that in some compiler you have an option in the out-of-bound diagnostics that checks only whether you are out-of-bound of a whole array, not whether you are out-of-bound of one of its dimension. If you have such an option, you can check out-of-bound issues with key_vectopt_loop, otherwise, you should only test out-of-bound without key_vectopt_loop. The test of key_vectopt_loop is to obtain the same results with and without key_vectopt_loop.

Now, most people are using scalar computers. On such computer I don’t expect the code to be faster with key_vectopt_loop. To my knowledge nobody have checked that. My recommandation is not to use this key on scalar computers as it should slow down the calculation (calculation on a larger number of grid-points without speed up of the calculation), unless someone shows us the contary .

My main concern with respect to your ticket is on the other out-of-bound issues you found when key_vectopt_loop is NOT defined.

Are they only limited to the timing routines ?

Are they changing the results ? (i.e. change in the digits of the restart file)

gurvan

comment:2 Changed 3 years ago by jcastill

Without the key defined, I have pointer warnings in the timing routines, but also some out of bound errors in the files bdydta.F90 and fldread.F90 (see previously attached files). It looks to me like some variables are not properly allocated, but I did not investigate it further.

The restart files with and without key are exactly the same except for the field 'mxln' (mixing length), in which case most of the numerical values are the same, except in some isolated cases where they are completely different. No differences are evident when plotting the field, though.

Juan Ma?

comment:3 Changed 3 years ago by nicolasmartin

  • Keywords of removed

comment:4 Changed 3 years ago by nicolasmartin

  • Keywords nemo_v3_6* added

comment:5 Changed 3 years ago by nicolasmartin

  • Keywords CPP added; key_vectopt_loop removed

comment:6 Changed 3 years ago by nicolasmartin

  • Keywords out removed

comment:7 Changed 3 years ago by nicolasmartin

  • Keywords out_of_bounds added; bounds removed

comment:8 Changed 2 years ago by clevy

  • Owner changed from nemo to gm

comment:9 Changed 2 years ago by clevy

  • Cc gm added
  • Status changed from new to assigned

comment:10 Changed 2 years ago by nicolasmartin

  • Description modified (diff)

comment:11 Changed 2 years ago by gm

  • Keywords nemo_v3_6* removed
  • Milestone 2015 nemo_v3_6_STABLE deleted
  • Version changed from release-3.6 to trunk

The described issue for comment 2 (by Juan Castillo) have to be solved for the trunk version, but will not be solved for the v3.6_STABLE

by Juan Castillo

Note: See TracTickets for help on using tickets.