Opened 7 years ago

Closed 6 years ago

Last modified 3 years ago

#1205 closed Bug (fixed)

Uninitialised allocatables in LIM3

Reported by: ufla Owned by: vancop
Priority: normal Milestone:
Component: LIM3 Version: release-3.6
Severity: Keywords: LIM*
Cc:

Description

We have been seeing crashes in NEMO+LIM3 for a while and suspect uninitialised allocatable arrays in LIM3. Here is the result of a test that confirms this for the variables vt_i and vt_s:

Both variables are allocated in ice_alloc():

ALLOCATE( u_ice(jpi,jpj) , v_ice(jpi,jpj) , tio_u(jpi,jpj) , tio_v(jpi,jpj) ,     &
   &      vt_i (jpi,jpj) , vt_s (jpi,jpj) , at_i (jpi,jpj) , ato_i(jpi,jpj) ,     &
   &      et_i (jpi,jpj) , et_s (jpi,jpj) , ot_i (jpi,jpj) , tm_i (jpi,jpj) ,     &
   &      bv_i (jpi,jpj) , smt_i(jpi,jpj)                                   , STAT=ierr(ii) )

If we add these lines

vt_i = HUGE(vt_i)
vt_s = HUGE(vt_s)

and run the code with floating point exceptions enabled, it gets stopped in lim_sbc_init at this point:

snwice_mass (:,:) = tms(:,:) * ( rhosn * vt_s(:,:) + rhoic * vt_i(:,:) )

and using a debugger we find that both vt_i and vt_s contain the HUGE value set after allocation.
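The poisoning test described above can be sketched as a small standalone program. This is only an illustration under stated assumptions, not NEMO code: the domain sizes, the density values, and the program name are hypothetical, and tms is simply set to 1. Built with floating-point exception trapping enabled (for example gfortran's -ffpe-trap=overflow,invalid), the assignment to snwice_mass would be expected to abort; without trapping, the IEEE check at the end reports the non-finite results instead.

```fortran
program poison_demo
   ! Sketch: poison freshly allocated arrays with HUGE so that any use
   ! before proper initialisation becomes visible.
   use, intrinsic :: ieee_arithmetic, only : ieee_is_finite
   implicit none
   integer,  parameter :: wp  = kind(1.0d0)          ! working precision (assumed double)
   integer,  parameter :: jpi = 4, jpj = 3           ! toy domain size (hypothetical)
   real(wp), parameter :: rhosn = 330._wp            ! illustrative snow density
   real(wp), parameter :: rhoic = 917._wp            ! illustrative ice density
   real(wp), allocatable :: vt_i(:,:), vt_s(:,:), tms(:,:), snwice_mass(:,:)
   integer :: ierr

   allocate( vt_i(jpi,jpj), vt_s(jpi,jpj), tms(jpi,jpj), snwice_mass(jpi,jpj), stat=ierr )
   if ( ierr /= 0 ) error stop 'allocation failed'

   ! Poison values: nothing below assigns "real" data to vt_i or vt_s.
   vt_i = huge(vt_i)
   vt_s = huge(vt_s)
   tms  = 1._wp                                      ! stand-in for the land/sea mask

   ! Same expression as in lim_sbc_init: multiplying HUGE by a density overflows.
   snwice_mass(:,:) = tms(:,:) * ( rhosn * vt_s(:,:) + rhoic * vt_i(:,:) )

   ! Only reached when overflow trapping is disabled.
   print *, 'non-finite entries in snwice_mass: ', count( .not. ieee_is_finite(snwice_mass) )
end program poison_demo
```

HUGE is convenient as a poison value because almost any arithmetic on it overflows, whereas zero tends to hide the missing initialisation.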

We suspect at least one other variable (e_i) and consider it likely that there are more. In fact, with the old LIM3 version in NEMO 3.3.1, we added cpp-key-controlled initialisation of all NEMO/LIM variables to either zero or HUGE. The model runs stably with zero initialisation but crashes reproducibly with initialisation to HUGE.
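A cpp-key-controlled initialisation of this kind might look roughly like the following sketch. The key name key_init_huge, the module init_poison, and the helper alloc_init_2d are hypothetical names chosen for illustration; they are not taken from NEMO or from the reporter's patch. The file needs to pass through the C preprocessor (for example a .F90 extension, or gfortran -cpp).

```fortran
module init_poison
   ! Sketch: choose the initial value of allocated arrays at compile time.
   implicit none
   integer, parameter :: wp = kind(1.0d0)            ! working precision (assumed double)
#if defined key_init_huge
   real(wp), parameter :: rinit = huge(1._wp)        ! poison value, exposes missing initialisation
#else
   real(wp), parameter :: rinit = 0._wp              ! benign default
#endif
contains
   subroutine alloc_init_2d( pa, ki, kj )
      ! Allocate a 2D array and give every element a defined value.
      real(wp), allocatable, intent(out) :: pa(:,:)
      integer,               intent(in)  :: ki, kj
      integer :: ierr
      allocate( pa(ki,kj), stat=ierr )
      if ( ierr /= 0 ) error stop 'alloc_init_2d: allocation failed'
      pa(:,:) = rinit
   end subroutine alloc_init_2d
end module init_poison

program init_demo
   use init_poison
   implicit none
   real(wp), allocatable :: vt_i(:,:)
   call alloc_init_2d( vt_i, 4, 3 )                  ! toy sizes (hypothetical)
   print *, 'initial value of vt_i: ', vt_i(1,1)
end program init_demo
```

Compiling once with -Dkey_init_huge and once without gives the two configurations compared above.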

Commit History (0)

(No commits)

Change History (12)

comment:1 in reply to: ↑ description Changed 7 years ago by clem

This is a bug in the trunk which has been corrected in the new version of LIM3 (development branch dev_MERGE_2013). There, vt_i and vt_s are initialized after the call to lim_var_agg in iceini.F90. A quick fix is to move the call to lim_sbc_init in ice_init.F90 after the call to lim_var_agg(1) (and therefore after the (IF .NOT. ln_rstart) statement).

I do not think there is anything wrong with the initialization of e_i.

comment:2 Changed 7 years ago by ufla

Thanks a lot for the fast response! Is it advisable to use the LIM3 version of the dev_MERGE_2013 branch (being aware that all this is still at an experimental stage)? Or would you rather recommend the above quick fix?

comment:3 follow-up: Changed 7 years ago by clem

I would indeed recommend using LIM3 from the merge. LIM3 has been largely modified and several bugs have been removed since version 3.4 (especially concerning salt and mass exchanges between ice and ocean). However, you should be aware that testing of ORCA2-LIM3 is under way and first results are coming soon (to be confirmed with Claire Levy at LOCEAN). But I think no major debugging will be done in LIM3 now.

comment:4 in reply to: ↑ 3 Changed 7 years ago by ufla

Replying to clem:

I would indeed recommend using LIM3 from the merge. […]

Okay. Is it enough to use the LIM_SRC_3 directory from dev_MERGE_2013 and keep the rest from trunk? I'd like to focus on LIM3 changes for now and avoid other modifications, especially namelist changes etc., as much as possible.

comment:5 Changed 7 years ago by ufla

I'm not quite sure, but since we're at it: What about the sxage variable? I get a floating point exception (in lim_adv_x) and I couldn't find any initialisation except for restarts.

comment:6 Changed 7 years ago by clem

If you want to use the new LIM3 with version 3.4 of NEMO, I also regularly update another branch for my own studies: dev_r4028_CNRS_LIM3, which is basically the trunk NEMO from dev_r4028 but with the new LIM3. The new namelist_ice has to be taken from the ORCA2_LIM3 configuration.
And yes, you are quite right about the initialization of sxage. I committed a correction. Thanks.

comment:7 Changed 7 years ago by ufla

  • Version changed from trunk to nemo_v3_6_alpha

I have been running more tests with the dev_MERGE_2013 version of NEMO/LIM3 and found another issue with uninitialised allocatable arrays:

The arrays e_i and old_e_i are allocated as

old_e_i(jpi,jpj,jkmax ,jpl)
e_i(jpi,jpj,jkmax,jpl)

i.e. with jkmax as the dimension size for layers. However, the actual number of layers appears to be nlay_i <= jkmax. In most places, nlay_i is used as the loop bound when the arrays are accessed. Nevertheless, limupdate1.F90 contains

d_e_i_trp  (:,:,:,:) = e_i  (:,:,:,:) - old_e_i  (:,:,:,:)

which accesses the entire arrays e_i and old_e_i, including parts that were never initialised. This raises a floating point exception in my tests as the uninitialised parts contain denormalised values.

A solution would be to either assign proper initial values to the complete arrays e_i and old_e_i or to limit the above statement to

d_e_i_trp  (:,:,1:nlay_i,:) = e_i  (:,:,1:nlay_i,:) - old_e_i  (:,:,1:nlay_i,:)
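The restricted assignment can be tried in isolation with a small sketch like the one below. The array sizes and the values written into the used layers are made up for illustration; only the array names and the role of nlay_i and jkmax follow the ticket. Layers nlay_i+1 to jkmax of e_i and old_e_i are deliberately left untouched, and the section bounds keep them out of the subtraction.

```fortran
program section_demo
   ! Sketch: restrict a whole-array operation to the layers that are actually in use.
   implicit none
   integer, parameter :: wp = kind(1.0d0)            ! working precision (assumed double)
   integer, parameter :: jpi = 2, jpj = 2, jpl = 3   ! toy sizes (hypothetical)
   integer, parameter :: jkmax  = 6                  ! allocated number of layers
   integer, parameter :: nlay_i = 4                  ! layers actually used, nlay_i <= jkmax
   real(wp), allocatable :: e_i(:,:,:,:), old_e_i(:,:,:,:), d_e_i_trp(:,:,:,:)
   integer :: ierr

   allocate( e_i(jpi,jpj,jkmax,jpl), old_e_i(jpi,jpj,jkmax,jpl),   &
      &      d_e_i_trp(jpi,jpj,jkmax,jpl), stat=ierr )
   if ( ierr /= 0 ) error stop 'allocation failed'

   ! Give the result array a defined value everywhere.
   d_e_i_trp(:,:,:,:) = 0._wp

   ! Stand-in for the model: only layers 1..nlay_i receive values.
   e_i    (:,:,1:nlay_i,:) = 2.0_wp
   old_e_i(:,:,1:nlay_i,:) = 1.5_wp

   ! Restricted difference: the undefined layers nlay_i+1..jkmax are never read.
   d_e_i_trp(:,:,1:nlay_i,:) = e_i(:,:,1:nlay_i,:) - old_e_i(:,:,1:nlay_i,:)

   print *, 'largest change over used layers: ', maxval( abs( d_e_i_trp(:,:,1:nlay_i,:) ) )
end program section_demo
```

The alternative mentioned above, assigning initial values to the complete e_i and old_e_i right after allocation, would make the original whole-array statement safe as well, at the cost of touching the unused layers.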

Note that this is a platform-dependent problem and that its appearance changes with the number of MPI ranks used!

comment:8 Changed 7 years ago by clevy

  • Owner changed from NEMO team/LIM team? to vancop

comment:9 Changed 6 years ago by vancop

  • Resolution set to fixed
  • Status changed from new to closed

Fixed a while ago

comment:10 Changed 3 years ago by nemo

  • Keywords LIM* added

comment:11 Changed 3 years ago by nemo

  • Keywords release-3.6* added

comment:12 Changed 3 years ago by nemo

  • Keywords release-3.6* removed