Version 2 (modified by frrh, 4 years ago)

Bugs in NEMO-CICE-MEDUSA

Testing on the Met Office Cray XC40, using working copy u-aj777RHdebug and its variants, we start by trying a 2-day (2x1-day cycles) setup with array bounds testing. Compiling both the NEMO and CICE code with -Rb, we find:

  • TOP_SRC/MEDUSA/sms_medusa.F90 doesn't compile: ierr is indexed out of bounds (OOB) and needs to be declared with a dimension of 8.
  • CICE generates:
      lib-4961 : WARNING 
      Subscript -330 is out of range for dimension 2 for array
      'array_g' at line 2206 in file 'ice_gather_scatter.F90' with bounds 1:332.
    
       The 2nd index of ARRAY_G is OOB in:
    
                  msg_buffer(i,j) = ARRAY_G(this_block%i_glob(i)+nghost,&
                                          this_block%j_glob(j)+nghost)
    
    Heaven knows why we have a negative number here!?

This doesn't cause a failure because it's an OOB read. I suspect it would cause a failure if it were an OOB write.
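Until the cause of the negative subscript is understood, a temporary guard around the read could report which point produces it. The following is a minimal standalone sketch, not CICE code: the nghost value, the array shape and the sample j_glob values are all assumptions, chosen only so that one entry reproduces the subscript -330 from the warning.

```fortran
program glob_index_guard
  implicit none
  ! Assumed values for illustration only; none are taken from CICE.
  integer, parameter :: nghost = 1                        ! hypothetical ghost-cell width
  integer, parameter :: ny_g = 332                        ! upper bound quoted in the warning
  integer, parameter :: j_glob(3) = (/ 10, -331, 200 /)   ! made-up global indices
  real :: array_g(4, ny_g)
  integer :: j, jg

  array_g = 0.0
  do j = 1, size(j_glob)
     jg = j_glob(j) + nghost
     if (jg < lbound(array_g, 2) .or. jg > ubound(array_g, 2)) then
        ! Report the offending subscript instead of reading outside the array.
        print '(a,i0,a,i0)', 'j = ', j, ' gives out-of-range subscript ', jg
     else
        print '(a,i0,a,f4.1)', 'j = ', j, ' reads ', array_g(1, jg)
     end if
  end do
end program glob_index_guard
```

In the real routine the same test would sit just before the msg_buffer assignment, printing the block and local indices so the source of the negative value can be traced.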

  • We then get:
      lib-4213 : UNRECOVERABLE library error 
      A pointer or allocatable array in an I/O list has not been associated
      or allocated.
    
       Encountered during a namelist WRITE to unit 27
       Fortran unit 27 is connected to a sequential formatted text file:
         "output.namelist.pis"
    
  • trcnam_medusa.F90 has a section where it initialises variables from the natbio namelist. However, it initialises jdms_input twice, thus:

      jdms_input = 0
      jdms_input = 3

    Why? jdms_model is not initialised at all. Is the 2nd occurrence supposed to refer to that?

    jq10 is not initialised.

    Some variables are declared twice in natbio, e.g. vsed and xhr.

  • Writing natbio causes the lib-4213 error above, suggesting something in that namelist is unset.

Skipping that write, we get a similar error writing natroam! Skipping that too, natopt seems to be OK, but it's the only one of the three namelists that is. The model then runs to completion (and completes a 2nd 1-day cycle OK).
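The lib-4213 message points at a pointer or allocatable member of the namelist being unallocated at the time of the write. As a minimal sketch of how such a write could be guarded, assuming the namelist does contain an allocatable array (the notes above do not say which variable is at fault), consider the following. The namelist name natdemo and both variables are hypothetical stand-ins, not MEDUSA declarations.

```fortran
program namelist_write_guard
  implicit none
  ! Hypothetical stand-ins; these are not the MEDUSA declarations.
  real :: scalar_param = 0.0
  real, allocatable :: array_param(:)
  namelist /natdemo/ scalar_param, array_param

  ! Writing natdemo while array_param is unallocated is the kind of
  ! situation the lib-4213 message describes, so allocate it and give
  ! it a default value before the write.
  if (.not. allocated(array_param)) then
     allocate(array_param(3))
     array_param = 0.0
  end if

  write(*, nml=natdemo)
end program namelist_write_guard
```

The same allocated() test (or associated() for a pointer) can also be used purely as a diagnostic, printing the name of each unallocated member before the write, to find out which entry in natbio or natroam is responsible.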

  • The code also refers to a namelist named "nammeddia", but we have no such namelist. Our namelists refer to something called "nammedia" (only one "d"). Presumably that's a typo. JP says this is not currently used in our configurations (if it were, it would crash looking for a missing namelist!)
  • Checking job.err, the only warning we have is the one about the ARRAY_G reference in CICE. This is present in both the NRUN and the CRUN (why wouldn't it be?)

So we have a number of things to do:

1) Correct ierr dimension to 8 in sms_medusa.F90
2) Remove duplicate variable declarations in natbio
3) Ensure missing fields are given default values in natbio
4) Replace the 2nd occurrence of jdms_input with jdms_model, presumably
5) Investigate why the namelist writes fail.
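For item 3, the usual pattern is to assign every namelist member a default before the namelist is read, so that any entry omitted from the input still has a defined value. A minimal sketch of that pattern follows. The names natdemo, opt_a and opt_b are hypothetical, and the namelist is read from an internal string only to keep the example self-contained.

```fortran
program namelist_defaults
  implicit none
  ! Hypothetical members; the real natbio variables and types live in MEDUSA.
  integer :: opt_a, opt_b
  namelist /natdemo/ opt_a, opt_b
  character(len=64) :: text

  ! Give every member a default before the read.
  opt_a = 0
  opt_b = 3

  ! The input sets only opt_a, so opt_b keeps its default.
  text = '&natdemo opt_a = 5 /'
  read(text, nml=natdemo)

  ! Every member now has a defined value when the namelist is echoed.
  write(*, nml=natdemo)
end program namelist_defaults
```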