Opened 17 months ago

Last modified 17 months ago

#183 new defect

XIOS3: Errors in generic_testcase/context_oce.xml that cause runtime failures

Reported by: acc
Owned by: ymipsl
Priority: minor
Component: XIOS
Version: trunk
Keywords: XIOS3
Cc:

Description

generic_testcase.exe is meant to be a flexible test program, but it is currently misconfigured for testing a toy ocean component. The context_oce.xml file contains several errors that cause run-time failures (with nb_proc_oce > 0 in param.def):

  • All files are named with an atm_ prefix.
  • Two fields are included twice. The error in this case is explicit:
In file "nc4_data_output.cpp", function "void xios::CNc4DataOutput::writeField_(xios::CField *)",  line 1980 -> On writing field : field3D
In the context : default_pool_id__default_server_id_0__oce
Error when calling function  nc_def_var(ncid, varName.c_str(), xtype, nDims, dimIds, &varId)
NetCDF: String match to name in use
Unable to add a new variable with name: field3D with type 6 and number of dimension 4

However, an exception is still thrown and cores are dumped.

  • There is an extra space in "field_XY " (a field_ref value). The ocean clients report the error as:
In file "field.cpp", function "void xios::CField::solveGridReference()",  line 1223 -> A grid must be defined for field 'oce__field_undef_id_6' .

which is virtually untraceable. It leads to "terminate called after throwing an instance of 'xios::CException'" errors in the ocean clients.
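Since both the duplicated fields and the stray space are hard to trace from the runtime messages, a small offline check of the XML can catch them before launching. The following is a minimal sketch in Python using only the standard library; the function name `lint_file_definition` and the sample XML are hypothetical and not part of XIOS. It flags every repeated field_ref inside a file for review, on the assumption that both entries would resolve to the same output variable name, as they do in this ticket.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def lint_file_definition(xml_text):
    """Return a list of human-readable problems found in <file> elements.

    Hypothetical helper: it only inspects field_ref attributes and knows
    nothing about how XIOS itself resolves references.
    """
    problems = []
    root = ET.fromstring(xml_text)
    for file_elem in root.iter("file"):
        file_id = file_elem.get("id", "<no id>")
        refs = [
            f.get("field_ref")
            for f in file_elem.iter("field")
            if f.get("field_ref") is not None
        ]
        # A stray space makes the reference point at a non-existent field.
        for ref in refs:
            if ref != ref.strip():
                problems.append(
                    f"{file_id}: field_ref {ref!r} has leading or trailing whitespace"
                )
        # The same reference listed twice may define the same variable twice.
        counts = Counter(ref.strip() for ref in refs)
        for ref, n in counts.items():
            if n > 1:
                problems.append(f"{file_id}: field_ref {ref!r} appears {n} times")
    return problems


# Reduced sample modelled on the entries described above.
sample = """
<file_definition type="one_file">
  <file id="atm_output" output_freq="1ts">
    <field field_ref="field3D" />
    <field field_ref="field2D" />
    <field field_ref="field3D" enabled="true"/>
    <field field_ref="field_XY " enabled="true"/>
  </file>
</file_definition>
"""

for problem in lint_file_definition(sample):
    print(problem)
```

On the sample, the sketch reports the whitespace in 'field_XY ' and the repeated 'field3D' entry. A real context file would need to be passed in as a well-formed XML string.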

These changes fix the issues:

Index: generic_testcase/context_oce.xml
===================================================================
--- generic_testcase/context_oce.xml
+++ generic_testcase/context_oce.xml
@@ -314,24 +314,21 @@
 
 <file_definition  type="one_file" >
 
-   <file id="atm_output" output_freq="1ts" type="one_file" enabled="true">
+   <file id="oce_output" output_freq="1ts" type="one_file" enabled="true">
      <field field_ref="field3D" />
      <field field_ref="field2D" />
      <field field_ref="pressure"  />
      <field field_ref="field3D_resend" />
-
-     <field field_ref="field3D"    enabled="true"/>
-     <field field_ref="field2D"    enabled="true"/>
      <field field_ref="field_X"    enabled="true"/>
      <field field_ref="field_Y"    enabled="true"/>
-     <field field_ref="field_XY "  enabled="true"/>
+     <field field_ref="field_XY"   enabled="true"/>
      <field field_ref="field_Z"    enabled="true"/>
      <field field_ref="field_XYZ"  enabled="true"/>
      <field field_ref="field_XZ"   enabled="true"/>
      <field field_ref="field_YZ"   enabled="true"/>
    </file>
 
-   <file id="atm_output_other" output_freq="1ts" type="one_file" enabled="false">
+   <file id="oce_output_other" output_freq="1ts" type="one_file" enabled="false">
       <field field_ref="other_field3D"   enabled="true"/>
       <field field_ref="other_field2D"   enabled="true"/>
       <field field_ref="other_field_X"   enabled="true"/>
@@ -343,7 +340,7 @@
       <field field_ref="other_field_YZ"  enabled="true"/>
    </file>
 
-   <file id="atm_output_W" output_freq="1ts" enabled="false">
+   <file id="oce_output_W" output_freq="1ts" enabled="false">
       <field field_ref="field3D_W"  enabled="true"/>
       <field field_ref="field2D_W"  enabled="true"/>
       <field field_ref="field_XW"   enabled="true"/>

but better error reporting would be helpful.

This is reproducible on my x86_64 cluster using ifort Version 2021.4.0 regardless of setup. The actual setup used was:

Index: param.def
===================================================================
--- param.def	(revision 2432)
+++ param.def	(working copy)
@@ -1,5 +1,6 @@
 &params_run
 duration='4ts'
-nb_proc_atm=1
-nb_proc_oce=0
+nb_proc_atm=4
+nb_proc_oce=3
+nb_proc_surf=0
 /

and running with 10 cores (i.e. 3 servers)
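The server count follows from simple arithmetic: 4 atm clients plus 3 oce clients leave 3 of the 10 ranks for servers. The sketch below illustrates that calculation under the assumption that every rank not assigned to a component through an nb_proc_* entry acts as an XIOS server; the helper name `count_servers` is hypothetical and not part of the test case.

```python
import re


def count_servers(param_def_text, total_ranks):
    """Ranks left over once all nb_proc_* clients are assigned.

    Assumption: leftover MPI ranks act as servers, which matches the
    "10 cores (i.e. 3 servers)" figure quoted in this ticket.
    """
    clients = sum(
        int(value)
        for value in re.findall(
            r"^\s*nb_proc_\w+\s*=\s*(\d+)", param_def_text, flags=re.MULTILINE
        )
    )
    return total_ranks - clients


# The working-copy param.def from the diff above, without diff markers.
param_def = """&params_run
duration='4ts'
nb_proc_atm=4
nb_proc_oce=3
nb_proc_surf=0
/
"""

print(count_servers(param_def, 10))  # prints 3
```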

Change History (2)

comment:1 Changed 17 months ago by jderouillat

This feature of the generic_testcase, running with different components, has not been tested yet.

In my tests, when I want to change the data distribution, I keep only nb_proc_atm different from 0 and change the domain in iodef.xml. The default is lmdz:

<variable id="domain"> lmdz </variable>

You can set it to nemo:

<variable id="domain"> nemo </variable>

comment:2 Changed 17 months ago by jderouillat

I was so sure that it was not working (sorry Yann!) that I didn't read carefully to the end.

Thanks for your report, I'll commit your fix.
