Developers/RebuildZooms – NEMO

Version 3 (modified by acc, 4 years ago)

Does XIOS add sufficient and accurate attribute metadata to rebuild zoom datasets correctly?

There appears to be insufficient or incorrect information in zoom domain files to rebuild whole datasets when the zoom region spans more than one XIOS server and is written to multiple files.

For example, consider this zoom in an 8x4 decomposition of ORCA2_ICE_PISCES, defined by the following additions to the XML:

domain_def_nemo.xml:
     <!--   My zoom: example of hand defined zoom   -->
     <domain id="myzoomT" domain_ref="grid_T" >
       <zoom_domain ibegin="25" jbegin="20" ni="90" nj="45"/>
     </domain>

grid_def_nemo.xml:
       <grid id="zoom_T_3D" >
         <domain domain_ref="myzoomT" />
         <axis axis_ref="deptht" />
       </grid>

file_def_nemo-oce.xml:
    <file_definition type="multiple_file" name="@expname@_@freq@_@startdate@_@enddate@" sync_freq="1mo" min_digits="4">

      <file_group id="5d" output_freq="5d"  output_level="10" enabled=".TRUE.">  <!-- 5d files -->
        <file id="file66" name_suffix="_zoom_T" description="ocean T grid variables" >
          <field field_ref="e3t"  grid_ref="zoom_T_3D"    />
          <field field_ref="toce" grid_ref="zoom_T_3D" name="thetao"   operation="instant" freq_op="5d" > @toce_e3t / @e3t </field>
        </file>
        <file id="file11" ....

In an 8x4 decomposition using four external XIOS servers, the following output files are produced for the 90x45 zoom region:

O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc

with the following attribute data in each respectively:

ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
// global attributes:
                .
                .
		:ibegin = 25 ;
		:ni = 90 ;
		:jbegin = 20 ;
		:nj = 17 ;
		:DOMAIN_number_total = 4 ;
		:DOMAIN_number = 0 ;
		:DOMAIN_dimensions_ids = 2, 3 ;
		:DOMAIN_size_global = 180, 148 ;
		:DOMAIN_size_local = 90, 17 ;
		:DOMAIN_position_first = 26, 21 ;
		:DOMAIN_position_last = 115, 37 ;
		:DOMAIN_halo_size_start = 0, 0 ;
		:DOMAIN_halo_size_end = 0, 0 ;
		:DOMAIN_type = "box" ;

ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
                .
                .
		:ibegin = 25 ;
		:ni = 90 ;
		:jbegin = 37 ;
		:nj = 28 ;
		:DOMAIN_number_total = 4 ;
		:DOMAIN_number = 1 ;
		:DOMAIN_dimensions_ids = 2, 3 ;
		:DOMAIN_size_global = 180, 148 ;
		:DOMAIN_size_local = 90, 28 ;
		:DOMAIN_position_first = 26, 38 ;
		:DOMAIN_position_last = 115, 65 ;
		:DOMAIN_halo_size_start = 0, 0 ;
		:DOMAIN_halo_size_end = 0, 0 ;
		:DOMAIN_type = "box" ;
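
The relationship between these attributes can be checked by hand. The sketch below, in plain Python with the attribute values copied from the two headers above into dictionaries (the names `files` and `check_slab` are illustrative and not part of any NEMO or XIOS tool), confirms that DOMAIN_position_first and DOMAIN_position_last are the 1-based forms of ibegin/jbegin and ni/nj, and that the two slabs stack in j to cover the 45 rows of the zoom.

```python
# Attribute values copied from the two ncdump headers above.
files = [
    {"ibegin": 25, "ni": 90, "jbegin": 20, "nj": 17,
     "first": (26, 21), "last": (115, 37)},
    {"ibegin": 25, "ni": 90, "jbegin": 37, "nj": 28,
     "first": (26, 38), "last": (115, 65)},
]

def check_slab(att):
    """Return True if the 1-based positions match the 0-based begin/size values."""
    first = (att["ibegin"] + 1, att["jbegin"] + 1)
    last = (att["ibegin"] + att["ni"], att["jbegin"] + att["nj"])
    return first == att["first"] and last == att["last"]

all_consistent = all(check_slab(att) for att in files)
total_rows = sum(att["nj"] for att in files)                  # 17 + 28
contiguous = files[0]["last"][1] + 1 == files[1]["first"][1]  # 37 + 1 == 38

print(all_consistent, total_rows, contiguous)  # True 45 True
```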

The production of two files is correct because only two of the four XIOS servers handle the zoom region. The data within each file are also correct, but two issues with the attribute metadata prevent REBUILD_NEMO (and similar tools) from rebuilding the files correctly:

  • DOMAIN_number_total needs to be 2, not 4; otherwise REBUILD_NEMO will fail.
  • DOMAIN_size_global is used to determine the size of the collated dataset. What is actually wanted is to collate these data into a dataset covering only the zoom region (90x45), but that size is not contained in the metadata.
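
Both missing quantities can in principle be recovered from the set of files as a whole. As a rough illustration (plain Python; `zoom_extent` is a hypothetical helper and the values are those listed above), the number of files gives the value DOMAIN_number_total should have, and the bounding box of the DOMAIN_position_first/DOMAIN_position_last pairs gives the size of the zoom region:

```python
def zoom_extent(slabs):
    """Bounding box origin (1-based) and size of a set of (first, last) slabs."""
    i0 = min(first[0] for first, _ in slabs)
    j0 = min(first[1] for first, _ in slabs)
    i1 = max(last[0] for _, last in slabs)
    j1 = max(last[1] for _, last in slabs)
    return (i0, j0), (i1 - i0 + 1, j1 - j0 + 1)

# (DOMAIN_position_first, DOMAIN_position_last) from files _0000 and _0001.
slabs = [((26, 21), (115, 37)), ((26, 38), (115, 65))]

origin, size = zoom_extent(slabs)
ndomains = len(slabs)

print(origin, size, ndomains)  # (26, 21) (90, 45) 2
```

This requires inspecting every file, which is why it is simpler to carry the zoom size explicitly in the metadata.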

The first issue can be worked around with an ncatted command on the first dataset. For example, this failure:

rebuild_nemo -n nl.reb O2L3P_LONG_5d_00010101_00010303_zoom_T 2
file O2L3P_LONG_5d_00010101_00010303_zoom_T,  num_domains 2, num_threads 1
 Rebuilding the following files:
 O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
 O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
 ERROR! : number of files to rebuild in file does not agree with namelist
 Attribute DOMAIN_number_total is :            4
 Number of files specified in namelist is:            2
2

can be fixed with:

ncatted -a DOMAIN_number_total,global,m,d,2 O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc

rebuild_nemo -n nl.reb O2L3P_LONG_5d_00010101_00010303_zoom_T 2
file O2L3P_LONG_5d_00010101_00010303_zoom_T,  num_domains 2, num_threads 1
 Rebuilding the following files:
 O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
 O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
 Size of global arrays:          180         148
.
.
 Closing input files...
 Closing output file...
 NEMO rebuild completed successfully

This successfully rebuilds the zoom but places it in an otherwise empty global domain.

Fixing the second issue is trickier. Simply editing the DOMAIN_size_global settings will not suffice because REBUILD_NEMO also uses the DOMAIN_position_first information to place data within the global arrays. Changing the size but not the offset results in bus errors, presumably because the data are then written outside the bounds of the smaller array.
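
The mismatch is easy to see with the numbers above. The following is a minimal sketch in Python, assuming a simplified placement rule in which each slab is copied to the 1-based inclusive range first..last of the output array (`fits` is an illustrative helper, not a REBUILD_NEMO routine):

```python
def fits(first, last, global_size):
    """True if the 1-based inclusive range first..last lies inside global_size."""
    return all(1 <= f and l <= g for f, l, g in zip(first, last, global_size))

first, last = (26, 38), (115, 65)   # DOMAIN_position_first/last of file _0001

print(fits(first, last, (180, 148)))  # True: fits in the full ORCA2 grid
print(fits(first, last, (90, 45)))    # False: 115 > 90 and 65 > 45

# Subtracting the zoom origin (ibegin=25, jbegin=20) restores a valid range.
shifted_first = (first[0] - 25, first[1] - 20)
shifted_last = (last[0] - 25, last[1] - 20)
print(shifted_first, shifted_last, fits(shifted_first, shifted_last, (90, 45)))
# (1, 18) (90, 45) True
```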

Proposed action

Fixing the metadata at source (XIOS) may be possible. It appears to involve only one source file (see details below), but it isn't clear how XIOS distinguishes between global domains and zooms (if it does at all). A pragmatic solution would be to add the missing zoom domain information via the XML files and to adapt REBUILD_NEMO to use this information if present. For example, adding two variable elements to the file definition in file_def_nemo-oce.xml:

file_def_nemo-oce.xml:
    <file_definition type="multiple_file" name="@expname@_@freq@_@startdate@_@enddate@" sync_freq="1mo" min_digits="4">

      <file_group id="5d" output_freq="5d"  output_level="10" enabled=".TRUE.">  <!-- 5d files -->
        <file id="file66" name_suffix="_zoom_T" description="ocean T grid variables" >
          <field field_ref="e3t"  grid_ref="zoom_T_3D"    />
          <field field_ref="toce" grid_ref="zoom_T_3D" name="thetao"   operation="instant" freq_op="5d" > @toce_e3t / @e3t </field>
          <variable name="DOMAIN_size_zoom_i" type="int"> 90 </variable>
          <variable name="DOMAIN_size_zoom_j" type="int"> 45 </variable>
        </file>
        <file id="file11" ....

results in:

ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
// global attributes:
                .
                .
		:ibegin = 25 ;
		:ni = 90 ;
		:jbegin = 20 ;
		:nj = 17 ;
		:DOMAIN_number_total = 2 ;
		:DOMAIN_number = 0 ;
		:DOMAIN_dimensions_ids = 2, 3 ;
		:DOMAIN_size_global = 180, 148 ;
		:DOMAIN_size_local = 90, 17 ;
		:DOMAIN_position_first = 26, 21 ;
		:DOMAIN_position_last = 115, 37 ;
		:DOMAIN_halo_size_start = 0, 0 ;
		:DOMAIN_halo_size_end = 0, 0 ;
		:DOMAIN_type = "box" ;
		:DOMAIN_size_zoom_i = 90 ;
		:DOMAIN_size_zoom_j = 45 ;

The remaining task is then to adapt REBUILD_NEMO so that if these new attributes are present:

  • DOMAIN_size_zoom_i and DOMAIN_size_zoom_j are used in place of DOMAIN_size_global
  • The ibegin and jbegin offsets are subtracted from the DOMAIN_position_first values when deciding where to place values into the output array.
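
The intended placement can be sketched as follows. This is an illustration in plain Python using nested lists and dummy data, not the Fortran implementation; `rebuild_zoom` and its arguments are hypothetical names.

```python
def rebuild_zoom(slabs, zoom_size, offset):
    """Assemble slabs into a zoom-sized 2D array (list of rows, data[j][i]).

    slabs     : list of (DOMAIN_position_first, data), positions are 1-based
    zoom_size : (DOMAIN_size_zoom_i, DOMAIN_size_zoom_j)
    offset    : (ibegin, jbegin) of the zoom origin, 0-based
    """
    ni, nj = zoom_size
    out = [[None] * ni for _ in range(nj)]
    for (i_first, j_first), data in slabs:
        i0 = i_first - offset[0] - 1   # 0-based column in the zoom array
        j0 = j_first - offset[1] - 1   # 0-based row in the zoom array
        for dj, row in enumerate(data):
            out[j0 + dj][i0:i0 + len(row)] = row
    return out

# Dummy data: every point holds the index of the file it came from.
slab0 = [[0] * 90 for _ in range(17)]   # file _0000, 90 x 17
slab1 = [[1] * 90 for _ in range(28)]   # file _0001, 90 x 28

zoom = rebuild_zoom([((26, 21), slab0), ((26, 38), slab1)], (90, 45), (25, 20))

filled = all(v is not None for row in zoom for v in row)
print(filled, zoom[16][0], zoom[17][0])  # True 0 1
```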

The following changes to rebuild_nemo.F90 achieve the required result (a modified version of the full code is attached, see: rebuild_nemo_modified.F90):

  • REBUILD_NEMO/src/

    @@ -70,8 +70,8 @@
        CHARACTER(LEN=50)  :: clibnc ! netcdf library version
     
        INTEGER :: ndomain, ifile, ndomain_file, nslicesize, deflate_level
    -   INTEGER :: ncid, outid, idim, istop
    -   INTEGER :: natts, attid, xtype, varid, rbdims
    +   INTEGER :: ncid, outid, idim, istop, istat
    +   INTEGER :: natts, attid, xtype, varid, rbdims, rbdims1
        INTEGER :: jv, ndims, nvars, dimlen, dimids(4)
        INTEGER :: dimid, unlimitedDimId, di, dj, dr
        INTEGER :: nmax_unlimited, nt, ntslice
    @@ -92,6 +92,8 @@
        INTEGER, DIMENSION(2) :: halo_start, halo_end, local_sizes
        INTEGER, DIMENSION(2) :: idomain, jdomain, rdomain, start_pos
        INTEGER :: ji, jj, jk, jl, jr
    +   INTEGER :: ni_zoom, nj_zoom, ni_off, nj_off
    +   LOGICAL :: iszoom
        INTEGER :: nargs                 ! number of arguments
        INTEGER, EXTERNAL :: iargc
     
    @@ -252,10 +254,16 @@
     
        CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_number_total', ndomain_file ) )
        IF( ndomain /= ndomain_file ) THEN
    -      WRITE(numerr,*) 'ERROR! : number of files to rebuild in file does not agree with namelist'
    +      WRITE(numerr,*) 'WARNING! : number of files to rebuild in file does not agree with namelist'
           WRITE(numerr,*) 'Attribute DOMAIN_number_total is : ', ndomain_file
           WRITE(numerr,*) 'Number of files specified in namelist is: ', ndomain
    -      STOP 2
    +      istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_i', xtype, rbdims, attid )
    +      IF ( istat == nf90_noerr ) THEN
    +          WRITE(numerr,*) 'This looks like a zoom region so I will assume you know what you are doing'
    +      ELSE
    +          WRITE(numerr,*) 'This is a potentially fatal error'
    +          STOP 2
    +      ENDIF
        ENDIF
     
     !2.1 Set up the output file
    @@ -275,6 +283,29 @@
     
        ALLOCATE(global_sizes(rbdims))
        CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_global', global_sizes ) )
    +
    +!2.2.0.1 Override global sizes if zoom attributes are found
    +   istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_i', xtype, rbdims1, attid )
    +   IF ( istat == nf90_noerr ) THEN
    +       CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_zoom_i', ni_zoom ) )
    +       iszoom = .true.
    +   ELSE
    +       iszoom = .false.
    +   ENDIF
    +   istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_j', xtype, rbdims1, attid )
    +   IF ( istat == nf90_noerr ) THEN
    +       CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_zoom_j', nj_zoom ) )
    +       iszoom = .true.
    +   ELSE
    +       iszoom = .false.
    +   ENDIF
    +   IF( iszoom ) THEN
    +       global_sizes(1) = ni_zoom
    +       global_sizes(2) = nj_zoom
    +       CALL check_nf90( nf90_get_att( ncid, nf90_global, 'ibegin', ni_off ) )
    +       CALL check_nf90( nf90_get_att( ncid, nf90_global, 'jbegin', nj_off ) )
    +   ENDIF
    +
        IF (l_verbose) WRITE(numout,*) 'Size of global arrays: ', global_sizes
     
     
    @@ -773,6 +804,9 @@
                    halo_start(2) = 0
                    di=rebuild_dims(1)
                    dj=3-di
    +            ELSEIF ( iszoom ) THEN
    +               start_pos(di) = start_pos(di) - ni_off
    +               start_pos(dj) = start_pos(dj) - nj_off
                 ENDIF
     
     !3.3.1 Generate local domain interior sizes from local_sizes and halo sizes
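
One point to note when reading the patch: iszoom is set by whichever of the two attribute checks runs last, so a file carrying only DOMAIN_size_zoom_j would be treated as a zoom with ni_zoom unset. A stricter variant, sketched here in Python rather than Fortran purely for brevity (`zoom_sizes` is a hypothetical helper, not code from rebuild_nemo.F90), treats the file as a zoom only when both attributes are present:

```python
def zoom_sizes(global_atts):
    """Return (ni_zoom, nj_zoom) if both zoom attributes exist, else None."""
    ni = global_atts.get("DOMAIN_size_zoom_i")
    nj = global_atts.get("DOMAIN_size_zoom_j")
    if ni is None or nj is None:
        return None
    return ni, nj

print(zoom_sizes({"DOMAIN_size_zoom_i": 90, "DOMAIN_size_zoom_j": 45}))  # (90, 45)
print(zoom_sizes({"DOMAIN_size_zoom_j": 45}))                            # None
print(zoom_sizes({}))                                                    # None
```

The patch also appears to read ibegin and jbegin from the same file handle as the other global attributes, so the offsets are only correct if that file holds the lower-left corner of the zoom, as _0000 does in this example.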

Notes for possibly tackling the problem at source

The attributes are written by XIOS in:

XIOS_2.5/src/io/nc4_data_output.cpp

by:

    if (server->intraCommSize > 1)
    {
       this->writeLocalAttributes(domain->zoom_ibegin,
                                  domain->zoom_ni,
                                  domain->zoom_jbegin,
                                  domain->zoom_nj,
                                  appendDomid);

       if (singleDomain)
       this->writeLocalAttributes_IOIPSL(dimXid, dimYid,
                                         domain->zoom_ibegin,
                                         domain->zoom_ni,
                                         domain->zoom_jbegin,
                                         domain->zoom_nj,
                                         domain->ni_glo,domain->nj_glo,
                                         server->intraCommRank,server->intraCommSize);


    }

and these functions are:

      void CNc4DataOutput::writeLocalAttributes
         (int ibegin, int ni, int jbegin, int nj, StdString domid)
      {
        try
        {
         SuperClassWriter::addAttribute(StdString("ibegin").append(domid), ibegin);
         SuperClassWriter::addAttribute(StdString("ni"    ).append(domid), ni);
         SuperClassWriter::addAttribute(StdString("jbegin").append(domid), jbegin);
         SuperClassWriter::addAttribute(StdString("nj"    ).append(domid), nj);
        }
        catch (CNetCdfException& e)
        {
           StdString msg("On writing Local Attributes: ");
           msg.append("In the context : ");
           CContext* context = CContext::getCurrent() ;
           msg.append(context->getId()); msg.append("\n");
           msg.append(e.what());
           ERROR("CNc4DataOutput::writeLocalAttributes \
                  (int ibegin, int ni, int jbegin, int nj, StdString domid)", << msg);
        }

      }

and

      void CNc4DataOutput::writeLocalAttributes_IOIPSL(const StdString& dimXid, const StdString& dimYid,
                                                       int ibegin, int ni, int jbegin, int nj, int ni_glo, int nj_glo, int rank, int size)
      {
         CArray<int,1> array(2) ;

         try
         {
           SuperClassWriter::addAttribute("DOMAIN_number_total",size ) ;
           SuperClassWriter::addAttribute("DOMAIN_number", rank) ;
           array = SuperClassWriter::getDimension(dimXid) + 1, SuperClassWriter::getDimension(dimYid) + 1;
           SuperClassWriter::addAttribute("DOMAIN_dimensions_ids",array) ;
           array=ni_glo,nj_glo ;
           SuperClassWriter::addAttribute("DOMAIN_size_global", array) ;
           array=ni,nj ;
           SuperClassWriter::addAttribute("DOMAIN_size_local", array) ;
           array=ibegin+1,jbegin+1 ;
           SuperClassWriter::addAttribute("DOMAIN_position_first", array) ;
           array=ibegin+ni-1+1,jbegin+nj-1+1 ;
           SuperClassWriter::addAttribute("DOMAIN_position_last",array) ;
           array=0,0 ;
           SuperClassWriter::addAttribute("DOMAIN_halo_size_start", array) ;
           SuperClassWriter::addAttribute("DOMAIN_halo_size_end", array);
           SuperClassWriter::addAttribute("DOMAIN_type",string("box")) ;
         }
         catch (CNetCdfException& e)
         {
           StdString msg("On writing Local Attributes IOIPSL \n");
           msg.append("In the context : ");
           CContext* context = CContext::getCurrent() ;
           msg.append(context->getId()); msg.append("\n");
           msg.append(e.what());
           ERROR("CNc4DataOutput::writeLocalAttributes_IOIPSL \
                  (int ibegin, int ni, int jbegin, int nj, int ni_glo, int nj_glo, int rank, int size)", << msg);
         }
      }
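
If the fix were made in XIOS instead, the arithmetic in writeLocalAttributes_IOIPSL would need the zoom origin and extent as well as the local values. The following Python sketch mirrors the existing calculation and adds a hypothetical zoom-relative mode. The parameters `zoom_origin`, `zoom_size` and `n_writers` are assumptions made for illustration and do not correspond to known XIOS variables.

```python
def ioipsl_attributes(ibegin, ni, jbegin, nj, ni_glo, nj_glo, rank, size,
                      zoom_origin=None, zoom_size=None, n_writers=None):
    """Mimic the attribute arithmetic of writeLocalAttributes_IOIPSL.

    With zoom_origin/zoom_size/n_writers set (hypothetical inputs), positions
    and sizes are expressed relative to the zoom region instead of the globe.
    """
    i_off, j_off = zoom_origin if zoom_origin else (0, 0)
    return {
        "DOMAIN_number_total": n_writers if n_writers else size,
        "DOMAIN_number": rank,
        "DOMAIN_size_global": zoom_size if zoom_size else (ni_glo, nj_glo),
        "DOMAIN_size_local": (ni, nj),
        "DOMAIN_position_first": (ibegin - i_off + 1, jbegin - j_off + 1),
        "DOMAIN_position_last": (ibegin - i_off + ni, jbegin - j_off + nj),
    }

# Current behaviour for file _0001 (matches the ncdump output shown earlier).
now = ioipsl_attributes(25, 90, 37, 28, 180, 148, rank=1, size=4)

# Hypothetical zoom-aware behaviour for the same server.
fixed = ioipsl_attributes(25, 90, 37, 28, 180, 148, rank=1, size=4,
                          zoom_origin=(25, 20), zoom_size=(90, 45), n_writers=2)

print(now["DOMAIN_position_first"], now["DOMAIN_position_last"])      # (26, 38) (115, 65)
print(fixed["DOMAIN_position_first"], fixed["DOMAIN_position_last"])  # (1, 18) (90, 45)
```

With attributes written this way, an unmodified REBUILD_NEMO would see a self-consistent 90x45 "global" domain split across two files.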
