Version 3 (modified by acc, 4 years ago)
Does XIOS add sufficient and accurate attribute metadata to rebuild zoom datasets correctly?
There appears to be insufficient or incorrect information in zoom domain files to rebuild whole datasets when the zoom region spans more than one XIOS server and is written to multiple files.
For example, consider this zoom in an 8x4 decomposition of ORCA2_ICE_PISCES, defined by the following additions to the XML:
domain_def_nemo.xml:

    <!-- My zoom: example of hand defined zoom -->
    <domain id="myzoomT" domain_ref="grid_T" >
      <zoom_domain ibegin="25" jbegin="20" ni="90" nj="45"/>
    </domain>

grid_def_nemo.xml:

    <grid id="zoom_T_3D" >
      <domain domain_ref="myzoomT" />
      <axis axis_ref="deptht" />
    </grid>

file_def_nemo-oce.xml:

    <file_definition type="multiple_file" name="@expname@_@freq@_@startdate@_@enddate@" sync_freq="1mo" min_digits="4">
      <file_group id="5d" output_freq="5d" output_level="10" enabled=".TRUE."> <!-- 5d files -->
        <file id="file66" name_suffix="_zoom_T" description="ocean T grid variables" >
          <field field_ref="e3t" grid_ref="zoom_T_3D" />
          <field field_ref="toce" grid_ref="zoom_T_3D" name="thetao" operation="instant" freq_op="5d" > @toce_e3t / @e3t </field>
        </file>
        <file id="file11" ....
In an 8x4 decomposition using 4 external XIOS servers, the following output files are produced for the 90x45 zoom region:
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
with the following attribute data in each respectively:
    ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc

    // global attributes:
    .
    .
        :ibegin = 25 ;
        :ni = 90 ;
        :jbegin = 20 ;
        :nj = 17 ;
        :DOMAIN_number_total = 4 ;
        :DOMAIN_number = 0 ;
        :DOMAIN_dimensions_ids = 2, 3 ;
        :DOMAIN_size_global = 180, 148 ;
        :DOMAIN_size_local = 90, 17 ;
        :DOMAIN_position_first = 26, 21 ;
        :DOMAIN_position_last = 115, 37 ;
        :DOMAIN_halo_size_start = 0, 0 ;
        :DOMAIN_halo_size_end = 0, 0 ;
        :DOMAIN_type = "box" ;

    ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
    .
    .
        :ibegin = 25 ;
        :ni = 90 ;
        :jbegin = 37 ;
        :nj = 28 ;
        :DOMAIN_number_total = 4 ;
        :DOMAIN_number = 1 ;
        :DOMAIN_dimensions_ids = 2, 3 ;
        :DOMAIN_size_global = 180, 148 ;
        :DOMAIN_size_local = 90, 28 ;
        :DOMAIN_position_first = 26, 38 ;
        :DOMAIN_position_last = 115, 65 ;
        :DOMAIN_halo_size_start = 0, 0 ;
        :DOMAIN_halo_size_end = 0, 0 ;
        :DOMAIN_type = "box" ;
The production of two files is correct because only two of the 4 XIOS servers are dealing with the zoom region. The data within each file is also correct but two issues with the attribute metadata prevent REBUILD_NEMO (and similar tools) from rebuilding the files correctly:
- DOMAIN_number_total needs to be 2, not 4; otherwise REBUILD_NEMO will fail.
- DOMAIN_size_global will be used to determine the size of the collated dataset. What is actually wanted is to collate these data into a dataset of the whole zoom region (90x45). This information is not contained in the metadata.
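Both issues can be checked by hand from the two attribute listings. The C++ sketch below is only an illustration of that arithmetic: the struct and function names (`SubdomainAttrs`, `coveredExtent`, `fileCountMatches`, `exampleFiles`) are hypothetical and are not part of XIOS or REBUILD_NEMO. It assumes the 90x45 zoom extent equals the bounding box of the subdomains once every file is inspected, even though no single file records it.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Subset of the per-file global attributes shown by ncdump (hypothetical struct).
struct SubdomainAttrs {
    int number_total;              // DOMAIN_number_total
    std::array<int, 2> pos_first;  // DOMAIN_position_first (1-based, inclusive)
    std::array<int, 2> pos_last;   // DOMAIN_position_last  (1-based, inclusive)
};

// Extent (ni, nj) of the smallest box covering every subdomain.
// Precondition: files is not empty.
std::array<int, 2> coveredExtent(const std::vector<SubdomainAttrs>& files) {
    std::array<int, 2> lo = files.front().pos_first;
    std::array<int, 2> hi = files.front().pos_last;
    for (const SubdomainAttrs& f : files) {
        for (int d = 0; d < 2; ++d) {
            lo[d] = std::min(lo[d], f.pos_first[d]);
            hi[d] = std::max(hi[d], f.pos_last[d]);
        }
    }
    std::array<int, 2> extent;
    extent[0] = hi[0] - lo[0] + 1;
    extent[1] = hi[1] - lo[1] + 1;
    return extent;
}

// True when the number of files present matches DOMAIN_number_total.
// Precondition: files is not empty.
bool fileCountMatches(const std::vector<SubdomainAttrs>& files) {
    return static_cast<int>(files.size()) == files.front().number_total;
}

// Attribute values copied from the two ncdump listings.
std::vector<SubdomainAttrs> exampleFiles() {
    std::vector<SubdomainAttrs> files;
    files.push_back(SubdomainAttrs{4, {26, 21}, {115, 37}});
    files.push_back(SubdomainAttrs{4, {26, 38}, {115, 65}});
    return files;
}
```

With the values above, `fileCountMatches(exampleFiles())` is false (2 files against a stated total of 4), while `coveredExtent(exampleFiles())` gives 90 by 45, in contrast with the 180x148 recorded in DOMAIN_size_global.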
The first issue could be dealt with using an ncatted command on the first dataset. For example, the failure:
    rebuild_nemo -n nl.reb O2L3P_LONG_5d_00010101_00010303_zoom_T 2

    file O2L3P_LONG_5d_00010101_00010303_zoom_T, num_domains 2, num_threads 1
    Rebuilding the following files:
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
    ERROR! : number of files to rebuild in file does not agree with namelist
    Attribute DOMAIN_number_total is : 4
    Number of files specified in namelist is: 2
    2
can be fixed with:
    ncatted -a DOMAIN_number_total,global,m,d,2 O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
    rebuild_nemo -n nl.reb O2L3P_LONG_5d_00010101_00010303_zoom_T 2

    file O2L3P_LONG_5d_00010101_00010303_zoom_T, num_domains 2, num_threads 1
    Rebuilding the following files:
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc
    O2L3P_LONG_5d_00010101_00010303_zoom_T_0001.nc
    Size of global arrays: 180 148
    .
    .
    Closing input files...
    Closing output file...
    NEMO rebuild completed successfully
This successfully rebuilds the zoom but places it in an otherwise empty global domain.
Fixing the second issue is trickier. Simply editing the DOMAIN_size_global settings will not suffice because REBUILD_NEMO also uses the DOMAIN_position_first information to place data within the global arrays. Changing the size but not the offset results in bus errors.
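One plausible reading of that failure is that the unchanged positions point outside the shrunken output array. The C++ sketch below only illustrates the bounds arithmetic using the attribute values of the first file; `fitsInOutput` and the `k...` constants are hypothetical names, not REBUILD_NEMO code, and the sketch does not claim to reproduce the tool's actual memory access pattern.

```cpp
#include <array>

// True when a block with the given 1-based, inclusive corner positions
// fits inside an output array of the given size (hypothetical helper).
bool fitsInOutput(const std::array<int, 2>& pos_first,
                  const std::array<int, 2>& pos_last,
                  const std::array<int, 2>& output_size) {
    for (int d = 0; d < 2; ++d) {
        if (pos_first[d] < 1 || pos_last[d] > output_size[d]) return false;
    }
    return true;
}

// Values from the _0000.nc attributes.
const std::array<int, 2> kPosFirst = {26, 21};     // DOMAIN_position_first
const std::array<int, 2> kPosLast = {115, 37};     // DOMAIN_position_last
const std::array<int, 2> kGlobalSize = {180, 148}; // DOMAIN_size_global as written
const std::array<int, 2> kZoomSize = {90, 45};     // size after a hand edit
```

Here `fitsInOutput(kPosFirst, kPosLast, kGlobalSize)` holds, but `fitsInOutput(kPosFirst, kPosLast, kZoomSize)` does not, because the last i position (115) exceeds the edited i size (90).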
Proposed action
Fixing the metadata at source (XIOS) may be possible. It appears to involve only one module file (see details below), but it isn't clear how XIOS distinguishes between global domains and zooms (if it does at all). A pragmatic solution would be to add the missing zoom domain information via the XML files and to adapt REBUILD_NEMO to use this information if present. For example, adding the following to file_def_nemo-oce.xml:
file_def_nemo-oce.xml:

    <file_definition type="multiple_file" name="@expname@_@freq@_@startdate@_@enddate@" sync_freq="1mo" min_digits="4">
      <file_group id="5d" output_freq="5d" output_level="10" enabled=".TRUE."> <!-- 5d files -->
        <file id="file66" name_suffix="_zoom_T" description="ocean T grid variables" >
          <field field_ref="e3t" grid_ref="zoom_T_3D" />
          <field field_ref="toce" grid_ref="zoom_T_3D" name="thetao" operation="instant" freq_op="5d" > @toce_e3t / @e3t </field>
          <variable name="DOMAIN_size_zoom_i" type="int"> 90 </variable>
          <variable name="DOMAIN_size_zoom_j" type="int"> 45 </variable>
        </file>
        <file id="file11" ....
results in:
    ncdump -h O2L3P_LONG_5d_00010101_00010303_zoom_T_0000.nc

    // global attributes:
    .
    .
        :ibegin = 25 ;
        :ni = 90 ;
        :jbegin = 20 ;
        :nj = 17 ;
        :DOMAIN_number_total = 2 ;
        :DOMAIN_number = 0 ;
        :DOMAIN_dimensions_ids = 2, 3 ;
        :DOMAIN_size_global = 180, 148 ;
        :DOMAIN_size_local = 90, 17 ;
        :DOMAIN_position_first = 26, 21 ;
        :DOMAIN_position_last = 115, 37 ;
        :DOMAIN_halo_size_start = 0, 0 ;
        :DOMAIN_halo_size_end = 0, 0 ;
        :DOMAIN_type = "box" ;
        :DOMAIN_size_zoom_i = 90 ;
        :DOMAIN_size_zoom_j = 45 ;
The remaining task is then to adapt REBUILD_NEMO so that if these new attributes are present:
- DOMAIN_size_zoom_i and DOMAIN_size_zoom_j are used in place of DOMAIN_size_global
- The ibegin and jbegin offsets are subtracted from the DOMAIN_position_first values when deciding where to place values into the output array.
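The index arithmetic in the second point can be sketched as follows. This is a C++ illustration only, with hypothetical names (`Placement`, `placeInZoom`, `exampleFile0`, `exampleFile1`); the real change belongs in rebuild_nemo.F90. It assumes the offsets are taken once from the first file (ibegin = 25, jbegin = 20) and applied to every subdomain, which in turn assumes that the first file holds the lower-left corner of the zoom.

```cpp
#include <array>

// Where one subdomain lands in the rebuilt array (1-based, inclusive bounds).
struct Placement {
    std::array<int, 2> first;
    std::array<int, 2> last;
};

// Shift DOMAIN_position_first by the zoom offsets (ibegin, jbegin) so that
// positions are relative to the zoom region instead of the global domain.
Placement placeInZoom(const std::array<int, 2>& pos_first,
                      const std::array<int, 2>& local_size,
                      const std::array<int, 2>& zoom_offset) {
    Placement p;
    for (int d = 0; d < 2; ++d) {
        p.first[d] = pos_first[d] - zoom_offset[d];
        p.last[d] = p.first[d] + local_size[d] - 1;
    }
    return p;
}

// Values from the two files; the offsets (25, 20) come from file _0000.nc.
Placement exampleFile0() { return placeInZoom({26, 21}, {90, 17}, {25, 20}); }
Placement exampleFile1() { return placeInZoom({26, 38}, {90, 28}, {25, 20}); }
```

Under these assumptions the first file occupies rows 1 to 17 and the second rows 18 to 45 of a 90x45 output array, so the two pieces tile the zoom region without gaps or overlap.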
The following changes to rebuild_nemo.F90 achieve the required result (a modified version of the full code is attached, see: rebuild_nemo_modified.F90):
REBUILD_NEMO/src/

    @@ -70,8 +70,8 @@
        CHARACTER(LEN=50) :: clibnc ! netcdf library version

        INTEGER :: ndomain, ifile, ndomain_file, nslicesize, deflate_level
    -   INTEGER :: ncid, outid, idim, istop
    -   INTEGER :: natts, attid, xtype, varid, rbdims
    +   INTEGER :: ncid, outid, idim, istop, istat
    +   INTEGER :: natts, attid, xtype, varid, rbdims, rbdims1
        INTEGER :: jv, ndims, nvars, dimlen, dimids(4)
        INTEGER :: dimid, unlimitedDimId, di, dj, dr
        INTEGER :: nmax_unlimited, nt, ntslice
    @@ -92,6 +92,8 @@
        INTEGER, DIMENSION(2) :: halo_start, halo_end, local_sizes
        INTEGER, DIMENSION(2) :: idomain, jdomain, rdomain, start_pos
        INTEGER :: ji, jj, jk, jl, jr
    +   INTEGER :: ni_zoom, nj_zoom, ni_off, nj_off
    +   LOGICAL :: iszoom
        INTEGER :: nargs ! number of arguments
        INTEGER, EXTERNAL :: iargc

    @@ -252,10 +254,18 @@

        CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_number_total', ndomain_file ) )
        IF( ndomain /= ndomain_file ) THEN
    -      WRITE(numerr,*) ' ERROR! : number of files to rebuild in file does not agree with namelist'
    +      WRITE(numerr,*) 'WARNING! : number of files to rebuild in file does not agree with namelist'
           WRITE(numerr,*) 'Attribute DOMAIN_number_total is : ', ndomain_file
           WRITE(numerr,*) 'Number of files specified in namelist is: ', ndomain
    -      STOP 2
    +      istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_i', xtype, rbdims, attid )
    +      IF ( istat == nf90_noerr ) THEN
    +         WRITE(numerr,*) 'This looks like a zoom region so I will assume you know what you are doing'
    +      ELSE
    +         WRITE(numerr,*) 'This is a potentially fatal error'
    +         STOP 2
    +      ENDIF
        ENDIF

        !2.1 Set up the output file
    @@ -275,6 +283,37 @@

        ALLOCATE(global_sizes(rbdims))
        CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_global', global_sizes ) )
    +
    +   !2.2.0.1 Override global sizes if zoom attributes are found
    +   istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_i', xtype, rbdims1, attid )
    +   IF ( istat == nf90_noerr ) THEN
    +      CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_zoom_i', ni_zoom ) )
    +      iszoom = .true.
    +   ELSE
    +      iszoom = .false.
    +   ENDIF
    +   istat = nf90_inquire_attribute( ncid, nf90_global, 'DOMAIN_size_zoom_j', xtype, rbdims1, attid )
    +   IF ( istat == nf90_noerr ) THEN
    +      CALL check_nf90( nf90_get_att( ncid, nf90_global, 'DOMAIN_size_zoom_j', nj_zoom ) )
    +      iszoom = .true.
    +   ELSE
    +      iszoom = .false.
    +   ENDIF
    +   IF( iszoom ) THEN
    +      global_sizes(1) = ni_zoom
    +      global_sizes(2) = nj_zoom
    +      CALL check_nf90( nf90_get_att( ncid, nf90_global, 'ibegin', ni_off ) )
    +      CALL check_nf90( nf90_get_att( ncid, nf90_global, 'jbegin', nj_off ) )
    +   ENDIF
    +
        IF (l_verbose) WRITE(numout,*) 'Size of global arrays: ', global_sizes


    @@ -773,6 +804,9 @@
           halo_start(2) = 0
           di=rebuild_dims(1)
           dj=3-di
    +   ELSEIF ( iszoom ) THEN
    +      start_pos(di) = start_pos(di) - ni_off
    +      start_pos(dj) = start_pos(dj) - nj_off
        ENDIF

        !3.3.1 Generate local domain interior sizes from local_sizes and halo sizes
Notes for possibly tackling the problem at source
The attributes are written by XIOS in:
XIOS_2.5/src/io/nc4_data_output.cpp
by:
    if (server->intraCommSize > 1)
    {
      this->writeLocalAttributes(domain->zoom_ibegin, domain->zoom_ni,
                                 domain->zoom_jbegin, domain->zoom_nj,
                                 appendDomid);

      if (singleDomain)
        this->writeLocalAttributes_IOIPSL(dimXid, dimYid,
                                          domain->zoom_ibegin, domain->zoom_ni,
                                          domain->zoom_jbegin, domain->zoom_nj,
                                          domain->ni_glo,domain->nj_glo,
                                          server->intraCommRank,server->intraCommSize);
    }
and these functions are:
    void CNc4DataOutput::writeLocalAttributes
       (int ibegin, int ni, int jbegin, int nj, StdString domid)
    {
      try
      {
        SuperClassWriter::addAttribute(StdString("ibegin").append(domid), ibegin);
        SuperClassWriter::addAttribute(StdString("ni" ).append(domid), ni);
        SuperClassWriter::addAttribute(StdString("jbegin").append(domid), jbegin);
        SuperClassWriter::addAttribute(StdString("nj" ).append(domid), nj);
      }
      catch (CNetCdfException& e)
      {
        StdString msg("On writing Local Attributes: ");
        msg.append("In the context : ");
        CContext* context = CContext::getCurrent() ;
        msg.append(context->getId()); msg.append("\n");
        msg.append(e.what());
        ERROR("CNc4DataOutput::writeLocalAttributes \
               (int ibegin, int ni, int jbegin, int nj, StdString domid)", << msg);
      }
    }
and
    void CNc4DataOutput::writeLocalAttributes_IOIPSL(const StdString& dimXid, const StdString& dimYid,
                                                     int ibegin, int ni, int jbegin, int nj,
                                                     int ni_glo, int nj_glo, int rank, int size)
    {
      CArray<int,1> array(2) ;

      try
      {
        SuperClassWriter::addAttribute("DOMAIN_number_total",size ) ;
        SuperClassWriter::addAttribute("DOMAIN_number", rank) ;
        array = SuperClassWriter::getDimension(dimXid) + 1, SuperClassWriter::getDimension(dimYid) + 1;
        SuperClassWriter::addAttribute("DOMAIN_dimensions_ids",array) ;
        array=ni_glo,nj_glo ;
        SuperClassWriter::addAttribute("DOMAIN_size_global", array) ;
        array=ni,nj ;
        SuperClassWriter::addAttribute("DOMAIN_size_local", array) ;
        array=ibegin+1,jbegin+1 ;
        SuperClassWriter::addAttribute("DOMAIN_position_first", array) ;
        array=ibegin+ni-1+1,jbegin+nj-1+1 ;
        SuperClassWriter::addAttribute("DOMAIN_position_last",array) ;
        array=0,0 ;
        SuperClassWriter::addAttribute("DOMAIN_halo_size_start", array) ;
        SuperClassWriter::addAttribute("DOMAIN_halo_size_end", array);
        SuperClassWriter::addAttribute("DOMAIN_type",string("box")) ;
      }
      catch (CNetCdfException& e)
      {
        StdString msg("On writing Local Attributes IOIPSL \n");
        msg.append("In the context : ");
        CContext* context = CContext::getCurrent() ;
        msg.append(context->getId()); msg.append("\n");
        msg.append(e.what());
        ERROR("CNc4DataOutput::writeLocalAttributes_IOIPSL \
               (int ibegin, int ni, int jbegin, int nj, int ni_glo, int nj_glo, int rank, int size)", << msg);
      }
    }
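As a cross-check, the position arithmetic in writeLocalAttributes_IOIPSL can be replayed against the ncdump output. The standalone C++ sketch below copies only the two expressions for DOMAIN_position_first and DOMAIN_position_last; `IoipslPositions` and `ioipslPositions` are hypothetical names and the code does not call XIOS.

```cpp
#include <array>

// The two position attributes, in (i, j) order (hypothetical struct).
struct IoipslPositions {
    std::array<int, 2> first;  // DOMAIN_position_first
    std::array<int, 2> last;   // DOMAIN_position_last
};

// Same arithmetic as in writeLocalAttributes_IOIPSL: zero-based begin
// indices are converted to one-based, inclusive positions.
IoipslPositions ioipslPositions(int ibegin, int ni, int jbegin, int nj) {
    IoipslPositions p;
    p.first[0] = ibegin + 1;
    p.first[1] = jbegin + 1;
    p.last[0] = ibegin + ni - 1 + 1;
    p.last[1] = jbegin + nj - 1 + 1;
    return p;
}
```

Calling `ioipslPositions(25, 90, 20, 17)` gives first = (26, 21) and last = (115, 37), and `ioipslPositions(25, 90, 37, 28)` gives first = (26, 38) and last = (115, 65), matching the attributes in the two files. The positions are therefore expressed in global-domain coordinates, which is consistent with the rebuilt zoom being placed inside a 180x148 array.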
Attachments (3)
- rebuild_nemo_modified.F90 (64.1 KB) - added by acc 4 years ago.
- zoom_full.png (40.5 KB) - added by acc 4 years ago.
- zoom_only.png (23.7 KB) - added by acc 4 years ago.