Custom Query (126 matches)

Results (103 - 105 of 126)

Ticket Resolution Summary Owner Reporter
#192 fixed Possible evidence of memory leaks (XIOS3) ymipsl acc
Description

XIOS3-trunk is still a work in progress so the purpose of this ticket is not to complain about possible memory leaks but rather to present some evidence of current behaviour that may aid in detection.

An eORCA025 configuration has been run on two clusters: an Intel Ice Lake cluster using ifort and Intel MPI, and an AMD cluster using Cray compilers, Cray-MPICH and UCX protocols. The Intel test successfully completes a year of integration with 1019 ocean cores and 52 XIOS3 servers. Full 5-day, monthly and annual mean files are produced. The Cray test crashes in month 7 with a probable OOM error. In both cases, monitoring the total memory used per node shows persistent growth throughout the run. The 5-day and monthly means are split at monthly intervals, so usage should plateau after a few months. A clear cycle of monthly events is evident, but usage never levels off. Some nodes on the Intel cluster appear to free memory in the third quarter. Unfortunately, the Cray test reaches its limits before that point.
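The ticket does not say how per-node memory was sampled. As a minimal sketch, assuming Linux nodes where `/proc/meminfo` is readable, the following Python parses a snapshot of that file and applies a simple (hypothetical) check for whether a series of samples is still growing rather than levelling off. The function names, the window size and the tolerance are illustrative choices, not part of XIOS or NEMO.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_kb(fields):
    """Memory in use, estimated as MemTotal minus MemAvailable (both in kB)."""
    return fields["MemTotal"] - fields["MemAvailable"]


def still_growing(samples, window=3, tolerance_kb=0):
    """Compare the mean of the last `window` samples with the window before it.

    Returns True when the recent mean exceeds the earlier mean by more than
    `tolerance_kb`, i.e. usage has not plateaued by this (crude) measure.
    """
    if len(samples) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    recent = sum(samples[-window:]) / window
    earlier = sum(samples[-2 * window:-window]) / window
    return recent - earlier > tolerance_kb
```

On a compute node, `used_kb(parse_meminfo(open("/proc/meminfo").read()))` could be logged at a fixed interval, with `window` chosen to span at least one monthly output cycle so that the monthly peaks do not mask the underlying trend.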

Graphs are attached:

#194 fixed xios test suite results file mhedley mhedley
Description

I think this applies to both xios-3.0-beta and trunk, but I have only tested it properly on xios-3.0-beta.

The checkfile.def test file for test_domain_algo https://forge.ipsl.jussieu.fr/ioserver/browser/XIOS3/branches/xios-3.0-beta/xios_test_suite/TEST_SUITE/test_domain_algo/checkfile.def#L5

lists an expected output file that is not produced by the test.

I think that the test is running correctly, and that this file is simply not produced.

I propose a change, removing line #5:

atm_output_domain_transformation_zoom.nc

See the attached xios-3.0-beta-test-run-pass, which demonstrates the change, compared with the failing run xios-3.0-beta-test-run-fail.

  1. Is this a valid change to propose?
  2. Should it apply to both xios-3.0-beta and trunk?
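The mismatch described in this ticket (a file listed in checkfile.def that the test never writes) can be checked mechanically. The sketch below assumes, without confirmation from the XIOS test suite itself, that checkfile.def holds one expected file name per line; `read_expected` and `missing_outputs` are hypothetical helper names, not part of the test suite.

```python
def read_expected(checkfile_text):
    """Return the non-empty lines of a checkfile.def-style listing.

    Assumption: one expected output file name per line.
    """
    return [line.strip() for line in checkfile_text.splitlines() if line.strip()]


def missing_outputs(expected, produced):
    """Return expected file names that were not produced, in listing order."""
    produced_set = set(produced)
    return [name for name in expected if name not in produced_set]
```

Passing the contents of checkfile.def and the names returned by `os.listdir()` on the test's run directory would report any listed file that the test did not produce, such as atm_output_domain_transformation_zoom.nc in the failing run.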
#196 fixed XIOS3 not recognising matching dimensions ymipsl acc
Description

I'm having issues trying to create output containing fields with different source grids that map to the same global grid. In particular, this is a global NEMO grid collated from MPP domains, which supply a mixture of full MPP domain data and MPP domain data from the inner (haloes removed) section only. These grids are defined with the same name attribute:

  <grid id="grid_U_3D" >
    <domain domain_ref="grid_U" />
    <axis axis_ref="depthu" />
  </grid>
  <grid id="grid_U_3D_inner" >
    <domain domain_ref="grid_U_inner" name="grid_U" /> <!-- use name="grid_U" so we don't duplicate x, y, dimensions -->
    <axis axis_ref="depthu" />
  </grid>

but trying to write a file containing a mix of variables such as:

  <field_group id="grid_U"   grid_ref="grid_U_2D">

    <field id="e3u"    long_name="U-cell thickness"             standard_name="cell_thickness"       unit="m"      grid_ref="grid_U_3D_inner"   />
 
    <field id="uoce"   long_name="ocean current along i-axis"   standard_name="sea_water_x_velocity" unit="m/s"    grid_ref="grid_U_3D"   />
.
.
        <file id="file12" name_suffix="_grid_U" mode="write" gatherer="ugatherer" writer="uwriter" using_server2="true" description="ocean U grid variables" >
          <field field_ref="e3u" />
          <field field_ref="uoce"         name="uo"   />
.
.

results in:

cat xios_server_09.err
In file "nc4_data_output.cpp", function "void xios::CNc4DataOutput::writeDomain_(xios::CDomain *)",  line 476 -> On writing the domain : Opool__ugatherer_0__nemo__domain_undef_id_1
In the context : Opool__uwriter_0__nemo
Error when calling function nc_def_dim(ncid, dimName.c_str(), dimLen, &dimId)
NetCDF: String match to name in use
Unable to create dimension with name: x and with length 180

180 is the correct size for the global grid, but the two server levels in play (ugatherer and uwriter) do not seem to recognise that the dimensions have already been defined. The contents of the incomplete output file are:

ncdump -h O2L3P_LONG_5d_00010101_00010303_grid_U.nc
netcdf O2L3P_LONG_5d_00010101_00010303_grid_U {
dimensions:
	axis_nbounds = 2 ;
	x = 180 ;
	y = 148 ;
	depthu = 31 ;
variables:
	float nav_lat(y, x) ;
		nav_lat:standard_name = "latitude" ;
		nav_lat:long_name = "Latitude" ;
		nav_lat:units = "degrees_north" ;
	float nav_lon(y, x) ;
		nav_lon:standard_name = "longitude" ;
		nav_lon:long_name = "Longitude" ;
		nav_lon:units = "degrees_east" ;
	float depthu(depthu) ;
		depthu:name = "depthu" ;
		depthu:long_name = "Vertical U levels" ;
		depthu:units = "m" ;
		depthu:positive = "down" ;
		depthu:bounds = "depthu_bounds" ;
	float depthu_bounds(depthu, axis_nbounds) ;
		depthu_bounds:units = "m" ;

// global attributes:
		:name = "O2L3P_LONG_5d_00010101_00010303_grid_U" ;
		:description = "ocean U grid variables" ;
		:title = "ocean U grid variables" ;
		:Conventions = "CF-1.6" ;
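The error reported above, "NetCDF: String match to name in use", is what `nc_def_dim` returns when a dimension name already exists in the file. The header shows that `x = 180` was defined once, so the failure is consistent with a second attempt to define `x` for the other domain. The following is a pure-Python model of that behaviour and of a "define or reuse" alternative. It is an illustration only, not XIOS code and not the NetCDF API: `DimTable`, `def_dim` and `def_or_reuse` are hypothetical names.

```python
class NameInUseError(Exception):
    """Stand-in for the NetCDF 'String match to name in use' error."""


class DimTable:
    """Toy model of the dimension table of a single output file."""

    def __init__(self):
        self._dims = {}  # dimension name -> length

    def def_dim(self, name, length):
        """Define a new dimension; fail if the name exists (as nc_def_dim does)."""
        if name in self._dims:
            raise NameInUseError(
                f"Unable to create dimension with name: {name} and with length {length}"
            )
        self._dims[name] = length
        return name

    def def_or_reuse(self, name, length):
        """Reuse an existing dimension when the length matches, else define it."""
        existing = self._dims.get(name)
        if existing is None:
            return self.def_dim(name, length)
        if existing != length:
            raise ValueError(
                f"dimension {name} already has length {existing}, not {length}"
            )
        return name
```

In this model, two domains that share the name `grid_U` and the same global size would both resolve to the existing `x` and `y` through `def_or_reuse`, whereas a plain second `def_dim` call reproduces the failure seen in the log.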

This is using rev 2634 of XIOS3 with the following services:

 <context id="xios" >
    <variable_definition>
      <variable_group id="buffer">
        <variable id="min_buffer_size" type="int">400000</variable>
        <variable id="optimal_buffer_size" type="string">performance</variable>
      </variable_group>

      <variable_group id="parameters" >
        <variable id="using_server" type="bool">true</variable>
        <variable id="info_level" type="int">0</variable>
        <variable id="print_file" type="bool">false</variable>
        <variable id="using_server2" type="bool">false</variable>
        <variable id="transport_protocol" type="string" >p2p</variable>
        <variable id="using_oasis"      type="bool">false</variable>
      </variable_group>
    </variable_definition>
    <pool_definition>
     <pool name="Opool" nprocs="12">
      <service name="tgatherer" nprocs="2" type="gatherer"/>
      <service name="igatherer" nprocs="2" type="gatherer"/>
      <service name="ugatherer" nprocs="2" type="gatherer"/>
      <service name="pgatherer" nprocs="2" type="gatherer"/>
      <service name="twriter" nprocs="1" type="writer"/>
      <service name="uwriter" nprocs="1" type="writer"/>
      <service name="iwriter" nprocs="1" type="writer"/>
      <service name="pwriter" nprocs="1" type="writer"/>
     </pool>
    </pool_definition>
  </context>

Is there a trick to making this work correctly? Or any tips on where to look for errors in my setup?
