Custom Query (115 matches)



Results (97 - 99 of 115)

Ticket Resolution Summary Owner Reporter
#90 fixed MPI dead lock in XIOS ymipsl mcastril

We are experiencing a recurring issue with XIOS 1.0. It first appeared when using NEMO 3.6 stable on more than 2600 cores, and it seemed to be solved by switching to the Intel 16 compiler and IMPI 5. However, after updating to the current NEMO 3.6 stable revision, the problem appears when using 1920 or more cores. I don't really understand how the NEMO revision change could affect this, but there it is.

The problem occurs at this line of client.cpp:

MPI_Send(buff,buffer.count(),MPI_CHAR,serverLeader,1,CXios::globalComm) ;

Meanwhile, server.cpp calls MPI_Iprobe continuously in order to receive all the MPI_Send messages.

What we have observed is that with a high number of cores, around 80-100 of them get stuck at the MPI_Send, causing the run to hang and never complete. The fact that, for a given number of cores, the issue appears about 80% of the time but not always made us think that it could be related to the IMPI implementation.

#44 fixed problem with xios_field_is_activate ymipsl jgipsl

Concerning XIOS rev 477

The problem is illustrated by the following example in ORCHIDEE:

       IF (xios_field_is_active("RootMoist") .OR. xios_field_is_active("DelSoilMoist") .OR. &
            xios_field_is_active("DelIntercept") .OR. xios_field_is_active("DelSWE") .OR. &
            xios_field_is_active("SoilWet")) THEN

          WRITE(numout,*) 'almaoutput has been set to true in xios_orchidee_init'
       END IF

Case 1) Everything is fine when these variables are declared only in field_def_orchidee.xml: almaoutput becomes FALSE. See the example on curie: /ccc/scratch/cont003/dsm/p86ghatt/LMDZOR/XIOS/RUNDIR/TESTING/RUN_rev1960/prod_mpi_omp/TestAlma

Case 2) If one or several of these variables are set in file_def_orchidee.xml but the file is deactivated, the result is almaoutput=TRUE. This is not correct. Here is an extract from file_def_orchidee.xml:

<file id="sechiba2" name="sechiba_out_2_xios" output_level="10" output_freq="1d" enabled=".FALSE.">
  <field field_ref="RootMoist" level="1"/>
  <field field_ref="Areas" level="1"/>

Example: /ccc/scratch/cont003/dsm/p86ghatt/LMDZOR/XIOS/RUNDIR/TESTING/RUN_rev1960/prod_mpi_omp/TestAlma2

I think the behaviour of xios_field_is_activate was correct in earlier versions of XIOS.
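
To make the expected behaviour concrete, here is a minimal sketch in C++ of the rule the two cases above imply: a field should count as active only if at least one enabled file references it. This is not XIOS code. The names OutputFile and fieldIsActive are hypothetical and exist only for illustration, and the rule itself is an assumption drawn from the report.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one <file> entry from file_def_orchidee.xml.
struct OutputFile {
    std::string id;
    bool enabled;                        // mirrors the enabled attribute
    std::vector<std::string> fieldRefs;  // field_ref values listed in the file
};

// Assumed rule: a field is active only if an *enabled* file references it.
bool fieldIsActive(const std::vector<OutputFile>& files, const std::string& fieldId) {
    for (const OutputFile& f : files) {
        if (!f.enabled) continue;  // a deactivated file must not activate its fields
        if (std::find(f.fieldRefs.begin(), f.fieldRefs.end(), fieldId) != f.fieldRefs.end())
            return true;
    }
    return false;
}
```

Under this rule, a file like sechiba2 with enabled=".FALSE." would leave RootMoist inactive (Case 2), and a field referenced by no file at all would also be inactive (Case 1).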

#45 fixed problem including a file containing a context in iodef.xml ymipsl jgipsl

The following example does not work:

iodef.xml :

<?xml version="1.0"?>
  <context id="xios">
      <variable_group id="buffer">
            buffer_size = 80000000
            buffer_server_factor_size = 2
      <variable_group id="parameters">
        <variable id="using_server" type="boolean">false</variable>
        <variable id="info_level" type="int">0</variable>

  <context src=".context_orchidee.xml"/>


context_orchidee.xml :

  <context id="orchidee">
    <field_definition src="./field_def_orchidee.xml"/>
    <file_definition src="./file_def_orchidee.xml"/>

      <domain id="domain_landpoints"/>

      <!-- Vertical axis and extra dimensions in sechiba -->
      <axis id="veget" standard_name="model_level_number" long_name="Vegetation types" unit="1"/>
      <axis id="laiax" standard_name="model_level_number" long_name="Nb LAI" unit="1"/>
      <axis id="solth" standard_name="model_level_number" long_name="Soil levels" unit="m"/>
      <axis id="soiltyp" standard_name="model_level_number" long_name="Soil types" unit="1"/>
      <axis id="nobio" standard_name="model_level_number" long_name="Other surface types" unit="1"/>
      <axis id="albtyp" standard_name="model_level_number" long_name="Albedo types" unit="1"/>
      <axis id="solay" standard_name="model_level_number" long_name="Hydrol soil levels" unit="m"/>
      <!-- Vertical axis and extra dimensions in stomate -->
      <axis id="PFT" standard_name="model_level_number" long_name="Plant functional type" unit="1"/>
      <axis id="P10" standard_name="model_level_number" long_name="Pool 10 years" unit="1"/>
      <axis id="P100" standard_name="model_level_number" long_name="Pool 100 years" unit="1"/>
      <axis id="P11" standard_name="model_level_number" long_name="Pool 10 years + 1" unit="1"/>
      <axis id="P101" standard_name="model_level_number" long_name="Pool 100 years + 1" unit="1"/>

The same example works when the contents of context_orchidee.xml are put directly into iodef.xml instead of including the file via src.
