Opened 6 years ago

Closed 8 months ago

#391 closed defect (fixed)

problem with ipslerr_p(MPI_ABORT) at obelix : model stays hanging

Reported by: jgipsl Owned by: ajornet
Priority: major Milestone: Not scheduled yet
Component: Model architecture Version:
Keywords: Cc:


Using the current trunk (rev 4600), when ipslerr_p is called from the model with stop level 3, the model stays hanging. The error message is printed in out_orchidee in the run directory, but the job stays running in the queue.

For a simple test case, set in run.def


This will activate the coherence test in the code. If you set a RUN_DIR_PATH in the main job, you can see at run time that the model has written the following output messages:

0FATAL ERROR FROM ROUTINE control_initialize
0 --> Too shallow soil chosen for the thermodynamic for soil freezing
0 --> Adapt run.def with at least DEPTH_MAX=11
0 -->
0Fatal error from ORCHIDEE. STOP in ipslerr_p with code

but the job is still running in the queue (use qstat).
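Because the job does not leave the queue on its own, it can help to check out_orchidee for the fatal message automatically. The following is only a minimal sketch in Python, not part of ORCHIDEE: the function name `find_fatal_error` is hypothetical, and the parsing assumes the message layout shown above (optional leading digits, a `FATAL ERROR FROM ROUTINE` header, then `-->` detail lines).

```python
import re

# Assumed layout, taken from the messages quoted in this ticket:
# optional leading digits, then the header or a "-->" detail line.
FATAL_HEADER = re.compile(r"^\s*\d*\s*FATAL ERROR FROM ROUTINE\s+(\S+)")
DETAIL_LINE = re.compile(r"^\s*\d*\s*-->\s?(.*)$")

# Sample text copied from the ticket description.
SAMPLE = """0FATAL ERROR FROM ROUTINE control_initialize
0 --> Too shallow soil chosen for the thermodynamic for soil freezing
0 --> Adapt run.def with at least DEPTH_MAX=11
0 -->
0Fatal error from ORCHIDEE. STOP in ipslerr_p with code
"""


def find_fatal_error(log_text):
    """Return (routine, details) for the first fatal error, or None.

    Hypothetical helper: 'details' holds the non-empty '-->' lines
    that directly follow the header.
    """
    routine = None
    details = []
    for line in log_text.splitlines():
        if routine is None:
            # Still looking for the header line.
            match = FATAL_HEADER.match(line)
            if match:
                routine = match.group(1)
            continue
        # After the header, collect detail lines until one does not match.
        match = DETAIL_LINE.match(line)
        if not match:
            break
        text = match.group(1).strip()
        if text:
            details.append(text)
    if routine is None:
        return None
    return routine, details


if __name__ == "__main__":
    print(find_fatal_error(SAMPLE))
```

In practice the function would be given the contents of out_orchidee read from the run directory. If it returns a result while qstat still lists the job, the run has hit the hang described here and the job has to be removed from the queue by hand.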


  • This is only seen when running with XIOS. When using only IOIPSL, the model stops correctly.
  • When the model stops from IOIPSL, for example if an input file is missing, the execution stops correctly, even if XIOS is activated.

Tested modifications

Currently in ORCHIDEE/src_parallel/ioipsl_para.f90:


Changing into


seems to solve the case when using XIOS in attached mode, but still not in server mode. A better solution is needed.

These problems at obelix seem to be related to the problem at curie in ticket #236.

Change History (6)

comment:1 Changed 6 years ago by jgipsl

[4683]: Added arguments as said above but it does not solve the problem.

comment:2 Changed 5 years ago by aducharne

  • Milestone set to ORCHIDEE 4.0
  • Owner changed from jgipsl to ajornet
  • Status changed from new to assigned

comment:3 Changed 3 years ago by luyssaert

  • Component changed from Anthropogenic processes to Model architecture
  • Milestone changed from ORCHIDEE 4.0 to Not scheduled yet

comment:4 Changed 8 months ago by bguenet

Bertrand Guenet will do a test with current trunk

comment:5 Changed 8 months ago by bguenet

Tests done with the trunk [7853] on obelix. The model now crashes instead of hanging, and the error message is clear.

comment:6 Changed 8 months ago by bguenet

  • Resolution set to fixed
  • Status changed from assigned to closed