Changes between Version 9 and Version 10 of Documentation/UserGuide/HangCrash


Timestamp:
2020-04-20T12:09:34+02:00 (4 years ago)
Author:
dgoll
  • Documentation/UserGuide/HangCrash

    v9 v10  
    1 = How to find where the model is hanging = 
     1= How to find where the model is hanging  = 
    22 
    33Author: S. Luyssaert[[BR]] 
     
    88== Objectives == 
    99 
    10 This page provides some information on how to find if your model run is hanging somewhere or if it is still properly running (given that you have not obtained the final outputs that you expect). 
     10This page provides some information on how to test whether your model run is hanging (crashed without terminating the job) or is still running properly, given that you have not yet obtained the final outputs you expect. A hang can be caused by using unsupported ways to stop the execution of the model. 
    1111 
    1212'''Context:'''  
     
    1717Open the Script_Output file and search for RUN_DIR. You should find a path that looks like /ccc/scratch/cont003/dsm/p529grat/RUN_DIR/XXX/XXX. This is where the model is actually running. If you are working on irene, jean-zay or ciclad, you can simply go to that folder and check which files were changed most recently and when. The time of the last change indicates whether the model is really hanging or whether you are just too impatient. If, however, you are working on OBELIX, the run directory is on /scratch, but the folder where the model is running is not accessible. Open the Job you want to run and search for RUN_DIR_PATH. The instruction will be commented out. This is a good place to specify the run directory you want to use, e.g., RUN_DIR_PATH=/scratch01/sluys/RUN_DIR. Delete the job that was hanging, launch it again and have a look in /scratch01/sluys/RUN_DIR. Details can be found at https://forge.ipsl.jussieu.fr/igcmg_doc/wiki/Doc/Setup  
    1818 
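The check described above can be sketched in a few lines of shell. This is only a minimal sketch: the function names are hypothetical helpers (they are not part of libIGCM), and `recently_changed` assumes your `find` supports `-mmin`, which is the case for GNU find.

```shell
# Hypothetical helper functions, not part of libIGCM.

# Print every line of a Script_Output file that mentions RUN_DIR.
show_run_dir() {
    grep -n 'RUN_DIR' "$1"
}

# List the N (default 5) most recently modified entries of a run directory,
# newest first.
newest_files() {
    ls -t "$1" | head -n "${2:-5}"
}

# List regular files modified within the last N (default 30) minutes.
# Assumption: find supports -mmin (true for GNU find).
recently_changed() {
    find "$1" -type f -mmin "-${2:-30}"
}
```

Call `show_run_dir` with your own Script_Output file, then pass the path it reports to `newest_files` or `recently_changed`. If `recently_changed` prints nothing for a time window in which the model should have written output, the run is probably hanging.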
    19 === Allow the model to properly crash === 
     19=== (Avoid the use of) unsupported execution stop === 
    2020Did you follow the "coding guidelines"? If not, it is time to do so! Check the coding guidelines on the use of CALL ipslerr() instead of STOP. Replace all your STOP statements with a CALL to ipslerr(). Don't be lazy: pass proper information to ipslerr, otherwise ipslerr may do its job but you still won't know where the model crashes. 
    2121 
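To locate the STOP statements that still need replacing, a recursive search over the source tree may help. The following is a rough sketch: `list_stop_statements` is a hypothetical helper, it assumes Fortran sources end in `.f90` or `.F90`, and it assumes a grep that supports `--include` (GNU and BSD grep do). It matches the whole word "stop" in any case, so comments that mention the word are reported too and need to be checked by hand.

```shell
# Hypothetical helper: list every line containing the word STOP (any case)
# in the Fortran sources below the given directory (default: current directory).
# Assumption: sources use the .f90 or .F90 extension.
list_stop_statements() {
    grep -rniw --include='*.f90' --include='*.F90' 'stop' "${1:-.}"
}
```

Each reported line has the form `file:line:text`, which points to the statement to replace with a CALL to ipslerr().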
    22 === Make the model crash === 
     22=== Supported execution stop === 
    2323You can force the model to stop with the following lines of code. 
    2424{{{