wiki:Documentation/UserGuide/ReBuild

Version 3 (modified by luyssaert, 9 years ago)

Identify the missing history file(s)

When running a long simulation, it sometimes happens that a single year is missing from your store directory. On curie that would be, for example,

/ccc/store/cont003/dsm/p529luy/IGCM_OUT/OL2/PROD/secsto/JOB_NAME/SBG/Output/MO
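A quick way to spot the gap is to loop over the simulated years and report those without an output file. The sketch below is only an illustration: the function name list_missing_years is hypothetical, and it assumes that each yearly file name contains its start date in the form _YYYY0101_ (adapt the pattern if your files are named differently).

```shell
# Print every year in [first, last] for which the directory holds no file
# matching the ASSUMED pattern *_<YYYY>0101_*<suffix>.
# Usage: list_missing_years DIR FIRST_YEAR LAST_YEAR SUFFIX
list_missing_years() {
    dir=$1
    first=$2
    last=$3
    suffix=$4
    year=$first
    while [ "$year" -le "$last" ]; do
        found=no
        # If nothing matches, the glob stays literal and the -e test fails.
        for f in "$dir"/*_"${year}"0101_*"$suffix"; do
            if [ -e "$f" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "$year"
        fi
        year=$((year + 1))
    done
}
```

For example, list_missing_years /path/to/SBG/Output/MO 1901 2000 _stomate_history.nc would print one line per missing year (the suffix here is an assumption as well).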

Recover the missing files from the scratch

Sometimes LibIGCM has difficulty copying the history files from the scratch to the store, so it is good to look first in the corresponding scratch directory:

/ccc/scratch/cont003/dsm/p529luy/IGCM_OUT/OL2/PROD/secsto/JOB_NAME/SBG/Output/MO

If the missing output file(s) are there, manually copy them to the store. Remember to copy both the SBG and SRF files.
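To see which files never made it to the store, the two directories can be compared by file name. This is a minimal sketch, not part of LibIGCM: the function name list_unarchived is hypothetical, and it only compares names, not file sizes or contents.

```shell
# Print the name of every regular file that exists in the scratch directory
# but has no file of the same name in the store directory.
# Usage: list_unarchived SCRATCH_DIR STORE_DIR
list_unarchived() {
    scratch_dir=$1
    store_dir=$2
    for f in "$scratch_dir"/*; do
        # Skip sub-directories and the literal pattern of an empty directory.
        if [ ! -f "$f" ]; then
            continue
        fi
        name=$(basename "$f")
        if [ ! -e "$store_dir/$name" ]; then
            echo "$name"
        fi
    done
}
```

Run it once for the SBG directory and once for the SRF directory, then copy the listed files with cp.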

Rebuild the missing history file(s)

If your run did not crash (check run.card) but a history file is still missing, LibIGCM may have encountered a problem with the rebuild job. To check whether this is the case, look into

/ccc/scratch/cont003/dsm/p529luy/IGCM_OUT/OL2/PROD/SECSTO/ACF/REBUILD/REBUILD_XXXX

where XXXX is the year, month or day for which you are missing an output file. In that folder you should find one stomate file and one sechiba file for each processor you are running with. Thus, if you run with 32 procs, the rebuild folder should contain 64 files (32 for sechiba and 32 for stomate).
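The file count can be checked with a few lines of shell. The sketch below assumes that the per-processor files carry the words sechiba or stomate somewhere in their names; the function name check_rebuild_dir is hypothetical.

```shell
# Count the sechiba and stomate files in a rebuild directory and compare
# each count with the number of processors. Returns 0 if both match.
# ASSUMPTION: the file names contain "sechiba" or "stomate".
# Usage: check_rebuild_dir REBUILD_DIR NPROCS
check_rebuild_dir() {
    dir=$1
    nprocs=$2
    status=0
    for component in sechiba stomate; do
        count=0
        for f in "$dir"/*"$component"*; do
            if [ -f "$f" ]; then
                count=$((count + 1))
            fi
        done
        if [ "$count" -ne "$nprocs" ]; then
            echo "$component: found $count files, expected $nprocs"
            status=1
        fi
    done
    return "$status"
}
```

For a run on 32 procs you would call check_rebuild_dir with the REBUILD_XXXX folder and 32. No output means both counts are as expected.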

Copy the file rebuild_fromWorkdir.job from the libIGCM folder to your config folder

cp /ccc/work/cont003/dsm/p529luy/DOFOCO.SPINUP/libIGCM/rebuild_fromWorkdir.job /ccc/work/cont003/dsm/cheny/DOFOCO.SPINUP/config/ORCHIDEE_OL/ACF/rebuild_fromWorkdir.job

You will need to adjust the job so that it does exactly what you want. Most of the comments are self-explanatory, but NOTE the following: to avoid confusion on the queue (that's how I understood it), LibIGCM starts by rebuilding the last year and then goes backward in time. This implies that you have to give the start date of the last time step. In the example below, the year 1965 will be rebuilt.

#D- Path to libIGCM
#D- Default : value from AA_job if any
# WARNING For StandAlone use : To run this script on some machine (ulam and cesium)
# WARNING you must check MirrorlibIGCM variable in sys library.
# WARNING If this variable is true, you must use libIGCM_POST path instead
# WARNING of your running libIGCM directory.
libIGCM=${libIGCM:=/ccc/work/cont003/dsm/cheny/DOFOCO.SPINUP/libIGCM}

#-D- $hostname of the MASTER job when SUBMIT_DIR is not visible on postprocessing computer.
MASTER=${MASTER:=ada|curie}

#D- Do we rebuild parallel output from archive or from ${BIGDIR}
#D- Default : value from AA_job if any
RebuildFromArchive=${RebuildFromArchive:=false}

#D- Directory where files we need to rebuild are store
#D- Default : value from AA_job if any
#D- if RebuildFromArchive=true REBUILD_DIR=${DMFDIR}/IGCM_OUT/.../JobName/TMP
#D- example : /dmnfs09/cont003/p86denv/IGCM_OUT/IPSLCM5/CM5PIRC7/TMP
#D- if RebuildFromArchive=false REBUILD_DIR=${BIGDIR}/REBUILD/TagName/JobName/
#D- example : /scratch/cont003/p86denv/REBUILD/IPSLCM5/SCAL-NEW
REBUILD_DIR=${REBUILD_DIR:=/ccc/scratch/cont003/dsm/p529luy/IGCM_OUT/OL2/PROD/SECSTO/ACF/REBUILD}

#D- How many directory to rebuild we have to consider
#D- Default : value from AA_job if any
NbRebuildDir=1

#D- Suffix date we will use to determine which directory to rebuild
#D- We will rebuild NbRebuildDir before and including PeriodDateBegin
#D- Default : value from AA_job if any
LastPeriodForRebuild=${LastPeriodForRebuild:=${PeriodDateBegin:=19650101}}
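To make the backward counting concrete, the sketch below lists the period start dates that would be covered for a given LastPeriodForRebuild and NbRebuildDir. It is an illustration only and not LibIGCM code: the function name list_rebuild_periods is hypothetical, and it assumes one rebuild directory per year, which may not match your rebuild frequency.

```shell
# List the start dates covered by a rebuild, newest first.
# ASSUMPTION: yearly periods, dates written as YYYYMMDD.
# Usage: list_rebuild_periods LAST_PERIOD NB_REBUILD_DIR
list_rebuild_periods() {
    last=$1
    nb=$2
    year=${last%????}   # drop the trailing MMDD, keep YYYY
    mmdd=${last#????}   # drop the leading YYYY, keep MMDD
    i=0
    while [ "$i" -lt "$nb" ]; do
        echo "$((year - i))$mmdd"
        i=$((i + 1))
    done
}
```

With the values from the job above (19650101 and NbRebuildDir=1) only 19650101 is listed. With NbRebuildDir=3 the list would also contain 19640101 and 19630101.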

The original instructions can be found at http://forge.ipsl.jussieu.fr/igcmg_doc/wiki/DocGmonitor#Startorrestartpostprocessingjobs1