Changes between Version 13 and Version 14 of Documentation/UserGuide/FLUXNETValidation


Timestamp:
2019-11-29T11:33:18+01:00 (4 years ago)
Author:
mmcgrath
Comment:

Updating to document changes getting r6384 to work with ENSEMBLE, as well as give some tips on debugging. Not working yet.

  • Documentation/UserGuide/FLUXNETValidation

    v13 v14  
    3131}}} 
    3232 
    33 As of around r6358, the Python script in the config/ORCHIDEE_OL/MAKE_RUN_DEF folder started generating only orchidee_pft.def_* in a few directories: OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, SPINUP, and some others.  You should make sure that your PARAM directory has all the run.defs it needs, as for a normal run : from the ENSEMBLE folder (or the folder you copied the ENSEMBLE folder to) cp ../OOL_SEC_STO_FG2/PARAM/* PARAM/ 
     33As of around r6358, the Python script in the config/ORCHIDEE_OL/MAKE_RUN_DEF folder started generating only orchidee_pft.def_* in a few directories: OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, SPINUP, and some others.  Make sure that your PARAM directory has all the run.defs it needs, as for a normal run: from the ENSEMBLE folder (or the folder you copied the ENSEMBLE folder to), run cp ../OOL_SEC_STO_FG2/PARAM/* PARAM/.  Do the same for the SPINUP/SUBJOB directory (e.g. cp ../OOL_SEC_STO_FG2/PARAM/* SPINUP/SUBJOB/OOL_SEC_STO/PARAM/).
    3434 
    3535I have noticed that the script will complain if a value is specified in fluxnet.card but not the run.def.  It will not complain if a value is specified in run.def and not fluxnet.card.  Check the [UserChoices] and [SubJobParams] sections of fluxnet.card.  Many of the UserChoices are already in SPINUP/COMP/spinup.card, and many of the SubJobParams are in the run.def.  It seems that the scripts make decisions based on what is in fluxnet.card, so this should typically take precedence. 
     
    103103The section in the fluxnet.card with [SubJobParams] deserves special mention.  As of a recent version of CAN, the run.def has been restructured to include two files: orchidee.def and orchidee_pft.def.  This makes the run.def much neater and matches what is done in the coupled simulations.  However, the Job_ENSEMBLE script attempts to change some variables in the run.def that fall under the [SubJobParams] section.  To do this, it looks at the actual run.def file, not any included file.  If it does not find a line in the run.def corresponding to each line in [SubJobParams], it will crash.  So make sure all the lines you specify under [SubJobParams] in fluxnet.card also explicitly appear in the PARAM/run.def file.
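The check described above can be scripted before launching.  This is only a minimal sketch, not part of libIGCM: it assumes that [SubJobParams] in fluxnet.card holds plain KEY=value lines, one per line, and that run.def uses the same form.  The function names card_section_keys and missing_in_def are made up for this example.

```shell
# Sketch only: assumes "KEY=value" lines inside the card section.
# card_section_keys and missing_in_def are hypothetical helper names.

# Print the keys defined in one section of a card file, e.g. SubJobParams.
card_section_keys() {
    local card="$1" section="$2"
    awk -v want="[$section]" '
        # A section header such as [UserChoices]: note whether it is ours.
        /^[ \t]*\[/ {
            line = $0
            gsub(/[ \t]/, "", line)
            in_section = (line == want)
            next
        }
        # A KEY = value line inside the wanted section: print the key only.
        in_section && /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ {
            key = $0
            sub(/[ \t]*=.*/, "", key)
            sub(/^[ \t]+/, "", key)
            print key
        }
    ' "$card"
}

# Print every key of the section that has no "KEY =" line in the def file.
# Returns 1 if at least one key is missing, 0 otherwise.
missing_in_def() {
    local card="$1" section="$2" def="$3" missing
    missing=$(card_section_keys "$card" "$section" | while IFS= read -r key; do
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$def" || echo "$key"
    done)
    if [ -z "$missing" ]; then
        return 0
    fi
    echo "$missing"
    return 1
}
```

For example, missing_in_def fluxnet.card SubJobParams PARAM/run.def prints the keys that still need an explicit line in PARAM/run.def.  Lines that are commented out in run.def are not counted as present.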
    104104 
     105The addition of the orchidee.def and orchidee_pft.def required adding them to the [ParametersFiles] in SPINUP/SUBJOBS/OOL_SEC_STO/COMP/orchidee_ol.card, so that libIGCM copies the new files to the PARAM directory of the running code.  It also required changes to the driver, to select from the correct orchidee_pft.def file.  To fix this, I simply copied OOL_SEC_STO_FG2/COMP/orchidee_ol.* to SPINUP/SUBJOB/OOL_SEC_STO/COMP/.  This also required adding the following to the [UserChoices] section in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card:
     106 
     107{{{ 
     108NORESTART=n 
     109TIMELENGTH=y 
     110}}}  
     111 
     112I noticed that the following filenames did not match what is written in the SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card file, which will cause problems later.  Make sure the filenames in the run.def, fluxnet.card and stomate.card all match, and then copy PARAM/*def to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/.
     113 
     114{{{ 
     115Nammonium_FILE = ndep_nhx.nc 
     116Nnitrate_FILE = ndep_noy.nc 
     117Nfert_FILE = NONE 
     118Nmanure_FILE = NONE 
     119Nfert_cropland_FILE = nfert_cropland.nc 
     120Nmanure_cropland_FILE = nmanure_cropland.nc 
     121Nfert_pasture_FILE = nfert_pasture.nc 
     122Nmanure_pasture_FILE = nmanure_pasture.nc 
     123Nbnf_FILE= bnf.nc 
     124}}} 
     125 
    105126Similarly, values found in fluxnet.card [UserChoices] seem to be required in SPINUP/COMP/spinup.card, else it crashes. 
     127 
     128Some additional variables need to be in run.def and not orchidee.def (possibly anything set to _AUTO_ or _AUTOBLOCKER_):
     129 
     130{{{ 
     131STOMATE_HIST_DT = _AUTO_ 
     132STOMATE_RESTART_FILEIN = _AUTOBLOCKER_ 
     133SECHIBA_restart_in = _AUTOBLOCKER_ 
     134XIOS_ORCHIDEE_OK = _AUTOBLOCKER_ 
     135WRITE_STEP = _AUTO_ 
     136RIVER_DESC = _AUTO_ 
     137WRITE_STEP2 = _AUTO_  
     138SECHIBA_HISTFILE2 = _AUTO_ 
     139STOMATE_IMPOSE_CN = _AUTO_ 
     140}}} 
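To find candidates for this list, one option is to search a .def file for variables whose value starts with _AUTO_ or _AUTOBLOCKER_.  The sketch below assumes one KEY = value assignment per line.  The function name auto_vars is made up for this example, and whether every variable it reports really has to move to run.def is not confirmed here.

```shell
# Sketch only: auto_vars is a hypothetical helper name.
# Print the names of variables whose value starts with _AUTO_ or _AUTOBLOCKER_.
auto_vars() {
    grep -E '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*[[:space:]]*=[[:space:]]*_AUTO(BLOCKER)?_' "$1" |
        sed -E 's/[[:space:]]*=.*//; s/^[[:space:]]+//'
}
```

Running auto_vars PARAM/orchidee.def lists the variables still sitting in the included file; commented-out lines are skipped.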
    106141 
    107142 
     
    135170ATM_CO2 =_AUTO_: DEFAULT = 350. 
    136171}}} 
    137 In some versions of the run.def, no DEFAULT value is given, but the .driver expects a default and will crash if it's not there. 
     172In some versions of the run.def, no DEFAULT value is given, but the .driver expects a default and will crash if it's not there.  Make sure these lines don't appear twice in the orchidee.def! 
     173 
     174The latest versions of the .card and .driver files expect the following to be present in the PARAM/run.def, as they try to modify these values: 
     175 
     176{{{ 
     177SECHIBA_restart_in=_AUTO_ 
     178XIOS_ORCHIDEE_OK=_AUTO_ 
     179STOMATE_RESTART_FILEIN=_AUTO_ 
     180}}} 
    138181 
    139182The scripts expect some variables in SPINUP/SUBJOBS/OOL_SEC_STO/COMP/sechiba.card, and will crash if you don't have them.  They try to change them (perhaps based on fluxnet.card) and give up if they don't find them in sechiba.card.
     
    172215BE SURE TO CHECK THE USED RUN.DEFs.  These can be found by changing to the RUN_DIR when the job is running.  The scripts will add flags to the end of the run.def, and sometimes these may conflict with what you want to run. 
    173216 
     217== Debugging == 
     218These are some of the errors that I have run into, along with attempts at explaining why and where they may occur, and how to solve them. 
     219 
     220Error files can be found in many places, including (assuming a job name of FLUXNET and a site of FI-Hyy): 
     221 
     222{{{ 
     223FLUXNET/out.Job_ENSEMBLE 
     224FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET 
     225FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001 
     226FLUXNET/FI-HyyFLUXNET/STOI/Debug 
     227}}} 
     228In my experience, errors come from the following places: 
     229 
     230{{{ 
     231FLUXNET/out.Job_ENSEMBLE: PARAM/run.def 
     232FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card 
     233FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, fluxnet.card 
     234FLUXNET/FI-HyyFLUXNET/STOI/Debug: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, or the ORCHIDEE model itself 
     235}}} 
     236I would recommend solving the "deepest" error first (e.g., fix an error in the STOI directory before trying to fix an error in out_qsub_FI-HyyFLUXNET). 
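The "deepest first" order can be followed with a small helper that walks the log files from the STOI directory outwards and stops at the first one containing an error line.  This is a sketch under assumptions: it searches for the string IGCM_debug_Exit seen in the messages quoted in this section (other logs may report errors differently, so a different pattern can be passed as a third argument), it is run from the directory that contains the job directory, and it leaves out STOI/Debug.  The function name first_igcm_error is made up.

```shell
# Sketch only: first_igcm_error is a hypothetical helper name.
# Usage: first_igcm_error JOB SITE [PATTERN]
# Prints the deepest log that contains PATTERN, then the matching lines.
first_igcm_error() {
    local job="$1" site="$2" pattern="${3:-IGCM_debug_Exit}"
    local run="$job/$site$job" log
    for log in \
        "$run"/STOI/Script_Output_"$site$job"STOI.* \
        "$run/out_qsub_$site$job" \
        "$job/out.Job_ENSEMBLE"
    do
        # Skip logs that do not exist (including an unmatched glob).
        [ -f "$log" ] || continue
        if grep -q -- "$pattern" "$log"; then
            echo "$log"
            grep -- "$pattern" "$log"
            return 0
        fi
    done
    return 1
}
```

With the job name and site used above, the call would be first_igcm_error FLUXNET FI-Hyy.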
     237 
     238Here are some errors: 
     239{{{ 
     240In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001 
     241 
     242IGCM_debug_Exit :  IGCM_comp_modifyDefFile : The variable XIOS_ORCHIDEE_OK cannot be modified. It should be set to AUTO. 
     243}}} 
     244 
     245One solution is to modify the file SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.driver such that the following two lines 
     246{{{ 
     247      IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK y 
     248      ... 
     249      IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK n 
     250}}} 
     251 
     252become 
     253{{{ 
     254      IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK y 
     255      ... 
     256      IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK n 
     257}}} 
     258If you do this, the value of the variable will be overwritten, so you should confirm that all values which trigger this option (in this case, XIOS=y and XIOS_ORCHIDEE_OK=y) are set to match what you want.  In this case, the XIOS value was found in three places: SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, PARAM/run.def, and fluxnet.card.
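The same edit can be made with sed instead of by hand.  This is a sketch that assumes the driver lines have exactly the form shown above (IGCM_comp_modifyDefFile blocker run.def VARIABLE value); the function name blocker_to_force is made up.  It keeps a copy of the original driver with a .bak suffix.

```shell
# Sketch only: blocker_to_force is a hypothetical helper name.
# Usage: blocker_to_force VARIABLE DRIVER_FILE
# Rewrites "IGCM_comp_modifyDefFile blocker ..." to "... force ..." on the
# lines that mention VARIABLE, leaving all other lines untouched.
blocker_to_force() {
    local var="$1" driver="$2"
    sed -i.bak -E \
        "/[[:space:]]${var}[[:space:]]/ s/(IGCM_comp_modifyDefFile[[:space:]]+)blocker([[:space:]])/\1force\2/" \
        "$driver"
}
```

For the case above, this would be blocker_to_force XIOS_ORCHIDEE_OK SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.driver.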
     259 
     260Another error that is found: 
     261 
     262{{{ 
     263In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001 
     264 
     265IGCM_debug_Exit :  IGCM_comp_modifyDefFile : Variable STOMATE_OK_STOMATE is not set in correct file. It should be set in run.def. 
     266}}} 
     267This is generally a sign that a variable is in PARAM/orchidee.def but needs to be in PARAM/run.def, because libIGCM is trying to modify it and libIGCM only knows to modify run.def at the moment.  Move the line to PARAM/run.def, and make the same change in SPINUP/SUBJOB/OOL_SEC_STO/PARAM/*def.
     268 
     269Another error: 
     270{{{ 
     271In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001 
     272 
     273IGCM_debug_Exit :  IGCM_comp_modifyDefFile : Error in run.def: Variable=NINPUT_UPDATE is set 2 times 
     274}}} 
     275This generally means that a variable appears in both PARAM/run.def (likely copied there from fluxnet.card) and PARAM/orchidee.def.  Delete the line in PARAM/orchidee.def, and then copy the whole PARAM directory to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/.
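Duplicates of this kind can be listed before launching.  The sketch below extracts the variable names from the given .def files and reports every name that is set more than once across them.  It assumes one KEY = value assignment per line, and the function name duplicated_vars is made up for this example.

```shell
# Sketch only: duplicated_vars is a hypothetical helper name.
# Usage: duplicated_vars FILE...
# Prints each variable name that is assigned more than once across the files.
duplicated_vars() {
    sed -nE 's/^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*=.*/\1/p' "$@" |
        sort | uniq -d
}
```

For example, duplicated_vars PARAM/run.def PARAM/orchidee.def prints nothing when no variable is set twice.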
     276 
     277== Cleaning == 
     278If an ENSEMBLE run crashes, it can sometimes be difficult to clean up all the files so that you can easily relaunch the run after figuring out what went wrong.  In particular, each site creates a new directory, which can add up to a lot of directories.  It's possible that some of your runs overlap, too (i.e., they use the same base directory, but the current run only uses forested sites, while a different run used agricultural sites).  There may be a libIGCM tool that does this well, but if you aren't familiar with it, here is a short script that works.  Copy it to your submission directory (i.e., where you launch the ./Job_ENSEMBLE script), make it executable (e.g., chmod +x clean.sh), and launch it before re-launching the run (e.g., ./clean.sh).   
     279 
     280{{{ 
     281#!/usr/bin/bash 
     282simulation="FLUXNET" 
     283basedir="/home/scratch01/mmcgrath/IGCM_OUT/OL2/PROD/ensemble/" 
     284sites=( FI-Hyy FI-Sod ) 
     285 
     286for site in "${sites[@]}" 
     287do 
     288    rm -fr "${site}${simulation}" 
     289    rm -fr "${basedir}${site}${simulation}"* 
     291    rm -fr "${basedir}${simulation}/${site}${simulation}"* 
     293    echo "$simulation $site" 
     294done 
     295 
     296rm -fr out.Job_ENSEMBLE 
     297}}} 
     298 
     299All you need to do is modify the site list, basedir and simulation variables for your particular run. 
     300 
    174301== Speed == 
    175302Some timing tests were carried out with TAG2.1, TRUNK (r6096), and CAN (r6091) on obelix.  This revealed the importance of the NBUFF=0 keyword for running with FLUXNET data for a single site.  When running for a single site with forcing that has lower temporal resolution (e.g., CRUNCEP, which has six-hourly resolution instead of the 30 min resolution of FLUXNET), it's much less important.  The amount of data output for all runs was adjusted to give approximately the same size of files.  The optimized executables were used for all tests (-O3).