Version 19 (modified by mmcgrath, 9 months ago)

Documentation for running FLUXNET comparisons with a clean SVN install and CAN

This was tested for ORCHIDEE-CN-CAN (r5678 of ORCHIDEE and r5673 of ORCHIDEE_OL) on obelix.

Background

First, look at Nicolas's page.

http://forge.ipsl.jussieu.fr/orchidee/wiki/Scripts/FluxnetValidation

And then look at the README file in config/ORCHIDEE_OL/ENSEMBLE. And then read this whole page before really starting to create a run.

Be sure you have checked out both CN-CAN modeles/ORCHIDEE and config/ORCHIDEE_OL (Documentation/UserGuide/ORCHIDEEDOFOCOInstall).

Be sure that ioipsl_debug=.FALSE. in modeles/IOIPSL/src/errioipsl.f90. Otherwise, the output files become huge because of the high frequency writes combined with the debug information.
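
A quick way to confirm this setting before compiling is sketched below. The function name ioipsl_debug_is_off is hypothetical, and the pattern assumes the flag is assigned on a single line that contains both ioipsl_debug and .TRUE. or .FALSE.; the actual declaration in errioipsl.f90 may be laid out differently, so a failure here is a prompt to open the file rather than proof of a problem.

```shell
# Hypothetical helper: succeeds only if FILE mentions ioipsl_debug and never sets it to .TRUE.
# Assumes the assignment sits on one line, e.g. "... ioipsl_debug=.FALSE."
ioipsl_debug_is_off() {
    # The flag must be mentioned at all, otherwise the check is meaningless.
    grep -qi 'ioipsl_debug' "$1" || return 2
    # Fortran is case-insensitive, so match .TRUE. in any case.
    if grep -i 'ioipsl_debug' "$1" | grep -qi '=[[:space:]]*\.true\.'; then
        return 1
    fi
    return 0
}
# Usage (path taken from the text above):
#   ioipsl_debug_is_off modeles/IOIPSL/src/errioipsl.f90 && echo "debug output is off"
```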

Start from a clean SVN ENSEMBLE install. Notice that ENSEMBLE/Job_ENSEMBLE is the main driver, and it should not be deleted! This is what I refer to when I say "Nicolas's FLUXNET scripts". It will create jobs based on SPINUP/SUBJOB/OOL_SEC_STO/.

For the following, I refer to config/ORCHIDEE_OL/SPINUP as SPINUP and config/ORCHIDEE_OL/ENSEMBLE as FLUXNET, assuming you have copied the whole ENSEMBLE directory to a new directory called FLUXNET to do the run. This notation is necessary as these scripts make use of several directories in config/ORCHIDEE_OL, including subdirectories.

The general procedure is that the Job_ENSEMBLE script will create and launch the following runs for every site:

  • STOI (a fast spinup, length controlled by duree_inistomate)
  • ORC-1 (a longer spinup, length controlled by duree_sechiba)
  • CLEARCUT (added for CAN...aboveground biomass is removed before the run, to ensure forests have a specific age at the end...length controlled by the fifth field in the site description in fluxnet.card)
  • FIN (the final fast production run, length controlled by duree_final)

The length of these phases can be modified, and additional longer spinups can be added (by changing n_iter, creating ORC-2, ORC-3, etc.), but they are typically not necessary. The final production data (from FIN) is always saved, and output from the other stages can be saved as well, but it's not recommended. In particular, the data for the ORC run can get pretty large when half-hourly output is used.

I have found the following files are used. A lot of the work below goes to ensure that conflicting options are not specified in these files. OOL_SEC_STO is selected when you run with both stomate and sechiba active (otherwise, OOL_SEC may be used):

FLUXNET/fluxnet.card
FLUXNET/PARAM/run.def
FLUXNET/PARAM/orchidee.def    (only for CAN)
FLUXNET/PARAM/orchidee_pft.def_*  (only for CAN)
SPINUP/COMP/spinup.card
SPINUP/SUBJOBS/OOL_SEC_STO/COMP/sechiba.card
SPINUP/SUBJOBS/OOL_SEC_STO/COMP/stomate.card
SPINUP/SUBJOBS/OOL_SEC_STO/PARAM/run.def
SPINUP/SUBJOBS/OOL_SEC_STO/PARAM/orchidee.def  (only for CAN)
SPINUP/SUBJOBS/OOL_SEC_STO/PARAM/orchidee_pft.def_*  (only for CAN)

The following gives a general flow of how the script works, which should give an idea of the priority.

1) ENSEMBLE_initialize/ensemble.ksh reads in the options from fluxnet.card (launching the script as "./Job_ENSEMBLE fluxnet" copies fluxnet.card to ensemble.card...Job_ENSEMBLE always reads all options from ensemble.card). These are global variables while the runs are being set up. But, when the individual runs (each stage/site) launch, they may get overwritten by PARAM and COMP options in the launch directory. The following sections are parsed in fluxnet.card:

a) Section [SPINUP]
b) Section [UserChoices]
c) Section [CONFIG] (explicitly looks for ForcingPath, NbPFTs, NbSitesParam, NameSitesParam, Groups)

The Job_ENSEMBLE script and the spinup.driver create multiple new submission directories, using the SPINUP and SPINUP/SUBJOBS directories as templates. I use the notation ${}/PARAM/run.def and ${}/COMP/*card to refer to the run.def and various .card files after they have been copied from their original locations.

The following is then done for every site in Groups in fluxnet.card:

2) The directory is created for the spinup (spinup.card is taken from SPINUP/)
3) spinup.card is modified based on the Job_ENSEMBLE script, including adding UserChoices from fluxnet.card
4) The Job_ file for the spinup is modified
5) The FLUXNET/PARAM/run.def is copied to the spinup directory ${}/PARAM/run.def
6) The script checks that all options which appear in fluxnet.card [SubJobParams] also appear in FLUXNET/PARAM/run.def
7) The script writes all of these options into the ${}/COMP/spinup.card and ${}/PARAM/run.def
8) The script checks to make sure that the NbSitesParam variables in the fluxnet.card appear in the ${}/PARAM/run.def
9) The script writes all of these options into the ${}/PARAM/run.def
10) The spinup job is launched
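
The check in step 6 can be reproduced by hand before launching anything. The sketch below is not taken from Job_ENSEMBLE; the function names list_section_keys and missing_in_def are hypothetical, and it assumes the card uses an INI-like layout with "[Section]" headers and one KEY=value entry per line.

```shell
# Hypothetical helper: print the keys of "KEY=value" lines found in [SECTION] of CARD.
# Assumes an INI-like layout: "[Section]" headers and one "KEY=value" per line.
list_section_keys() {
    awk -v section="[$2]" '
        /^[ \t]*\[/ {
            header = $0
            gsub(/[ \t]/, "", header)
            in_section = (header == section)
            next
        }
        in_section && /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ {
            key = $0
            sub(/[ \t]*=.*/, "", key)
            sub(/^[ \t]*/, "", key)
            print key
        }
    ' "$1"
}

# Report every key of [SECTION] in CARD that has no "KEY=" line in DEF.
# Returns 0 when nothing is missing, 1 otherwise.
missing_in_def() {
    status=0
    for key in $(list_section_keys "$1" "$2"); do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$3"; then
            echo "missing in $3: ${key}"
            status=1
        fi
    done
    return $status
}
# Usage, from the FLUXNET directory:
#   missing_in_def fluxnet.card SubJobParams PARAM/run.def
```

Commented lines (those starting with #) are skipped on both sides, so a key that is only present in a comment in run.def is still reported as missing.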

From here, the Job_ENSEMBLE script is not used anymore. Now the independent spinup jobs control the show. Things are a little more difficult to follow here unless you are really used to libIGCM. libIGCM does a very good job of generalizing functions, but that can make it more difficult to find what you are looking for. It helps to remember that each file name in the COMP directory is a "component", and libIGCM treats them all identically: initializing with IGCM_comp_Initialize, for example. As defined in SPINUP/config.card, we only have a single component in a spinup job: SPIN.

A spinup job runs like the following. Notice that all of this is carried out in ${}, which is a new directory the above script has created in the FLUXNET directory (e.g., FLUXNET/FI-HyyFLUXNET), NOT in the SPINUP directory.

11) IGCM_comp_Initialize/libIGCM_comp.ksh reads in the UserChoices from ${}/COMP/spinup.card, which was created in step (2) above and modified in steps (3) and (7)
12) IGCM_comp_Update/libIGCM_comp.ksh calls SPIN_Update from ${}/COMP/spinup.driver, which determines we are on the "start" stage and therefore need to follow STOI instructions for the next step (referred to as SECSTOINI inside spinup.driver)
13) ${}/COMP/spinup.driver copies ${}/SUBJOB/OOL_SEC_STO to the STOI directory
14) ${}/COMP/spinup.driver forces the value of FOREST_MANAGED_FORCED in ${}/STOI/PARAM/run.def. Other values are taken from UserChoices in ${}/COMP/spinup.card.
15) SPIN_prepare from ${}/COMP/spinup.driver sets the values of several variables in ${}/STOI/COMP/sechiba.card, based on UserChoices in ${}/COMP/spinup.card. The same for ${}/STOI/COMP/orchidee_ol.card
16) SPIN_OptionsSechiba from ${}/COMP/spinup.driver sets the values of TimeSeriesVars3D and TimeSeriesVars2D in ${}/STOI/COMP/sechiba.card.
17) SPIN_OptionsStomate from ${}/COMP/spinup.driver sets the values of SPINUP_ANALYTIC, TimeSeriesVars3D and TimeSeriesVars2D in ${}/STOI/COMP/sechiba.card, as well as the values of FORCESOIL_STEP_PER_YEAR, STOMATE_FORCING_NAME, and STOMATE_CFORCING_NAME in ${}/STOI/PARAM/run.def.
18) Execute the STOI run

Sometimes, the scripts replace variables by values found in the various .card and .def files. Other times, the variables are appended to the end of the file. This distinction is important if you get an error saying that a variable appears multiple times.

Practical steps

This section gives step-by-step instructions for getting the simulations working with a clean svn install of ORCHIDEE-CAN r6414. The "General steps" section gives an idea of how to figure things out yourself.

First, duplicate the ENSEMBLE directory to a name of your own choosing (I choose the name FLUXNET here, to match the previous section).

cd config/ORCHIDEE_OL
cp -r ENSEMBLE FLUXNET
cd FLUXNET

As of r6358 (much before, actually, but at least this revision), the CAN branch of config/ORCHIDEE_OL/ENSEMBLE contains a series of fluxnet*card files. These different files have different configurations, and different sites. Choose one that best matches what you want.

cp fluxnet_28sp.card fluxnet.card

Make sure the following line exists in the [UserChoices] section of the new fluxnet.card (not all of the fluxnet files in the ENSEMBLE directory have it, but it causes a problem if this flag is turned on):

CRUP=n

As of around r6358, the Python script in the config/ORCHIDEE_OL/MAKE_RUN_DEF folder started generating only orchidee_pft.def_* in a few directories: OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, SPINUP, and some others. You should make sure that your PARAM directory has all the run.defs it needs, as for a normal run.

cd ../MAKE_RUN_DEF/
module load python/2.7    # on obelix
python Make_orchidee_pft_defs.py
cp ../OOL_SEC_STO_FG2/PARAM/* ../FLUXNET/PARAM/
cp ../OOL_SEC_STO_FG2/PARAM/* ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/

I have noticed that the script will complain if a value is specified in fluxnet.card but not the run.def. It will not complain if a value is specified in run.def and not fluxnet.card. Check the [UserChoices] and [SubJobParams] sections of fluxnet.card. Many of the UserChoices are already in SPINUP/COMP/spinup.card, and many of the SubJobParams are in the run.def. It seems that the scripts make decisions based on what is in fluxnet.card, so this should typically take precedence. I will point out the exact changes I make for r6414 below.

Before we get to some specifics, let's create the jobs.

cd ../FLUXNET
vi config.card

Change the following lines (on obelix...on Irene, the ARCHIVE line should be fine):

JobName=FLUXNET
  ARCHIVE=/home/scratch01/$LOGIN

then create the job scripts

../../../libIGCM/ins_job

This creates Job_FLUXNET. Notice that this job will pull from the SPINUP directory as well. ins_job used to create Job files in every directory, but that functionality changed a while ago. Therefore, the following is now necessary (OOL_SEC_STO because we will run a job with sechiba and stomate).

cd ../SPINUP
../../../libIGCM/ins_job
cd SUBJOB/OOL_SEC_STO/
../../../../../libIGCM/ins_job
cd ../../../FLUXNET

Now edit the Job_FLUXNET file. Notice that this is the Job file that is copied to all the subjobs when they run, so if you want them to run on a different queue (I use the long queue on obelix, as 500 years can take more than 12 hours), you should do that here. I also modify the run directory so I know where the jobs are running and can go to that directory easily if needed.

vi Job_FLUXNET
(make the queue "medium" instead of "mediump" on obelix: #PBS -q mediump)
(change RUN_DIR_PATH=/home/scratch01/mmcgrath/RUN_DIR)
(change JobType=DEV if you are not sure this will work)
mkdir /home/scratch01/mmcgrath/RUN_DIR

Now change the options for the sites to run against.

vi fluxnet.card

Best to run a small test with a single site. Based on the fluxnet.card you copied earlier, the number of species and age classes should all be set up fine.

Always launch a test run before doing a production run, i.e. a single site. The Job_ENSEMBLE script will launch a full spinup job for every site in every group in the fluxnet.card. To limit this to one, do something like the following:

Groups= ( test )

test = 	( NL-Loo , NL-Loo_1996-2006.nc , 1996, 11, 80  ,0,0,0,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0.055555556,0,0,0,0,0,0,0 ) 

COMMENT OUT ANY OTHER LINES THAT BEGIN WITH "Groups". Else, when you submit the job, you will launch a run over all of the sites in all groups, and you have to cancel them one at a time. From experience, this is painful.
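
Before submitting, it can help to count how many active Groups lines remain. A minimal sketch, with a hypothetical function name; it treats a line as active when "Groups" is the first non-blank text on it, so anything prefixed with a comment character is not counted.

```shell
# Hypothetical helper: count the active (uncommented) "Groups=" lines in a card file.
# A line counts as active when "Groups" is the first non-blank text on it.
count_active_groups() {
    grep -c '^[[:space:]]*Groups[[:space:]]*=' "$1"
}
# Usage, from the FLUXNET directory (the result should be 1 for a single test group):
#   count_active_groups fluxnet.card
```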

The length of the spinup also matters. I use the following for production runs at the moment (in fluxnet.card...I also change the values in SPINUP/COMP/spinup.card, even though those should be overwritten by Job_FLUXNET)

n_iter=1
duree_inistomate=1
duree_sechiba=500
duree_final=1

(I use duree_sechiba=50 for my test run, so that it goes a little faster). All of the other duree values I set to 0. This launches a simulation over one loop of the forcing file, then 500 years (regardless of the length of the forcing file), and then one final loop for analysis.
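
As a worked example of what these settings add up to, the sketch below uses the NL-Loo test site from above, whose forcing file covers 1996-2006 (11 years). It assumes, as described in the text, that duree_inistomate and duree_final count loops over the forcing file while duree_sechiba counts years. The CLEARCUT stage is left out, since its length comes from the site description rather than from these four values.

```shell
# Worked arithmetic for the production settings, using the NL-Loo example (11 years of forcing).
# Assumption: duree_inistomate and duree_final count loops over the forcing file,
# while duree_sechiba counts years. The CLEARCUT stage is not included.
forcing_years=11
duree_inistomate=1
duree_sechiba=500
duree_final=1

stoi_years=$((duree_inistomate * forcing_years))
fin_years=$((duree_final * forcing_years))
total_years=$((stoi_years + duree_sechiba + fin_years))

echo "STOI: ${stoi_years}  ORC-1: ${duree_sechiba}  FIN: ${fin_years}  total: ${total_years} years"
```

With these numbers the three stages cover 11 + 500 + 11 = 522 simulated years per site.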

The section in the fluxnet.card with [SubJobParams] deserves special mention. As of a recent version of CAN, the run.def has been restructured to include two files: orchidee.def, orchidee_pft.def. This makes the run.def much neater and matches what is done in the coupled simulations. However, the Job_ENSEMBLE script attempts to change some variables in the run.def that fall under the [SubJobParams] section. To do this, it looks at the actual run.def file, not any included file. If it does not find a line in the run.def corresponding to the lines in [SubJobParams], it will crash. So make sure all the lines you specify under [SubJobParams] in fluxnet.card also explicitly appear in the PARAM/run.def file.

vi PARAM/run.def
(add the following lines from [SubJobParams] in fluxnet.card)
ALMA_OUTPUT=y
SECHIBA_reset_time=y
SPLIT_DT=1
SPINUP_ANALYTIC=y
NBUFF=0
STOMATE_FORCING_NAME=NONE
STOMATE_CFORCING_NAME=NONE
FIRE_DISABLE=y
# ATM_CO2=368 : value for year 2000
ATM_CO2=368
XIOS_ORCHIDEE_OK=n


Nammonium_FILE = CCMI_ndep_nhx_2000.nc
Nnitrate_FILE = CCMI_ndep_noy_2000.nc
Nammonium_VAR = nhx
Nnitrate_VAR = noy

Nfert_FILE = NONE
Nfert_VAR = nfer

Nmanure_FILE = NONE
Nmanure_VAR = Nmanure
Nfert_cropland_FILE =  Nfer_cropland_2000.nc
Nfert_cropland_VAR = nfer
Nmanure_cropland_FILE = Nmanure_cropland_2000.nc
Nmanure_cropland_VAR = Nmanure

Nfert_pasture_FILE = Nfer_pasture_2000.nc
Nfert_pasture_VAR = Nfer
Nmanure_pasture_FILE = Nmanure_pasture_2000.nc
Nmanure_pasture_VAR = Nmanure

Nbnf_FILE= bnf_1850.nc
Nbnf_VAR= BNF_MGN_PERM2_PERYR

NINPUT_UPDATE=0Y
NINPUT_SUFFIX_YEAR = n

(make sure the following lines are commented out, otherwise ORCHIDEE will not find a land point for any site outside of this window)
LIMIT_WEST=8
LIMIT_NORTH=48
LIMIT_SOUTH=46
LIMIT_EAST=10
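
A minimal sketch for commenting out the four LIMIT_ lines without editing by hand. The function name is hypothetical; it relies on # being the comment character in run.def, as in the ATM_CO2 comment above, and writes the result to standard output so the original file is left untouched.

```shell
# Hypothetical helper: print FILE with any active LIMIT_WEST/NORTH/SOUTH/EAST line commented out.
# Writes to standard output; the original file is not modified.
comment_out_limits() {
    sed -E 's/^([[:space:]]*LIMIT_(WEST|NORTH|SOUTH|EAST)[[:space:]]*=)/# \1/' "$1"
}
# Usage, from the FLUXNET directory:
#   comment_out_limits PARAM/run.def > run.def.new
#   (inspect run.def.new, then move it over PARAM/run.def)
```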

Note that we did not copy SPINUP_PERIOD. This is because it uses a variable that is evaluated during the execution of Job_ENSEMBLE, and therefore we let the script copy the value onto the end of the run.def.

Note that many of the Nitrogen variables above were also in PARAM/orchidee.def! Remove the following from PARAM/orchidee.def:

Nammonium_FILE = ndep_nhx.nc
Nammonium_VAR = nhx

Nnitrate_FILE = ndep_noy.nc
Nnitrate_VAR = noy

Nfert_FILE = NONE
Nfert_VAR = nfer

Nmanure_FILE = NONE
Nmanure_VAR = Nmanure

Nfert_cropland_FILE = nfert_cropland.nc
Nfert_cropland_VAR = nfer

Nmanure_cropland_FILE = nmanure_cropland.nc
Nmanure_cropland_VAR = Nmanure

Nfert_pasture_FILE = nfert_pasture.nc
Nfert_pasture_VAR = Nfer

Nmanure_pasture_FILE = nmanure_pasture.nc
Nmanure_pasture_VAR = Nmanure

Nbnf_FILE= bnf.nc
Nbnf_VAR= BNF_MGN_PERM2_PERYR

Also remove the ATM_CO2 that was already existing in the PARAM/run.def:

ATM_CO2 = _AUTO_: DEFAULT = 350.

For FLUXNET jobs, we generally impose vegetation at the site. While this is set in fluxnet.card in the UserChoices, this doesn't seem to get passed to the run.def in the spinup unless we also place it in the run.def.

IMPOSE_VEG=y

The addition of the orchidee.def and orchidee_pft.def required adding them to the [ParametersFiles] in SPINUP/SUBJOBS/OOL_SEC_STO/COMP/orchidee_ol.card, so that libIGCM copies the new files to the PARAM directory of the running code. It also required changes to the driver, to select from the correct orchidee_pft.def file. To fix this, I simply copied OOL_SEC_STO_FG2/COMP/orchidee_ol.* to SPINUP/SUBJOB/OOL_SEC_STO/COMP/.

cp ../OOL_SEC_STO_FG2/COMP/orchidee_ol.* ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/

This also required adding the following to the [UserChoices] section in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, since the SPINUP/COMP/spinup.driver looks for them:

vi ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card
(add the following two lines to [UserChoices] section)
NORESTART=n
TIMELENGTH=y

Notice that the SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card defines the age classes and PFTs that you will be using. For the moment, we have selected our fluxnet.card to have a certain number of PFTs and age classes, but we have not conveyed this choice to libIGCM in any way. We do that by changing the SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card value of DefSuffix:

vi ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card
DefSuffix = 28pft.1ac

Make sure this matches with the fluxnet.card that you copied at the beginning!
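
One thing that can be checked mechanically is whether a parameter file exists for the chosen suffix. The sketch below uses a hypothetical function name and assumes a single DefSuffix line of the form shown above; it does not verify that the suffix agrees with the number of PFTs in fluxnet.card, which still has to be checked by eye.

```shell
# Hypothetical helper: read DefSuffix from CARD and check that
# PARAM_DIR/orchidee_pft.def_<DefSuffix> exists.
check_def_suffix() {
    suffix=$(sed -n 's/^[[:space:]]*DefSuffix[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1 | tr -d '[:space:]')
    if [ -z "$suffix" ]; then
        echo "no DefSuffix line in $1"
        return 2
    fi
    if [ -f "$2/orchidee_pft.def_${suffix}" ]; then
        echo "found $2/orchidee_pft.def_${suffix}"
    else
        echo "missing $2/orchidee_pft.def_${suffix}"
        return 1
    fi
}
# Usage, from the FLUXNET directory:
#   check_def_suffix ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card PARAM
```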

The script adds any variables in the NameSitesParam keyword of fluxnet.card in the PARAM/run.def. SECHIBA_VEGMAX is currently in PARAM/orchidee_pft.def_*. So, depending on what you have present in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card for DefSuffix, you need to remove the following lines in PARAM/orchidee_pft.def_DefSuffix, and the Job_ENSEMBLE script will add them to the end of the run.def as it copies it around. The specific case of 28 PFTs that we are using here:

emacs PARAM/orchidee_pft.def_28pft.1ac &

(remove the following)
SECHIBA_VEGMAX__01=0.0357142857143
SECHIBA_VEGMAX__02=0.0357142857143
SECHIBA_VEGMAX__03=0.0357142857143
SECHIBA_VEGMAX__04=0.0357142857143
SECHIBA_VEGMAX__05=0.0357142857143
SECHIBA_VEGMAX__06=0.0357142857143
SECHIBA_VEGMAX__07=0.0357142857143
SECHIBA_VEGMAX__08=0.0357142857143
SECHIBA_VEGMAX__09=0.0357142857143
SECHIBA_VEGMAX__10=0.0357142857143
SECHIBA_VEGMAX__11=0.0357142857143
SECHIBA_VEGMAX__12=0.0357142857143
SECHIBA_VEGMAX__13=0.0357142857143
SECHIBA_VEGMAX__14=0.0357142857143
SECHIBA_VEGMAX__15=0.0357142857143
SECHIBA_VEGMAX__16=0.0357142857143
SECHIBA_VEGMAX__17=0.0357142857143
SECHIBA_VEGMAX__18=0.0357142857143
SECHIBA_VEGMAX__19=0.0357142857143
SECHIBA_VEGMAX__20=0.0357142857143
SECHIBA_VEGMAX__21=0.0357142857143
SECHIBA_VEGMAX__22=0.0357142857143
SECHIBA_VEGMAX__23=0.0357142857143
SECHIBA_VEGMAX__24=0.0357142857143
SECHIBA_VEGMAX__25=0.0357142857143
SECHIBA_VEGMAX__26=0.0357142857143
SECHIBA_VEGMAX__27=0.0357142857143
SECHIBA_VEGMAX__28=0.0357142857143

cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
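
Removing the 28 lines by hand is error-prone, so a filter may be easier. This is a minimal sketch with a hypothetical function name; it prints the file without any SECHIBA_VEGMAX__NN lines and leaves the original untouched.

```shell
# Hypothetical helper: print FILE without its SECHIBA_VEGMAX__NN=... lines.
# Writes to standard output; the original file is not modified.
strip_vegmax() {
    sed '/^[[:space:]]*SECHIBA_VEGMAX__[0-9][0-9]*[[:space:]]*=/d' "$1"
}
# Usage, from the FLUXNET directory:
#   strip_vegmax PARAM/orchidee_pft.def_28pft.1ac > pft.def.new
#   (inspect pft.def.new, then move it over the original)
```

Note that the shell pattern PARAM/*def only matches names ending in "def", so it does not pick up files named orchidee_pft.def_28pft.1ac; it may be worth checking whether the edited file also needs to be copied to the SPINUP side.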

I noticed that the following filenames did not match what is written in the [BoundaryFiles] section of the SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card file, which will cause problems later. Make sure the filenames in the run.def, fluxnet.card, and stomate.card all match, and then copy PARAM/*def to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/.

emacs PARAM/run.def &
emacs fluxnet.card &
emacs ../SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.card &

(change the following in the fluxnet.card, and copy to the run.def...stomate.card should be okay, but check)

Nammonium_FILE = ndep_nhx.nc
Nnitrate_FILE = ndep_noy.nc
Nfert_FILE = NONE
Nmanure_FILE = NONE
Nfert_cropland_FILE = nfert_cropland.nc
Nmanure_cropland_FILE = nmanure_cropland.nc
Nfert_pasture_FILE = nfert_pasture.nc
Nmanure_pasture_FILE = nmanure_pasture.nc
Nbnf_FILE= bnf.nc

(now copy the files)
cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/

Similarly, values found in fluxnet.card [UserChoices] seem to be required in SPINUP/COMP/spinup.card, else it crashes. So, assuming that you have made the correct choices in fluxnet.card, just copy the whole [UserChoices] section to the spinup.card.

emacs fluxnet.card &
emacs ../SPINUP/COMP/spinup.card &

(cp all the [UserChoices] variables, making sure none are repeated...I noticed the following had to be added, and the existing values deleted)

CRUP=n
ok_newhydrol=y
impose_veg=y
land_use=n
level_hist=5

Some additional variables which need to be in run.def and not orchidee.def (anything with _AUTO_ or _AUTOBLOCKER_ after it, since the .card files look to run.def to change these values, and they don't look into the included files):

emacs PARAM/orchidee.def &
emacs PARAM/run.def &

(make sure the following are in run.def and not in orchidee.def)

SECHIBA_restart_in = _AUTOBLOCKER_
STOMATE_RESTART_FILEIN = _AUTOBLOCKER_
XIOS_ORCHIDEE_OK = _AUTOBLOCKER_
SECHIBA_HISTFILE2 = _AUTO_
WRITE_STEP = _AUTO_
WRITE_STEP2 = _AUTO_ 
STOMATE_HIST_DT = _AUTO_
STOMATE_IPCC_HIST_DT = _AUTO_
RIVER_DESC = _AUTO_
VEGET_UPDATE = _AUTO_
STOMATE_OK_STOMATE = _AUTOBLOCKER_ 
NINPUT_UPDATE = _AUTO_
STOMATE_IMPOSE_CN = _AUTO_

(remove the following from orchidee.def)
impose_veg=n

(make sure the following is in PARAM/run.def.  Also make sure it is all capitals!)
IMPOSE_VEG=y

(now copy the files)
cp PARAM/*def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/
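
To see which variables in PARAM/orchidee.def still carry an _AUTO_ or _AUTOBLOCKER_ value and therefore need to move to run.def, something like the following sketch can be used. The function name is hypothetical, and it assumes one KEY = value entry per line.

```shell
# Hypothetical helper: list variables in FILE whose value starts with _AUTO_ or _AUTOBLOCKER_.
auto_vars_in() {
    grep -E '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*[[:space:]]*=[[:space:]]*_AUTO(BLOCKER)?_' "$1" |
        sed -E 's/[[:space:]]*=.*//; s/^[[:space:]]*//'
}
# Usage, from the FLUXNET directory (ideally this prints nothing):
#   auto_vars_in PARAM/orchidee.def
```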

Note that we cannot use the analytical spinup at present (the value is changed in the next step). In order to use the analytical spinup, we need to make sure CyclicBegin and CyclicEnd appear in the ${}/STOI/config.card, as ${}/STOI/COMP/stomate.card checks for these values. I have not yet figured out how to do that.

Some variables appear in fluxnet.card, but they are also special variables having an _AUTO_ value in orchidee.def (that we moved to PARAM/run.def). Therefore, remove the following lines from FLUXNET/fluxnet.card and PARAM/run.def (if they exist). We remove the _AUTO_ value of SPINUP_PERIOD since we attempt to calculate that in an automatic way using a different variable in Job_ENSEMBLE, as opposed to in SPINUP/COMP/spinup.driver.

NINPUT_UPDATE=0Y
SPINUP_ANALYTIC=y
SPINUP_PERIOD = _AUTO_
XIOS_ORCHIDEE_OK=n

(now copy the file)
cp PARAM/run.def ../SPINUP/SUBJOB/OOL_SEC_STO/PARAM/

Launch the job (from the README file).

   ./Job_ENSEMBLE fluxnet > out.Job_ENSEMBLE

BE SURE TO CHECK THE USED RUN.DEFs. These can be found by changing to the RUN_DIR when the job is running. The scripts will add flags to the end of the run.def, and sometimes these may conflict with what you want to run.

General steps

If you are more interested in understanding what is going on, if you are using a version of ORCHIDEE not used in the "Practical steps" section, or if the steps in the "Practical steps" section didn't work for you, this section provides general guidance on how to get things up and running. It is complemented by the "Debugging" section below.

A good first test is to see if you can get a SPINUP job working. In other words

In the end, the run.def that gets placed in the run directory is the most important input file, and everything is just processing to get it there. If you end up with a crash of your run and a FLUXNET/FI-HyyFLUXNET/STOI/Debug, this likely means something is wrong in your input file. Find the run directory, and open up the run.def to check that all values have been properly replaced by the libIGCM tools. For example...

grep Cd FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001 

gives the run directory of

IGCM_sys_Cd : /home/scratch01/mmcgrath/RUN_DIR/FI-HyyFLUXNETSTOI.1820

Opening that with vi or emacs shows the following line:

SPINUP_PERIOD='${TIME_YEAR}'

As SPINUP_PERIOD should be an integer, the libIGCM scripts (notably the .card and .driver files) are not properly finding and replacing this value. Searching for SPINUP_PERIOD in the current directory shows two things: that this text is present in the PARAM/run.def and fluxnet.card, and that libIGCM tried to set the value, but failed.
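
A quick way to spot values that were never substituted is to search the used run.def for leftover shell-style references. A minimal sketch, with a hypothetical function name; it only looks for the ${NAME} form seen in the example above.

```shell
# Hypothetical helper: print the lines of FILE (with line numbers) that still
# contain an unexpanded "${NAME}" reference.
unexpanded_vars() {
    grep -n '\${[A-Za-z_][A-Za-z0-9_]*}' "$1"
}
# Usage, from the run directory reported by IGCM_sys_Cd:
#   unexpanded_vars run.def
```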

From the file out.Job_ENSEMBLE

For parameter file run.def
SPINUP_PERIOD=${TIME_YEAR}
2019-12-02 14:56:26 --------Debug2--> ORCHIDEE : SPINUP_PERIOD has already been set in def file.
2019-12-02 14:56:26 --------Debug2--> default value : -1
2019-12-02 14:56:26 --------Debug2--> ORCHIDEE : SPINUP_PERIOD has already been set in run.def file.
2019-12-02 14:56:26 --------Debug2--> default value : -1
2019-12-02 14:56:26 --------Debug2--> script value : 11
2019-12-02 14:56:26 --------Debug2--> USER value : '${TIME_YEAR}'
2019-12-02 14:56:26 --------Debug2--> We will NOT set in again !

What should the value be? Search for the variable in the SPINUP directory.

grep -ir SPINUP_PERIOD ../SPINUP/*

This shows that the value is set in SPINUP/SUBJOB/OOL_SEC_STO/COMP/stomate.driver. Searching the directories for TIME_YEAR shows that this variable is defined in Job_ENSEMBLE.

Debugging

These are some of the errors that I have run into, along with attempts at explaining why and where they may occur, and how to solve them.

Error files can be found in many places, including (assuming a job name of FLUXNET and a site of FI-Hyy):

FLUXNET/out.Job_ENSEMBLE
FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET
FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001
FLUXNET/FI-HyyFLUXNET/STOI/Debug

In my experience, errors come from the following places:

FLUXNET/out.Job_ENSEMBLE: PARAM/run.def
FLUXNET/FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card
FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, fluxnet.card
FLUXNET/FI-HyyFLUXNET/STOI/Debug: SPINUP/SUBJOB/OOL_SEC_STO/COMP/*card, SPINUP/SUBJOB/OOL_SEC_STO/COMP/*driver, PARAM/run.def, or the ORCHIDEE model itself

I would recommend solving the "deepest" error first (e.g., fix an error in the STOI directory before trying to fix an error in out_qsub_FI-HyyFLUXNET).
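
To get an overview of where the exit messages are, all files under a site directory can be searched at once. A minimal sketch with a hypothetical function name; it only looks for the IGCM_debug_Exit marker that appears in the error examples on this page, so failures reported in another form are not found.

```shell
# Hypothetical helper: list every IGCM_debug_Exit message under DIR,
# prefixed by file name and line number, sorted by file name.
find_igcm_exits() {
    grep -rn 'IGCM_debug_Exit' "$1" | sort
}
# Usage, from the FLUXNET directory:
#   find_igcm_exits FI-HyyFLUXNET
```

Files in subdirectories such as STOI show up with longer paths, which makes it easy to pick out the "deepest" error.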

Here are some errors:

In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001

IGCM_debug_Exit :  IGCM_comp_modifyDefFile : The variable XIOS_ORCHIDEE_OK cannot be modified. It should be set to AUTO.

One solution is to modify the file SPINUP/SUBJOB/OOL_SEC_STO/COMP/sechiba.driver such that the following two lines

      IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK y
      ...
      IGCM_comp_modifyDefFile blocker run.def XIOS_ORCHIDEE_OK n

become

      IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK y
      ...
      IGCM_comp_modifyDefFile force run.def XIOS_ORCHIDEE_OK n

If you do this, the value of the variable will be overwritten, so you should confirm that all values which trigger this option (in this case, XIOS=y and XIOS_ORCHIDEE_OK=y) are set to match what you want. In this case, the XIOS value was found in SPINUP/SUBJOB/OOL_SEC_STO/COMP/orchidee_ol.card, PARAM/run.def, and fluxnet.card.

Another error that is found:

In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001

IGCM_debug_Exit :  IGCM_comp_modifyDefFile : Variable STOMATE_OK_STOMATE is not set in correct file. It should be set in run.def.

This is generally a sign that a variable is in PARAM/orchidee.def and it needs to be in PARAM/run.def because libIGCM is trying to modify it, and libIGCM only knows to modify run.def at the moment. You will need to do the same to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/*def.

Another error:

In the file FLUXNET/FI-HyyFLUXNET/STOI/Script_Output_FI-HyyFLUXNETSTOI.000001

IGCM_debug_Exit :  IGCM_comp_modifyDefFile : Error in run.def: Variable=NINPUT_UPDATE is set 2 times

This generally means that a value appears in both PARAM/run.def (likely copied there from fluxnet.card) and PARAM/orchidee.def. You need to delete the line in PARAM/orchidee.def, and then copy the whole PARAM directory to SPINUP/SUBJOB/OOL_SEC_STO/PARAM/.
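
Duplicates of this kind can be found before launching by listing every variable that is set more than once across the parameter files. A minimal sketch with a hypothetical function name; it assumes one KEY=value entry per line, compares names case-sensitively, and also reports a variable set twice within a single file.

```shell
# Hypothetical helper: print each variable that is set more than once across the
# given def files, followed by the files where it occurs.
duplicate_keys() {
    awk '
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ {
            key = $0
            sub(/[ \t]*=.*/, "", key)
            sub(/^[ \t]*/, "", key)
            count[key]++
            where[key] = where[key] " " FILENAME
        }
        END {
            for (key in count)
                if (count[key] > 1) print key ":" where[key]
        }
    ' "$@" | sort
}
# Usage, from the FLUXNET/PARAM directory:
#   duplicate_keys run.def orchidee.def orchidee_pft.def_28pft.1ac
```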

Another error:

In the file FI-HyyFLUXNET/out_qsub_FI-HyyFLUXNET

2019-12-02 13:26:41 --Debug1--> Check coherence between SeasonalFrequency and PeriodLength
2019-12-02 13:26:41 --------Debug2--> IGCM_post_CheckModuloFrequency : Master=10Y Slave=11Y
2019-12-02 13:26:41 --Debug1--> config_UserChoices_PeriodLength frequency 11Y not compatbile with
2019-12-02 13:26:41 --Debug1--> config_Post_SeasonalFrequency frequency : 10Y
IGCM_debug_Exit :  Check your frequency

The Job_ENSEMBLE script takes information from a variety of sources. In this case, it appears to take information from the SPINUP and SPINUP/SUBJOBS/OOL_SEC_STO directories. The script attempts to change the time series write frequency to match that of the FLUXNET forcing data file length (11 years in this case), but we had left the following line in SPINUP/config.card

TimeSeriesFrequency=10Y

which leads to libIGCM getting confused. The solution is to replace "10Y" in the line above with "NONE". You must then go into the config/ORCHIDEE_OL/SPINUP directory, remove the Job_JOBNAME file, and redo the ../../../libIGCM/ins_job command to create a new Job_JOBNAME file.

Another error:

IGCM_debug_Exit :  IGCM_comp_modifyDefFile : Error in run.def: Variable=STOMATE_HIST_DT is set 2 times

Either you added a variable to FLUXNET/PARAM/run.def and forgot to delete it from FLUXNET/PARAM/orchidee.def or FLUXNET/PARAM/orchidee_pft.def_28pft.1ac, or the variable appears in FLUXNET/PARAM/*def and is added to the run.def a second time by one of the scripts. The ensemble script does not appear to do this (even though a line written in the run.def indicates it happens, the script crashes if the option doesn't already exist), so the more likely source is the spinup (SubJobParams in ${}/COMP/spinup.card or ${}/COMP/spinup.driver).

Cleaning

If an ENSEMBLE run crashes, it can sometimes be difficult to clean up all the files so that you can easily relaunch the run after figuring out what went wrong. In particular, each site creates a new directory, which can add up to a lot of directories. It's possible that some of your runs overlap, too (i.e., they use the same base directory, but the current run only uses forested sites, while a different run used agricultural sites). There may be a libIGCM tool that does this well, but if you aren't familiar with it, here is a short script that works. Copy it to your submission directory (i.e., where you launch the ./Job_ENSEMBLE script), make it executable (e.g., chmod +x clean.sh), and launch it before re-launching the run (e.g., ./clean.sh).

#!/usr/bin/env bash
# Remove the per-site submission directories and output of a crashed ENSEMBLE run.
simulation="FLUXNET"
basedir="/home/scratch01/mmcgrath/IGCM_OUT/OL2/PROD/ensemble/"
sites=( FI-Hyy FI-Sod )

for site in "${sites[@]}"
do
    # Submission directory created in the launch directory
    rm -fr "${site}${simulation}"
    # Output directories under basedir, with and without the simulation subdirectory
    rm -fr "${basedir}${site}${simulation}"*
    rm -fr "${basedir}${simulation}/${site}${simulation}"*
    echo "$simulation $site"
done

# Log written by "./Job_ENSEMBLE fluxnet > out.Job_ENSEMBLE"
rm -fr out.Job_ENSEMBLE

All you need to do is modify the site list, basedir and simulation variables for your particular run.

Speed

Some timing tests were carried out with TAG2.1, TRUNK (r6096), and CAN (r6091) on obelix. This revealed the importance of the NBUFF=0 keyword for running with FLUXNET data for a single site. When running for a single site with forcing that has lower temporal resolution (e.g., CRUNCEP, which has six-hourly resolution instead of the 30 min resolution of FLUXNET), it's much less important. The amount of data output for all runs was adjusted to give approximately the same size of files. The optimized executables were used for all tests (-O3).

I take timings from four locations: CPU Time Global and Real Time Global from out_orchidee, and then the real and user times reported by time -p ./orchidee_ol. For the most part, they are similar. For clarity, I only report Real Time Global from out_orchidee below. Error bars are the standard deviation from 5 independent runs to show the variance.

The TRUNK and TAG21 have 15 PFTs, CAN has 28 PFTs, but they are all set to zero except for NeedleleafEvergreenTemperate (PFT4...4), Deciduous temperate (PFT6...8), C3Grass (PFT10...23), and C3Crop (PFT12...26), which are all set to 0.25. I wanted to simulate a somewhat realistic pixel with a mix of vegetation.

Using NBUFF=1

FLUXNET forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM, the total time is [in seconds, with standard deviation from five runs on obelix]

TAG21       1270 ± 60
TRUNK       4600 ± 600
CAN         5800 ± 500

CRUNCEP forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM

TAG21       1310 ± 70
TRUNK       1700 ± 100
CAN         1810 ± 90

Using NBUFF=0

FLUXNET forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM, the total time is [in seconds, with standard deviation from five runs on obelix]

TAG21       1250 ± 140
TRUNK       1480 ± 90
CAN         1700 ± 200

CRUNCEP forcing, XIOS, half-hour sechiba history, one day stomate history, 5 years, no libIGCM

TAG21       1310 ± 50
TRUNK       1440 ± 110
CAN         1700 ± 200
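
To put the effect of NBUFF in one number per model, the sketch below divides the mean FLUXNET-forcing run time with NBUFF=1 by the one with NBUFF=0, using the values tabulated above. The function name is hypothetical, and the quoted standard deviations are ignored, so the ratios are only indicative.

```shell
# Ratio of mean run times (NBUFF=1 over NBUFF=0) for the FLUXNET-forcing runs.
# Only the mean times are used; the quoted standard deviations are ignored.
nbuff_ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

echo "TAG21 $(nbuff_ratio 1270 1250)"   # prints "TAG21 1.0"
echo "TRUNK $(nbuff_ratio 4600 1480)"   # prints "TRUNK 3.1"
echo "CAN   $(nbuff_ratio 5800 1700)"   # prints "CAN   3.4"
```

In other words, with FLUXNET forcing the TRUNK and CAN runs are roughly three times slower with NBUFF=1, while TAG21 is essentially unaffected.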