

Step by step guidelines to prepare your simulation: configuration, input files, keywords, etc.

Author: A. Ducharne in 2019/01/22
With contributions from J. Ghattas, P. Peylin, F. Maignan, M. McGrath, B. Guenet (April 2020)
Last revision: P. Arboleda (May 2020), S. Luyssaert (June 2020)

Objective

The objective of this page is to sum up what you need to remember when running a simulation. It is written for students but also for more experienced users. The examples on this page are oriented towards offline simulations with libIGCM ("infrastructure developed at IPSL to access, launch and chain simulations and post-processing for a standard simulation"). Links to particular files are given for tag2.0 (created in March 2018 for CMIPv1). This checklist is largely valid for other model configurations such as LMDZOR, or when running without libIGCM, but in those cases the details will most likely not hold.

1) On which machine do I want to run?

1a) On a machine maintained by the IPSL team

The big advantage of using one of the machines maintained by the IPSL team is that you will have access to all essential libraries (e.g. XIOS) and data files (different climate forcing files, different PFT maps, different soil maps, etc.). Such machines are at TGCC (irene), LSCE (obelix), IDRIS (jean-zay) and the IPSL mesocentre (ciclad/climserv), i.e. anywhere libIGCM is maintained by the IPSL team. You first need to obtain a login, and restrictions apply. At TGCC and IDRIS, you need to be part of an existing project and working in France to be eligible for a login. You will also have to contribute to the annual request for computing resources and to the annual report of CPU time consumed in the previous year.

Once you have obtained a login for any of the aforementioned servers, you must set up the proper environment (paths, modules) before you can correctly install the model. Read more here, which covers all the standard platforms supported by the IPSL: https://forge.ipsl.jussieu.fr/igcmg_doc/wiki/DocBenv

1b) Else you're pretty much on your own

You can also run on your local Linux machine, but in this case you will have to manage your running environment yourself: libraries, compiler, etc. Find some information here, in addition to some of the links below (Installation and compilation). Please note that if you choose this option you may receive less support from the ORCHIDEE group. Unless you have a specific reason, this option is not recommended.

2) Which code do I want to run?

2a) Choosing a revision

The best way is to use SVN to download a referenced version of the code (trunk, tag, branch, perso). By using SVN, you can easily switch between model versions (more information on SVN is found in this PDF), as well as incorporate bug fixes and updates from others. Each version is referenced by a revision number, which corresponds to different source code, but also to potentially different "keywords" and default values (more in 4). Therefore, only tests run with identical revisions can be expected to give identical results.

Some revisions are "tagged" (e.g., ORCHIDEE 2.0). A tagged revision is considered stable and does not change. Other revisions are likely undergoing continual development and modification, with occasional commits breaking them. Branches are "forks" from the main code (referred to as the "trunk"), in which developers are adding new features. It takes significant effort and knowledge of the code base to merge some of these features, which means not every feature is present in every revision of ORCHIDEE. Selecting a revision and branch should therefore take into account the research questions you are trying to address. These pages contain further information on the trunk, Tag2.0 as well as the branches.

The ORCHIDEE model is flexible, with many features controlled by input files. The model also requires additional information (e.g., spatial and temporal extents) and input data (often called "forcing files", including meteorological forcing data) in order to run. The collection of input files that direct model behavior is referred to as the configuration. More information on configurations is found below.

ADVICE: If you want to run two versions of ORCHIDEE (e.g. a new development and the reference version it originates from), you are strongly advised to create two different directories (e.g. NEW/modipsl/... and REF/modipsl/...). The same holds if you want to run two simulations with the same code except for hard-coded parameters (but in this case, it is often better to take advantage of the "externalization" feature and turn the parameters into keywords that can be set in your PARAM/run.def, see 4b, 4c).

2b) Code options not decided by a revision

The code of ORCHIDEE includes many "code" options, i.e. options to execute some parts of the code or not. These options are controlled by some configuration files, mostly config.card, PARAM/run.def (for older versions of the model including Tag2.0) and PARAM/orchidee.def (for the trunk; see 4b). Since these options are not hard-coded, you can change them without recompiling. But they affect your results, and you need to know them to explain your model version.

These options depend in general on the branch of ORCHIDEE you are using. As a (non-exhaustive) example, the following ones are available in tag 2.0 (which is the version used for CMIP6):

  • use the STOMATE module, owing to which vegetation grows and responds to climatic conditions (STOMATE_OK_STOMATE, defined in config.card or PARAM/run.def)
  • activate the dynamic global vegetation model (DGVM), where the fraction of a grid cell covered by vegetation changes with climatic conditions (STOMATE_OK_DGVM in PARAM/run.def)
  • use the old or new driver (for offline simulations, to deal with meteorological input files): the default is the old driver but, since tag2.1, a special configuration is proposed to use the new driver (see also 4a below). More details here.
  • activate dynamic nitrogen cycle, where nitrogen scarcity can limit vegetation growth (STOMATE_IMPOSE_CN in PARAM/run.def, default in the trunk since 3.0)
  • allow vegetation to burn in fires (FIRE_DISABLE in PARAM/run.def)
  • activate soil_freezing, etc... (more details on flags and keywords in 4c)

A rather comprehensive list of options that were added to the trunk since tag 2.0 can be found here.

NOTE: not all of these components can be activated at the same time and still give reliable results. Besides, some options depend on some others (example). It's important to test the configuration you select.
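A dependency between options can be checked before submitting a job. The sketch below is only an illustration: the `REQUIRES` table and the helper names are hypothetical, and the single rule it contains (the DGVM needs STOMATE to be active) is an assumption to verify against the revision you use. Flags missing from the dictionary are treated as off, which is a simplification since real defaults depend on the revision.

```python
# Hypothetical dependency table: each flag maps to the flags it needs.
# The DGVM -> STOMATE rule is an assumption used for illustration only.
REQUIRES = {
    "STOMATE_OK_DGVM": ["STOMATE_OK_STOMATE"],
}


def is_on(options, name):
    """Return True when a y/n flag is set to 'y' (missing flags count as off)."""
    return str(options.get(name, "n")).strip().lower() in ("y", "yes")


def find_conflicts(options, requires=REQUIRES):
    """List (flag, missing_prerequisite) pairs for every activated flag."""
    conflicts = []
    for flag, needed in requires.items():
        if is_on(options, flag):
            for prerequisite in needed:
                if not is_on(options, prerequisite):
                    conflicts.append((flag, prerequisite))
    return conflicts


if __name__ == "__main__":
    chosen = {"STOMATE_OK_STOMATE": "n", "STOMATE_OK_DGVM": "y"}
    for flag, prerequisite in find_conflicts(chosen):
        print(f"{flag} is activated but {prerequisite} is not")
```

Such a check does not replace a test run; it only catches combinations you already know to be inconsistent.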

2c) If you changed the code

Your developments must comply with ORCHIDEE's coding guidelines.

You are advised to test your code as much as possible before running large simulations:

  • compile and run over a couple of days in debug mode (for long simulations over large domains, debug mode is not advised as it makes the execution much slower)
  • anticipate that a lot of time and effort will be required to make the model run globally. A successful pixel run is by no means a valid indication that the model works at large scales. There are typically many exceptions that need to be dealt with.
  • if your model crashes, turn to the Debug section of the documentation

If your model runs, check it runs correctly:

  • water, carbon, nitrogen and energy must be conserved, the model must be restartable (1+1=2), and the results must be identical irrespective of the number of processors that were used in the calculations.
  • think of sanity checks and ensure that when the model is run without your changes, the results have not changed compared to the revision you based your developments on.
  • verify that your results make sense compared to the revision you based your developments on
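The restartability and conservation checks listed above can be sketched as follows. This is a minimal illustration, assuming the variable of interest has already been extracted from the output files into plain Python lists; the function names are hypothetical and not part of ORCHIDEE.

```python
def max_abs_diff(a, b):
    """Largest absolute difference between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences differ in length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def is_restartable(continuous_run, chained_run, tol=0.0):
    """1+1=2 test: two chained one-year runs must match one two-year run.

    The same comparison applies to runs made with different numbers
    of processors.
    """
    return max_abs_diff(continuous_run, chained_run) <= tol


def water_budget_residual(precip, evap, runoff, storage_change):
    """Residual of P - E - R - dS (same units for all terms).

    The residual should be close to zero when water is conserved.
    """
    return precip - evap - runoff - storage_change


if __name__ == "__main__":
    two_year = [0.10, 0.25, 0.31]
    one_plus_one = [0.10, 0.25, 0.31]
    print("restartable:", is_restartable(two_year, one_plus_one))
    print("water residual:", water_budget_residual(800.0, 500.0, 290.0, 10.0))
```

A tolerance of zero is the default because the 1+1=2 test expects identical results; relax it only if you have a documented reason.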

3) Installation and compilation

Read here about how to download ORCHIDEE for offline (i.e., driven by separate meteorological forcing files) use: Documentation/UserGuide/InstallingORCHIDEEBasic. Information for downloading the CNP branch of ORCHIDEE also exists and can likely be adapted for any branch, but note that permission to access branches is generally restricted.

The same method is used for offline and coupled configurations, where "coupled" typically refers to coupling ORCHIDEE to an atmospheric model that generates meteorological data at every time step. Find the full documentation about the installation and use of IPSL-cmc coupled models here, or go directly to a brief description of the LMDZOR_v6 configuration (LMDZ coupled to ORCHIDEE).

Always recompile your code if you make changes in the Fortran code. Further information on compiling ORCHIDEE can be found here.

4) Configuration of your simulation

4a) The simplest option is to use predefined libIGCM configuration

Several of these configurations are used in the reference simulations, each corresponding to a different submission directory in modipsl/config/ORCHIDEE_OL/. A detailed description of each of these configurations is given here:

  • ENSEMBLE to run at FLUXNET sites
  • SPINUP_ANALYTIC_FG1, OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, which are designed to be used one after the other to run a simulation with CRU-NCEP atmospheric forcing and the long spinup required for the carbon and nitrogen cycles (more details)
  • OOL_SEC_STO_FG3, for a run without spin-up (default restart state) under the WFDEI_GPCC atmospheric forcing
  • OOL_SEC_STO_FG3nd, for a run like FG3 but with the new driver developed by Jan Polcher (this configuration is available starting at tag2.1)
  • OOL_SEC_STO_FG4 and OOL_SEC_STO_FG5 (available since r6616) to use age classes, species-level PFTs, diameter classes, forest management, litter raking, species changes, management changes, etc
  • OOL_SEC, without STOMATE (no dynamic phenology nor nutrient cycles).

In the above names, "OOL" in a directory title refers to "ORCHIDEE offline", "SEC" refers to the SEChiba module in charge of the water and energy cycles, and "STO" refers to the STOmate model controlling vegetation and nutrient cycles. OOL_SEC_STO, therefore, means the configuration is designed to use meteorological forcing files to drive the water, energy, and nutrient cycles.

4b) To create your own configuration, the simplest way is to modify an existing configuration: copy the corresponding directory with a new name in your modipsl/config/ORCHIDEE_OL/ and change what you want:

  • config.card => to choose the name, length, restart of your simulation (more details)
  • PARAM/run.def, PARAM/orchidee.def and PARAM/orchidee_pft.def => Prior to r6616 the information contained in these three files was combined in a single file called PARAM/run.def. Since r6616 three different files have been defined to better match what is done in LMDZOR. The concept and many details on these three files can be found here. The basic idea is that these files enable users to choose the options of their simulation via "keywords". A keyword can be a flag (y or n, to activate an option or not) or a parameter value that you can change without recompiling the model. The number of keywords is huge, and they have a large effect on the outcome of your simulation. The default values depend on the revision (see 2). More details are available on run.def; see 4c for a list of keywords.
  • COMP/orchidee_ol.card => defines the meteorological forcing files and xml files (the latter of which are only used with XIOS; ORCHIDEE can also be run without XIOS, though it is not recommended). The list of available atmospheric forcing datasets is regularly updated here.
  • COMP/sechiba.card => defines the other input files (list of available datasets), and some important user options for sechiba:
    • do we change the vegetation map every year or not (VEGET_UPDATE)?
    • frequency of the output and name of the corresponding output files
    • the last part of the file is about post-processing
  • COMP/stomate.card => defines some important user options for stomate.

ADVICE: Create a new directory in your modipsl/config/ORCHIDEE_OL/ for each different run you want to keep in the end. This is the case, for instance, if you want to run one version of ORCHIDEE with different parameter sets, using the keywords to define the parameters.

4c) More on keywords

The externalized parameters are parameters or flags that are not hard-coded and can be changed via input files, namely PARAM/run.def, PARAM/orchidee.def and PARAM/orchidee_pft.def (prior to r6616 these files were combined in a single file called PARAM/run.def). The advantage is that you can test various sets of parameters with the same executable and without recompiling. This can be particularly useful for parameter calibration.

These parameters are linked to "keywords", which appear in the code in capital letters and are fed with the parameter values by a "getin" call.

The list of keywords related to externalized parameters differs between versions of the code.
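The keyword mechanism can be pictured with a small Python stand-in. It is not the Fortran "getin" routine itself: the file syntax (`KEYWORD = value`, with `#` starting a comment) and the rule that a keyword absent from the file keeps its coded default are assumptions of this sketch, and `EXAMPLE_PARAM` is a made-up keyword.

```python
def parse_def(text):
    """Parse 'KEYWORD = value' lines; '#' starts a comment (assumed syntax)."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip().upper()] = value.strip()
    return values


def getin(values, keyword, default):
    """Return the file value converted to the type of `default`.

    When the keyword is absent, the coded default is kept unchanged.
    """
    if keyword not in values:
        return default
    raw = values[keyword]
    # bool must be tested first because bool is a subclass of int
    if isinstance(default, bool):
        return raw.lower() in ("y", "yes", "true", ".true.")
    return type(default)(raw)


if __name__ == "__main__":
    sample = "STOMATE_OK_STOMATE = y  # activate STOMATE\nEXAMPLE_PARAM = 0.5\n"
    keywords = parse_def(sample)
    print(getin(keywords, "STOMATE_OK_STOMATE", False))
    print(getin(keywords, "EXAMPLE_PARAM", 1.0))
    print(getin(keywords, "NOT_IN_FILE", 3))
```

The point of the pattern is that changing a value in the file changes the behaviour of the run while the executable stays the same.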

WARNING: some options or parameters depend on other options. It's important to test the configuration you select. Example.

USEFUL TIP: after your simulation has run, you have a summary of externalized parameters and options in your output directory: SRF/Debug/xxx_used_ref.def
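The summary file is convenient for comparing two simulations. The sketch below lists the keywords whose values differ between two such summaries; it assumes each relevant line has the form `KEYWORD = value` and that lines starting with `#` are comments, which you should check against your own files. The function names are hypothetical.

```python
def read_keywords(text):
    """Collect KEYWORD = value pairs, skipping blank lines and '#' comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


def diff_keywords(reference, experiment):
    """Map each keyword whose value differs to (reference, experiment).

    A keyword present in only one of the two sets is reported with None
    on the side where it is missing.
    """
    changed = {}
    for key in sorted(set(reference) | set(experiment)):
        old, new = reference.get(key), experiment.get(key)
        if old != new:
            changed[key] = (old, new)
    return changed


if __name__ == "__main__":
    ref = read_keywords("# reference run\nSTOMATE_OK_STOMATE = y\nSTOMATE_OK_DGVM = n\n")
    new = read_keywords("# experiment\nSTOMATE_OK_STOMATE = y\nSTOMATE_OK_DGVM = y\n")
    for key, (old, value) in diff_keywords(ref, new).items():
        print(f"{key}: {old} -> {value}")
```

In practice the two texts would be read from the summary files of the two runs, for example with `open(path).read()`.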

4d) Spinup and initialization

A simulation requires initial conditions, which are defined by section #D-- Restarts - in the config.card.

There are two options: Restart = n ("from scratch", with arbitrary initial values) vs. Restart = y (state variables from a pre-existing simulation are used as initial conditions; this only works if the two simulations have the same horizontal and vertical resolution, and the same processes and thus the same state variables). In both cases, we usually need some spinup, unless we are just continuing a simulation. Read more on the ways to set up your spinup in Spinup: why, how and how long? The answer depends on the ORCHIDEE component. It is also possible to do an analytical spinup, which is faster.
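To check which of the two options a given config.card selects, the file can be read with Python's standard `configparser`. The fragment in `SAMPLE_CARD` is illustrative only: the section and key names (`[Restarts]` with `OverRule`, a component section `[SRF]` with `Restart`) are assumptions of this sketch, so compare them with the config.card of your own configuration before relying on it.

```python
import configparser

# Illustrative fragment only; section and key names are assumptions.
SAMPLE_CARD = """
#D-- Restarts -
[Restarts]
OverRule=n

[SRF]
Restart=n
"""


def starts_from_restart(card_text, component="SRF"):
    """Return True when either the general or the component flag is 'y'."""
    # interpolation=None keeps characters such as '%' or '$' untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(card_text)
    general = parser.get("Restarts", "OverRule", fallback="n")
    local = parser.get(component, "Restart", fallback="n")
    return "y" in (general.strip().lower(), local.strip().lower())


if __name__ == "__main__":
    print("from restart:", starts_from_restart(SAMPLE_CARD))
```

Lines beginning with `#`, such as the `#D--` headers, are treated as comments by `configparser`, so they do not need to be stripped beforehand.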

4e) Output

ORCHIDEE output is controlled by separate modules, referred to as XIOS (see the PDF and the webpage) and IOIPSL. These modules greatly improve reading and writing data for large simulations, although the benefits may not be seen for single-pixel, and thus single-CPU, runs.

If you want to change the variables that are output by your simulation, read the dedicated section History/output files.

5) Run your simulation

libIGCM is the environment that makes it easier to run simulations: gathering the required input files, copying them to the run directory, running the simulation, and then storing the output, in particular for multiple year simulations where you would otherwise have to manually copy restart files and name them appropriately. The libIGCM section of the HowTo has additional information about manipulating libIGCM in special cases.
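The chaining that libIGCM automates can be pictured as a list of periods in which each period starts from the restart file written by the previous one. The sketch below is a conceptual illustration, not libIGCM code; the job name and the restart file naming are hypothetical.

```python
def plan_periods(first_year, last_year, job_name="MYRUN"):
    """One entry per simulated year; each year restarts from the previous one."""
    plan = []
    previous_restart = None  # None means the first period starts from scratch
    for year in range(first_year, last_year + 1):
        restart_out = f"{job_name}_{year}1231_restart.nc"  # hypothetical name
        plan.append(
            {"year": year, "restart_in": previous_restart, "restart_out": restart_out}
        )
        previous_restart = restart_out
    return plan


if __name__ == "__main__":
    for period in plan_periods(1901, 1903):
        print(period["year"], period["restart_in"], "->", period["restart_out"])
```

Doing this bookkeeping by hand for a multi-year simulation is where copying or naming mistakes tend to creep in, which is the main reason to let libIGCM handle it.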

It is possible to run simple offline test cases and larger parallel jobs without libIGCM. This can be useful when you are debugging your code.

Job, queues, etc. https://forge.ipsl.jussieu.fr/igcmg_doc/wiki/DocFsimu
LSCE specific: batch system & jobs submission https://intranet.lsce.ipsl.fr/informatique/en/calcul/batch.php

If your simulation crashes, turn to the Debug section of the documentation.

6) How to get some help?

6a) Look at the documentation and howto pages, training sessions, ...

6b) Ask your supervisor or close collaborators

6c) Email to orchidee-help at listes.ipsl.fr

7) Analyse your results

If your simulations have run, you probably want to look at the results:

  • Model outputs are in NetCDF format. You can find here some information on how to look at NetCDF data
  • Basic checks (global mean values compared to reference simulations and observations; water conservation and twbr, etc.)
  • Some validation data have been collected and harmonized by the ORCHIDEE group, see more info on Documentation/Validation
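A global mean, as mentioned in the basic checks above, should be area-weighted because grid cells shrink towards the poles. The sketch below assumes a regular latitude-longitude grid already loaded into nested Python lists, with `None` marking cells without data; it ignores the land fraction of each cell, which a careful comparison against reference values would also need to take into account. The function name is hypothetical.

```python
import math


def global_mean(field, latitudes):
    """Area-weighted mean of field[i][j] on a regular latitude-longitude grid.

    Row i of `field` lies at latitudes[i] (degrees); each cell is weighted
    by the cosine of its latitude. None marks cells without data.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, latitudes):
        weight = math.cos(math.radians(lat))
        for value in row:
            if value is None:
                continue
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no valid cells")
    return weighted_sum / weight_total


if __name__ == "__main__":
    field = [[1.0, 1.0], [3.0, None]]
    print(global_mean(field, [0.0, 60.0]))
```

With real output, the field and latitudes would come from the NetCDF files rather than from hand-written lists.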

8) How to archive your developments?

If your developments proved interesting, the ORCHIDEE community would be happy to benefit from them:

  • Present your results (share your report or paper, invite the group to your seminar/defense, give a talk at an ORCHIDEE weekly or DEV meeting)
  • Back up your development via SVN on a branch or perso directory
  • Inclusion in the trunk has to be proposed to the ORCHIDEE-Project group, and a specific trusting protocol will be implemented