wiki:Documentation/UserGuide/CheckList


Step by step guidelines to prepare your simulation: configuration, input files, keywords, etc.

First Author: A. Ducharne in 2019/01/22
With contributions from J. Ghattas, P. Peylin, F. Maignan, M. MacGrath, B. Guenet
Reviewed by P. Arboleda in May 2020

The goal of this howto is to summarize what you should not forget when running a simulation (for students and even for more experienced users).

The examples on this page focus on offline simulations with libIGCM ("infrastructure developed at IPSL to access, launch and chain simulations and post-processing for a typical simulation"). Links to particular files are given for tag2.0 (created in March 2018 for CMIPv1).

This checklist is also valid in other cases (LMDZOR, no libIGCM), although the details will differ.

1) On which machine do I want to run?

1a) On a machine maintained by the IPSL team

It can be at TGCC (irene), LSCE (obelix), IDRIS (jean-zay), IPSL mesocentre (ciclad/climserv), or anywhere else libIGCM is maintained by the IPSL team. You first need to obtain a login. At TGCC and IDRIS, you need to be part of an existing project or set up a new one to obtain a login. You also need to contribute to the annual request for computing resources and to the annual report of CPU time consumed during the previous year.

Before starting to install the model, you must set up the proper environment (paths, modules).

Read more here, which covers all the standard platforms supported by the IPSL: https://forge.ipsl.jussieu.fr/igcmg_doc/wiki/DocBenv

1b) Else you're pretty much on your own

You can also run on your local Linux machine, but in this case, you will have to manage your running environment yourself: libraries, compiler, etc. Find some information here, in addition to some of the links below (Installation and compilation). Please note that if you choose this option you may receive less support from the ORCHIDEE group. Unless you have a specific reason, this option is not recommended.

2) Which code do I want to run?

2a) Choosing a revision

The best way is to use SVN to download a referenced version of the code (trunk, tag, branch, perso). By using SVN, you can easily switch between model versions (more information on SVN is found in this PDF), as well as incorporate bug fixes and updates from others. In this case, each version is referenced by a revision number, which corresponds to different source code, but also to potentially different "keywords" and their default values (see 4). Therefore, only simulations run with identical revisions can be expected to give identical results.

Some revisions are "tagged" (e.g., ORCHIDEE 2.0). A tagged revision is considered stable and does not change. Other revisions are likely undergoing continual development and modification, with occasional commits breaking them. Branches are "forks" from the main code (referred to as the "trunk"), in which developers are adding new features. It takes significant effort and knowledge of the code base to merge some of these features, which means not every feature is present in every revision of ORCHIDEE. Selecting a revision and branch should therefore take into account the research questions you are trying to address. These pages contain further information on the trunk as well as the branches.

The ORCHIDEE model is flexible, with many features controlled by input files. The model also requires additional information (e.g., spatial and temporal extents) and input data (often called "forcing files", including meteorological forcing data) in order to run. The collection of input files that direct model behavior is referred to as the configuration. More information on configurations is found below.

ADVICE: If you want to run two versions of ORCHIDEE (e.g. a new development and the reference version it originates from), you are strongly advised to create two different directories (e.g. NEW/modipsl/... and REF/modipsl/...). The same applies if you want to run two simulations with the same code except for hard-coded parameters (but in this case, it is often better to take advantage of the "externalization" feature and define the parameters as keywords that can be set in your PARAM/run.def, see 4b, 4c).

2b) Code options not decided by a revision

The code of ORCHIDEE includes many "code" options, i.e. options to execute some parts of the code or not. These options are controlled by some configuration files, mostly config.card and PARAM/run.def (see 4b). Since these options are not hard-coded, you can change them without recompiling. But they control your results, and you need to know them to explain your model version.

These options generally depend on the branch of ORCHIDEE you are using. As a (non-exhaustive) example, the following ones are available in the CMIP6 trunk (now tagged 2.0):

  • use the STOMATE module, owing to which vegetation grows and responds to climatic conditions (STOMATE_OK_STOMATE, defined in config.card or PARAM/run.def)
  • activate the dynamic global vegetation model (DGVM), where the fraction of a grid cell covered by vegetation changes with climatic conditions (STOMATE_OK_DGVM in PARAM/run.def)
  • use the old or new driver (for offline simulations, to deal with meteorological input files): the default is the old driver but, since tag2.1, a special configuration is proposed to use the new driver (see also 4a below). More details here.
  • activate dynamic nitrogen cycle, where nitrogen scarcity can limit vegetation growth (STOMATE_IMPOSE_CN in PARAM/run.def, default in the trunk since 3.0)
  • allow vegetation to burn in fires (FIRE_DISABLE in PARAM/run.def)
  • activate soil_freezing, etc. (more details on flags and keywords in 4c)

NOTE: not all of these components can be activated at the same time and still give reliable results. Besides, some options depend on others (example). It's important to test the configuration you select.
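As an illustration of how such options can be inspected before a run, here is a minimal sketch that reads y/n flags from run.def-style text. It assumes the file is made of "KEYWORD = value" lines with "#" comments; the excerpt, the helper names parse_run_def and is_enabled, and the accepted spellings of "yes" are hypothetical and not taken from the ORCHIDEE sources.

```python
def parse_run_def(text):
    """Parse 'KEYWORD = value' lines, ignoring blank lines and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().upper()] = value.strip()
    return params


def is_enabled(params, keyword, default=False):
    """Interpret a y/n flag; fall back to `default` when the keyword is absent."""
    value = params.get(keyword.upper())
    if value is None:
        return default
    # Assumption: these spellings all mean "activated".
    return value.lower() in ("y", "yes", "true", ".true.")


# Hypothetical excerpt of a PARAM/run.def file.
example = """
# carbon cycle on, static vegetation
STOMATE_OK_STOMATE = y
STOMATE_OK_DGVM = n   # no dynamic vegetation
"""

params = parse_run_def(example)
for flag in ("STOMATE_OK_STOMATE", "STOMATE_OK_DGVM"):
    print(flag, "->", is_enabled(params, flag))
```

A keyword that is absent from the file falls back to the default passed by the caller, which is why it matters to know the defaults of the revision you are running.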

2c) If you changed the code

Your developments must comply with ORCHIDEE's coding guidelines.

You are advised to test your code as much as possible before running large simulations:

  • compile and run over a couple of days in debug mode (for long simulations over large domains, debug mode is not advised as it makes the execution much slower)
  • if your model crashes, turn to the Debug section of the documentation

If your model runs, check it runs correctly:

  • conservation, restartability (1+1=2)
  • verify that your results make sense compared to the revision you based your developments on
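
The restartability check (1+1=2) can be sketched on a toy model: running two consecutive periods, with the second one started from the restart state of the first, should give exactly the same final state as one continuous run. The one-variable model below is purely illustrative and has nothing to do with ORCHIDEE's equations; only the structure of the test carries over.

```python
def step(state, forcing):
    """Advance a toy one-variable model by one time step (illustrative only)."""
    return 0.9 * state + forcing


def run(state, forcings):
    """Run the toy model over a sequence of forcing values; return the final state."""
    for forcing in forcings:
        state = step(state, forcing)
    return state


forcings = [1.0, 0.5, 2.0, 0.25]

# Reference: one continuous run over the whole period.
continuous = run(0.0, forcings)

# "1+1": run the first half, keep its final state as a restart,
# then run the second half starting from that restart.
restart = run(0.0, forcings[:2])
chained = run(restart, forcings[2:])

# The comparison is exact on purpose: a restartable model reproduces
# the continuous run bit for bit, not just approximately.
print("restartable:", chained == continuous)
```

With real outputs, the same idea applies to every variable of the final restart or history file: any difference suggests that some state is not saved in, or not read back from, the restart.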

3) Installation and compilation

Read here about how to download ORCHIDEE for offline (i.e., driven by separate meteorological forcing files) use: Documentation/UserGuide/InstallingORCHIDEEBasic. Information for downloading the CAN and CNP branches of ORCHIDEE also exists and can likely be adapted for any branch, but note that permission to access branches is generally restricted.

The same method is used for offline and coupled configurations, where "coupled" typically refers to coupling ORCHIDEE to an atmospheric model that generates meteorological data at every time step. Find the full documentation about the installation and use of IPSL-cmc coupled models here, or go directly to a brief description of the LMDZOR_v6 configuration (LMDZ coupled to ORCHIDEE).

Always recompile your code if you make changes in the Fortran code. Further information on compiling ORCHIDEE can be found here.

4) Configuration of your simulation

4a) The simplest option if you use libIGCM is to use a predefined configuration

They were defined for the reference simulations, each corresponding to a different submission directory in modipsl/config/ORCHIDEE_OL/:

  • ENSEMBLE to run an ensemble of 1D simulations at FLUXNET sites (more details)
  • SPINUP_ANALYTIC_FG1, OOL_SEC_STO_FG1trans, OOL_SEC_STO_FG2, which are designed to be used one after the other to run a simulation with CRU-NCEP atmospheric forcing and the long spinup required for the carbon cycle (more details)
  • OOL_SEC_STO_FG3, for a run without spin-up (default restart state) under the WFDEI_GPCC atmospheric forcing
  • OOL_SEC_STO_FG3nd, for a run like FG3 but with the new driver developed by Jan Polcher (this configuration is available starting at tag2.1)
  • OOL_SEC, without STOMATE (no dynamic phenology nor nutrient cycles).

In the above names, "OOL" in a directory title refers to "ORCHIDEE offline", "SEC" refers to the SEChiba module in charge of the water and energy cycles, and "STO" refers to the STOmate model controlling vegetation and nutrient cycles. OOL_SEC_STO, therefore, means the configuration is designed to use meteorological forcing files to drive the water, energy, and nutrient cycles.

4b) To create your own configuration, the simplest way is to modify an existing configuration: copy the corresponding directory with a new name in your modipsl/config/ORCHIDEE_OL/ and change what you want:

  • config.card => to choose the name, length, restart of your simulation (more details)
  • PARAM/run.def => to choose the options of your simulation, via the "keywords". These keywords can be flags (y or n to activate them or not) or parameter values that you can change without recompiling the model. The number of these keywords is huge, and they have a large effect on the outcome of your simulation. The default values depend on the revision (see 2). More details on run.def; see 4c for a list of keywords.
  • COMP/orchidee_ol.card => defines the meteorological forcing files and xml files (the latter of which are only used with XIOS; ORCHIDEE can also be run without XIOS, though it is not recommended)
  • COMP/sechiba.card => defines the other input files, and some important user options for sechiba:
    • do we change the vegetation map every year or not (VEGET_UPDATE)?
    • frequency of the output and name of the corresponding output files
    • the last part of the file is about post-processing
  • COMP/stomate.card => defines some important user options for stomate:

ADVICE: Create a new directory in your modipsl/config/ORCHIDEE_OL/ for each different run you want to keep in the end. This is the case, for instance, if you want to run one version of ORCHIDEE with different parameter sets, using the keywords to define the parameters.
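
Copying an existing configuration under a new name, as described in 4b, can be done by hand or scripted. The sketch below is one possible way to do it in Python; the function name clone_configuration and the directory name MY_EXPERIMENT are hypothetical, and the function deliberately refuses to overwrite an existing directory so that a run you want to keep is never clobbered.

```python
import shutil
from pathlib import Path


def clone_configuration(config_root, source_name, new_name):
    """Copy an existing configuration directory under a new name; return the new path."""
    source = Path(config_root) / source_name
    target = Path(config_root) / new_name
    if not source.is_dir():
        raise FileNotFoundError(f"no such configuration: {source}")
    if target.exists():
        raise FileExistsError(f"refusing to overwrite: {target}")
    shutil.copytree(source, target)  # copies config.card, PARAM/, COMP/, ...
    return target


# Example call (paths are illustrative; adapt them to your installation):
# clone_configuration("modipsl/config/ORCHIDEE_OL", "OOL_SEC_STO_FG3", "MY_EXPERIMENT")
```

After the copy, the files to edit are the ones listed above (config.card, PARAM/run.def and the COMP/*.card files).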

4c) More on keywords

The externalized parameters are parameters or flags that are not hard-coded and can be changed via an input file, namely PARAM/run.def. The advantage is that you can test various sets of parameters with the same executable and without recompiling. It can be particularly useful for parameter calibration.

These parameters are linked to "keywords", which appear in the code in capital letters and are fed with the parameter values by a "getin" call.

The list of the keywords related to externalized parameters is different for different versions of the code:

WARNING: some options or parameters depend on other options. It's important to test the configuration you select. Example.

USEFUL TIP: after your simulation has run, you have a summary of externalized parameters and options in your output directory: SRF/Debug/xxx_used_ref.def
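
One way to use such a summary is to compare the keywords actually used by two simulations. The sketch below assumes that the summary files use the same "KEYWORD = value" layout as run.def, which should be checked against your own files; the helper names and the example values are hypothetical.

```python
def read_keywords(text):
    """Collect 'KEYWORD = value' pairs, ignoring blank lines and '#' comments."""
    keywords = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            keywords[key.strip()] = value.strip()
    return keywords


def diff_keywords(ref, new):
    """Return {keyword: (ref_value, new_value)} for every keyword that differs."""
    changed = {}
    for key in sorted(set(ref) | set(new)):
        if ref.get(key) != new.get(key):
            changed[key] = (ref.get(key), new.get(key))  # None means "absent"
    return changed


# Hypothetical contents of two parameter summaries.
ref_text = """
STOMATE_OK_DGVM = n
VEGET_UPDATE = 0Y
"""
new_text = """
STOMATE_OK_DGVM = y
VEGET_UPDATE = 0Y
FIRE_DISABLE = y
"""

for key, (old, new) in diff_keywords(read_keywords(ref_text), read_keywords(new_text)).items():
    print(f"{key}: {old} -> {new}")
```

In practice the two texts would be read from the summary files of the reference and the new simulation, for example with pathlib.Path(...).read_text().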

4d) Spinup and initialization

A simulation requires initial conditions, which are defined by section #D-- Restarts - in config.card.

There are two options: Restart = n ("from scratch", with arbitrary initial values) or Restart = y (state variables from a pre-existing simulation are used as initial conditions; this only works if the two simulations have the same horizontal and vertical resolution and the same processes, and thus the same state variables). In both cases, some warmup or spinup is usually needed, unless you are simply continuing a simulation.

Read more on the ways to set up your spinup on Spinup : why, how and how long? The answer depends on the ORCHIDEE component. It is also possible to do an analytical spinup that is faster.

4e) Output

ORCHIDEE output is controlled by separate modules, referred to as XIOS ([here PDF] and webpage) and IOIPSL. These modules greatly improve reading and writing data for large simulations, although the benefits may not be seen for single pixel, single CPU runs.

If you want to change the variables that are output by your simulation, read the dedicated section History/output files.

5) Run your simulation

libIGCM is the environment that makes it easier to run simulations: gathering the required input files, copying them to the run directory, running the simulation, and then storing the output, in particular for multiple year simulations where you would otherwise have to manually copy restart files and name them appropriately. The libIGCM section of the HowTo has additional information about manipulating libIGCM in special cases.

It is possible to run simple offline test cases and larger parallel jobs without libIGCM. This can be useful when you are debugging your code.

Job, queues, etc. https://forge.ipsl.jussieu.fr/igcmg_doc/wiki/DocFsimu
LSCE specific: batch system & jobs submission https://intranet.lsce.ipsl.fr/informatique/en/calcul/batch.php

If your simulation crashes:

6) How to get some help?

6a) Look at the documentation and howto pages, training sessions, ...

6b) Ask your supervisor or close collaborators

6c) Email to orchidee-help at listes.ipsl.fr

7) Analyse your results

If your simulation(s) have run, you probably want to look at it(them)...

  • Model outputs are in NetCDF format. You can find here some information on how to look at NetCDF data
  • Basic checks (global mean values compared to reference simulations and observations; water conservation and twbr, etc.)
  • Some validation data have been collected and harmonized by the ORCHIDEE group, see more info on Documentation/Validation
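
The basic checks listed above can be sketched in a few lines. The example below computes an area-weighted global mean on a regular latitude-longitude grid (weights proportional to the cosine of latitude) and a simple water budget residual. It is a minimal illustration on plain Python lists: the function names, the masking convention (None for cells without data) and the sign convention of the budget are assumptions, not the definitions used inside ORCHIDEE.

```python
import math


def global_mean(field, lats):
    """Area-weighted mean of field[i][j] on a regular lat-lon grid.

    field: list of rows, one row per latitude; None marks a masked cell.
    lats:  latitude of each row, in degrees.
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, lats):
        weight = math.cos(math.radians(lat))  # cell area shrinks towards the poles
        for value in row:
            if value is None:
                continue
            total += weight * value
            weight_sum += weight
    return total / weight_sum


def water_budget_residual(precip, evap, runoff, storage_change):
    """Residual of the water balance; it should be close to zero if water is conserved."""
    return precip - evap - runoff - storage_change


# Tiny illustrative grid: two latitude bands, one masked cell.
field = [[2.0, 2.0], [4.0, None]]
lats = [0.0, 60.0]
print("global mean:", global_mean(field, lats))
print("residual:", water_budget_residual(3.0, 1.5, 1.0, 0.5))
```

All terms of the budget must be expressed in the same units and over the same period before the residual is interpreted.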

8) How to archive your developments?

If your developments proved interesting, the ORCHIDEE community would be happy to benefit from them:

  • Presentation of your results (share your report or paper, invite the group to your seminar/defense, present a talk at an ORCHIDEE weekly or DEV meeting)
  • Backup your development via svn on a branch or perso directory
  • For inclusion in the trunk, this has to be proposed to the ORCHIDEE-Project group, and a specific trusting protocol will be implemented