One plus One or the challenge of restartability

Author: A.S. Lansø and S. Luyssaert
Last revision: 2020/02/28, J. Lathière
Last check: 2020/04/20, D. Goll

In rare cases, bug fixes or newly implemented code unintentionally introduce problems with reproducibility or restartability (1+1=2). These problems are often related to incorrect variable dimensions in different subroutines, memory issues, or variables missing from the restart files. Such issues are easier to catch early. To minimize the time spent debugging reproducibility and 1+1=2 issues, run the following simple tests before each commit of substantial code changes:
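As an illustration of how a variable missing from the restart file breaks 1+1=2, here is a minimal toy sketch. It is not ORCHIDEE code: the model, the variable names (pool, memory) and the helper functions are all hypothetical and only mimic the write-restart/read-restart cycle.

```python
# Toy model, NOT ORCHIDEE code: every name below is hypothetical.

def step(state):
    """Advance the toy model by one time step."""
    # 'memory' is a slowly varying term that influences the next 'pool' value.
    memory = 0.5 * state["memory"] + 0.5 * state["pool"]
    pool = state["pool"] + 0.1 * memory + 1.0
    return {"pool": pool, "memory": memory}


def run(state, n_steps):
    """Run n_steps in one go, without writing a restart in between."""
    for _ in range(n_steps):
        state = step(state)
    return state


def write_restart(state, saved_vars):
    """Keep only the variables that are written to the restart file."""
    return {name: state[name] for name in saved_vars}


def read_restart(restart, defaults):
    """Variables missing from the restart fall back to their initial values."""
    state = dict(defaults)
    state.update(restart)
    return state


def chained(initial, saved_vars):
    """One step, write restart, read restart, one more step."""
    first = run(initial, 1)
    restart = write_restart(first, saved_vars)
    return run(read_restart(restart, initial), 1)


initial = {"pool": 1.0, "memory": 0.0}

one_run = run(initial, 2)                        # reference: 2 steps at once
complete = chained(initial, ["pool", "memory"])  # full state in the restart
incomplete = chained(initial, ["pool"])          # 'memory' was forgotten

print(complete == one_run)    # True: 1+1=2 holds
print(incomplete == one_run)  # False: 1+1=2 is broken
```

The comparison uses exact equality on purpose: a restarted run is expected to repeat the same floating-point operations as the single run, so any difference, however small, points to missing or wrongly restored state.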

1+1=2

If you do not run these tests globally, make sure to use impose_veg=y. The standard F2 run.def settings have been tested and satisfy 1+1=2 from revision r6272 onwards, so always run the tests with the standard settings. If you use other run.def settings during your developments, run the same tests with those settings as well. More recent tests have shown that 1+1=2 also holds for LCC at the global scale with r6279.

The standard tests

1) 1Y vs. 12*1M: run two simulations covering one full year, one with a period length of 1 year and the other with a period length of 1 month. Afterwards, compare their final restart files from both stomate and sechiba.

Most issues should be caught by this first test. If a problem shows up, debugging becomes easier once you have tracked down when the restart files start to differ (e.g. start of year, onset of growing season, end of year).

To narrow down the onset, continue with tests such as:

2) 1D+1D=2D (compare the final restart files)

3) 1M+1M=2M (compare the final restart files)
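The comparison step in these tests can be sketched as follows. This is only an illustration of the logic: plain Python dictionaries stand in for the contents of two restart files, the variable names are hypothetical, and reading real NetCDF files would need a NetCDF tool or library that is not shown here.

```python
import math


def values_match(x, y):
    """Exact equality, with NaN at the same position counted as identical."""
    return x == y or (math.isnan(x) and math.isnan(y))


def differing_variables(restart_a, restart_b):
    """Return {variable name: short description} for variables that differ."""
    report = {}
    for name in sorted(set(restart_a) | set(restart_b)):
        if name not in restart_a or name not in restart_b:
            report[name] = "present in only one file"
            continue
        a, b = restart_a[name], restart_b[name]
        if len(a) != len(b):
            report[name] = "different sizes"
            continue
        n_diff = sum(1 for x, y in zip(a, b) if not values_match(x, y))
        if n_diff:
            report[name] = f"{n_diff} of {len(a)} values differ"
    return report


# Hypothetical stand-ins for the final restart of a 1Y run and a 12*1M run.
nan = float("nan")
restart_1y = {"biomass": [1.0, 2.0, 3.0], "lai": [0.5, 0.5, 0.5],
              "gdd": [10.0, nan, 12.0]}
restart_12m = {"biomass": [1.0, 2.0, 3.0], "lai": [0.5, 0.6, 0.5],
               "gdd": [10.0, nan, 12.0]}

print(differing_variables(restart_1y, restart_12m))
# {'lai': '1 of 3 values differ'}
```

An empty report means the two restarts are identical, i.e. the test passes. No tolerance is applied, because 1+1=2 requires the files to match exactly.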

How to compare netcdf files

See the corresponding section.

Debugging suggestions:

  • If possible, limit the spatial domain (to maximize speed).
  • Track down the onset of the deviation between the restart files.
  • Track down the problem. Hopefully, the differences in the restart files will give you a clue as to which variable to investigate first. The best approach depends on the source of the problem (memory issue, variables missing from the restart file, etc.). For a memory issue, a debugger could be the best choice. For variables missing from the restart file, it is best to run two identical simulations with different period lengths, either manually or with Totalview, while tracking down which variables cause the differences.
  • Once you have fixed the problem, verify that the fix also holds at the global scale (i.e. run the global tests again if you zoomed in on a smaller region).
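Tracking down the onset of the deviation can be organized as a bisection over the simulation length, which keeps the number of test runs small. The sketch below assumes that once the single run and the chained run deviate, they stay different for all longer simulations. The callback runs_match is hypothetical: in practice it would launch both simulations for n periods and compare their final restart files, whereas here it is replaced by a stand-in.

```python
def first_deviating_length(max_periods, runs_match):
    """Smallest n in 1..max_periods for which runs_match(n) is False.

    Returns None if the runs still match at max_periods.
    Assumes that a deviation, once present, persists in all longer runs.
    """
    if runs_match(max_periods):
        return None
    low, high = 1, max_periods  # invariant: runs_match(high) is False
    while low < high:
        mid = (low + high) // 2
        if runs_match(mid):
            low = mid + 1   # deviation starts after 'mid'
        else:
            high = mid      # deviation starts at or before 'mid'
    return low


# Stand-in for the real test: pretend the restarts start to differ on day 135.
calls = []

def fake_runs_match(n_days):
    calls.append(n_days)
    return n_days < 135


onset = first_deviating_length(365, fake_runs_match)
print(onset)       # 135
print(len(calls))  # number of test runs needed, far fewer than 365
```

Knowing the first deviating day (or month) makes it easier to relate the problem to an event in the simulation, such as the start of the year or the onset of the growing season.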
