
Version 13 (modified by dgoll, 3 years ago)

--

Spin up with a Machine Learning approach

What is it about?

Aim: develop a spinup acceleration procedure that is independent of the model version. The idea is to develop a Python tool set that can be applied to the ORCHIDEE family of models.

How can I contribute to this effort?

Please contact D. Goll if you want to join. Some examples of contributions we would benefit from are:

  • data from conventional spinup simulations
  • expertise in how to link it to other tools, such as libIGCM, ORCHIDAS, etc.
  • expertise in how to host/distribute/maintain the software
  • expertise in machine learning and Python

Task force members

Daniel Goll, Yan Sun, Jinfeng Chang, Yilong Wang, Yuanyuan Huang, Vladislav Bastrikov, Nicolas Viovy, Matt McGrath

Status reports

26/01/2021

  • DONE: Proof of concept for ORCHIDEE-CNP v1.2
  • ONGOING: Finding a common setup for pixel selection applicable to all ORCHIDEE versions
  • ONGOING: Collecting data from other ORCHIDEE versions for testing
  • ONGOING: Translating the MATLAB code into Python
  • ONGOING: Cleaning the code
  • ONGOING: Recruiting task force members

16/02/2021

Yan gave a presentation on progress with the Python coding, results on CNP and trunk, and the timeline for the next 2 months.

  • Input files: restart + climate forcing (not the hist file, as ORCHIDEE might introduce noise)
  • K-means clustering: add a plot of the total distance vs. k to monitor whether the number of clusters is well chosen (part of the monitoring information for the user)
  • Add checks and quality statistics to monitor whether each step performs well, and stop the procedure if results fail minimum quality criteria (e.g. stop if the machine learning fails to predict the training pixels)
  • Externalize all parameters of the routines into one file.
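The "total distance vs. k" diagnostic from the K-means item above can be sketched as follows. This is a minimal illustration, not the code of the task force tools: it uses a small pure-Python Lloyd's k-means on synthetic two-dimensional "pixels", and all names (`kmeans_total_distance`, `elbow_curve`, the synthetic data) are hypothetical. In practice a library implementation would likely be used and the resulting values passed to a plotting routine.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_total_distance(points, k, n_iter=50, seed=0):
    """Run Lloyd's k-means; return the total within-cluster squared distance."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centres from k distinct pixels
    for _ in range(n_iter):
        # Assignment step: attach every pixel to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centers.append(mean)
            else:
                new_centers.append(center)  # keep the centre of an empty cluster
        if new_centers == centers:
            break  # converged: centres no longer move
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def elbow_curve(points, k_values, seed=0):
    """Total distance for each candidate k (the data behind the monitoring plot)."""
    return {k: kmeans_total_distance(points, k, seed=seed) for k in k_values}


# Synthetic stand-in for pixel features (assumption: two features per pixel).
rng = random.Random(42)
blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pixels = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in blob_centres for _ in range(30)]

curve = elbow_curve(pixels, range(1, 7))
for k, total in curve.items():
    print(f"k={k}: total distance = {total:.1f}")
```

A k beyond which the total distance stops dropping noticeably is a reasonable candidate; since k-means depends on its initialisation, repeating the run with several seeds gives a more robust curve.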

Work distribution:

  • Matt: provide trunk v4.0 data (EQ files, plus results from 200 yr after a start from scratch without analytical spinup)
  • Yilong: refine and extend the code of tools 1 and 2
  • Everyone: run tests with the refined tools for other forcings
  • Yan will focus next month on her PhD defence (20 March)

03/03/2021

  • A first version of the Python tools is available for testing
  • Yilong gave an overview

Next steps:

  • put code and documentation on GitHub (Daniel, Vlad, Yilong)
  • add documentation on how to run the tools; adapt them to other models (Yan, Yilong)
  • everyone attempts to run the tools with their own model data (keep a log on GitHub of which model data were used)

Information/suggestions on running the tools:

  • user specification files: more information is needed, e.g. which file name corresponds to equilibrium information and which to information from the transient run (Yan)
  • things to improve: figure labelling, user specification file (simplify)
  • try to use qsub to avoid blocking nodes on obelix

16/03/2021

  • GitHub has been set up, and some initial tests and exchanges were done
  • next: everyone tries and tests the tool on the two available datasets (CNP, trunk) and reports bugs, improvements, etc. on GitHub
  • ongoing: acquire data from other models and model versions: CABLE, ORCHIDEE-MICT, ORCHIDEE-<any>
  • next meeting will be scheduled after discussion with Yan after her defence

01/04/2021

  • GitHub code status: YY could run the code; DG ran some tests with modified inputs; all detected (minor) problems are listed as issues on GitHub
  • TODO1 (Yan): provide information in the README on how to insert data from other simulations; separate the user specification files into experiment-specific settings (e.g. path to model output, forcing period (for tool 3), etc.) and model-version-specific settings (e.g. CNP, MICT, Trunk, CABLE, etc.).
  • TODO2 (Yan): provide a tool 2 output which condenses the information from the current multiple files into a single file.
  • TODO3 (Yan): work on the manuscript (including results from tests with other model versions, if feasible from TODO4, and with CABLE)
  • TODO4 (YY, DG, all): test tools 1 and 2 when TODO1 and TODO2 are ready.
  • TODO5 (DG): discuss the running scripts with the project team.
  • TODO6 (Yan): code an evaluation tool (tool 3); check criteria are (1) high priority (total land C stock), (2) medium priority (land C stock per pixel), (3) others / drift over the forcing period (i.e. climate loop).
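The three check criteria listed for the evaluation tool could be computed along the following lines. This is only a sketch under stated assumptions, not the design of tool 3: the function and field names, the 10% pixel tolerance, and the definition of drift as the relative change in total stock between the first and last year of the climate loop are all hypothetical, and stocks are assumed to be given as one already area-integrated value per pixel.

```python
from dataclasses import dataclass


@dataclass
class EvaluationReport:
    total_stock_error: float    # (1) relative error of the summed land C stock
    pixel_fail_fraction: float  # (2) share of pixels outside the pixel tolerance
    relative_drift: float       # (3) relative change of total stock over the loop


def evaluate_spinup(predicted, reference, loop_totals, pixel_tol=0.10):
    """Compare ML-predicted C stocks with a conventional-spinup reference.

    predicted, reference: per-pixel land C stocks (same order, same units).
    loop_totals: total land C stock for each year of one forcing (climate) loop.
    pixel_tol: hypothetical relative tolerance for the per-pixel check.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("predicted and reference must be non-empty and equally long")
    if len(loop_totals) < 2:
        raise ValueError("loop_totals needs at least two years to measure drift")

    total_pred, total_ref = sum(predicted), sum(reference)
    total_stock_error = abs(total_pred - total_ref) / abs(total_ref)

    # Count pixels whose absolute error exceeds the relative tolerance.
    failing = sum(1 for p, r in zip(predicted, reference)
                  if abs(p - r) > pixel_tol * abs(r))
    pixel_fail_fraction = failing / len(reference)

    relative_drift = (loop_totals[-1] - loop_totals[0]) / abs(loop_totals[0])
    return EvaluationReport(total_stock_error, pixel_fail_fraction, relative_drift)


# Made-up numbers purely for illustration.
report = evaluate_spinup(
    predicted=[10.0, 20.5, 29.0, 45.0],
    reference=[10.0, 20.0, 30.0, 40.0],
    loop_totals=[104.5, 104.6, 104.4, 104.7],
)
print(report)
```

Keeping the three numbers separate, rather than merging them into one score, mirrors the priority ordering above: a run could first be accepted or rejected on the total stock and only then inspected per pixel and for drift.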