Version 10 (modified by smueller, 3 years ago)

*ENHANCE-05_SimonM-Harmonic_Analysis*


The PI is responsible for closely following the progress of the action, and especially for contacting the NEMO project manager if the delay in the preview (or review) is longer than the expected 2 weeks.

## Summary

The harmonic-analysis diagnostics available in the current reference code are limited to two-dimensional fields (surface only), are activated via a preprocessor key, use unconventional namelist parameter names, use a mixture of dynamic and static allocation for large arrays, and appear to be computationally inefficient. Further, while being based on multiple linear regression, the current implementation does not provide for regressions on harmonic components other than tidal constituents.

This action will replace the current tidal harmonic-analysis diagnostics with a generic implementation for multiple linear regression analysis that can be utilised for both tidal harmonic and non-tidal regression analyses. This implementation will provide harmonic-analysis diagnostics enhancements previously tested in a pre-4.0beta NEMO version by N. Bruneau: the analysis of three-dimensional fields, analysis across model restarts, and improved computational efficiency.

In contrast to both the existing harmonic analysis diagnostics in the reference NEMO code and the enhanced pre-4.0beta version by N. Bruneau, the new implementation will make extensive use of XIOS and an off-line tool. This approach should make it possible to simplify the regression analysis-related Fortran module in the core NEMO code, to relocate the regression-analysis configuration to XIOS configuration files, and to enable the selection of any model field handled by XIOS for analysis.

See ticket *#2175*.

## Preview

### Current implementation of tidal harmonic-analysis diagnostics and available enhancements

Tidal harmonic analysis of model fields based on multiple linear least-squares regression is available in the current trunk version of NEMO in module `diaharm` (source:/NEMO/trunk/src/OCE/DIA/diaharm.F90@10835). It is restricted to the analysis of sea-surface height and barotropic velocity fields and to a hard-coded maximum number of time slices within uninterrupted model runs, and its memory footprint during the whole analysis period can be very large. The current implementation has aspects that are deprecated (use of a preprocessor key, non-conventional namelist variable names, and static allocation of potentially oversized arrays) and appears to be computationally inefficient. Further, it lacks sought-after features, such as the analysis of a wider range of model fields (including three-dimensional fields) and the possibility to span the analysis time period across restarted model runs, which have become available in more efficient alternative implementations of the same analysis method by N. Bruneau (pers. comm.) and E. O'Dea (https://forge.ipsl.jussieu.fr/nemo/browser/branches/UKMO/dev_5518_tid_analysis_restart) in previous NEMO versions. In addition, the harmonic-analysis diagnostics of module `diaharm` remain deactivated, and thus their compilability untested, during the standard tests of the SETTE test suite.

### Proposed replacement of the tidal harmonic-analysis diagnostics by a generic formulation for multiple linear regression analysis

Rather than adapting and upgrading an existing implementation of harmonic-analysis diagnostics (the current implementation in the trunk version or an existing alternative implementation) for the latest reference version of NEMO, it is proposed to develop generic multiple linear least-squares regression analysis diagnostics for NEMO, as outlined below, in the form of a new module `diamlr` (source:/NEMO/trunk/src/OCE/DIA/diamlr.F90), additional XIOS configuration, and an off-line tool. These new diagnostics would readily allow for tidal harmonic analysis, so module `diaharm` (source:/NEMO/trunk/src/OCE/DIA/diaharm.F90) could be removed. In addition to tidal harmonic analysis, the new development would facilitate non-tidal applications, such as the identification of the seasonal cycle or linear trends in model fields.

### The use of XIOS for regression diagnostics

In general, least-squares linear regression can be formulated in terms of scalar products of all possible pairings between the dependent variable, |y>, and the regressors, |x_{m}> (the index identifies the regressor), where each vector component represents the corresponding value at each of the time steps included in the analysis. In particular, during the model run, it suffices to accumulate the scalar products <x_{m}|y> and <x_{m}|x_{n}> by summing up the respective products for each time step; at the end of the analysis interval, which does not need to be known in advance, the regression analysis can be finalised using pre-computed scalar products.
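The accumulation described above can be illustrated with a minimal sketch in plain Python. It is not the proposed Fortran or XIOS implementation. The function names, the test signal, and the frequency are hypothetical and chosen only for illustration. At each time step only the running sums for <x_{m}|x_{n}> and <x_{m}|y> are updated. The regression coefficients are obtained at the end by solving the normal equations built from these sums.

```python
import math

def accumulate(sums, x, y):
    """Add one time step's contribution to the running scalar products.

    sums["xx"][m][n] accumulates <x_m|x_n>, sums["xy"][m] accumulates <x_m|y>.
    x is the list of regressor values and y the dependent variable at this step.
    """
    for m in range(len(x)):
        sums["xy"][m] += x[m] * y
        for n in range(len(x)):
            sums["xx"][m][n] += x[m] * x[n]

def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with pivoting."""
    size = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * size
    for r in range(size - 1, -1, -1):
        known = sum(aug[r][c] * beta[c] for c in range(r + 1, size))
        beta[r] = (aug[r][size] - known) / aug[r][r]
    return beta

def regress_demo(n_steps=48):
    """Fit a constant plus one sine/cosine pair to a synthetic signal."""
    omega = 2.0 * math.pi / 12.0  # hypothetical frequency: 12 steps per cycle
    n_reg = 3
    sums = {"xx": [[0.0] * n_reg for _ in range(n_reg)], "xy": [0.0] * n_reg}
    for step in range(n_steps):
        t = float(step)
        x = [1.0, math.sin(omega * t), math.cos(omega * t)]
        y = 2.0 + 0.5 * math.sin(omega * t) - 1.5 * math.cos(omega * t)
        accumulate(sums, x, y)  # the time series itself is never stored
    # Finalisation: only the accumulated scalar products are needed.
    return solve(sums["xx"], sums["xy"])
```

In this sketch the number of time steps does not have to be known before the loop starts, which mirrors the point made above about the analysis interval.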

The computation of the regressors formulated as functions of time, the computation of the scalar products, and the output of the scalar products in model runs with enabled linear regression analysis can be delegated to the I/O server XIOS. This would have three main benefits:

- as the fields of dependent variables selected for regression analysis are typically already available in XIOS processes for regular model output, the NEMO processes would no longer have to access the large fields selected for analysis directly;

- the potentially large additional amounts of temporary storage required for the computation of the scalar products would be provided by XIOS processes, and so the memory footprint of the NEMO processes would hardly be affected when enabling linear regression analysis; and

- output files of partial scalar products for partial model runs can readily be generated by XIOS and re-combined later (through addition) in order to prepare sets of scalar products for various analysis intervals, including analysis intervals that span across model restarts.
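The last benefit rests on the fact that scalar products over disjoint sets of time steps are additive. The following Python sketch illustrates this for a straight-line fit with the two regressors 1 and t. The function names and the synthetic signal are hypothetical, and the two segments merely stand in for two model runs separated by a restart. The time t is assumed to continue across the restart rather than starting again from zero.

```python
def scalar_products(times, values):
    """Accumulate the scalar products for the regressors x1 = 1 and x2 = t."""
    p = {"11": 0.0, "1t": 0.0, "tt": 0.0, "1y": 0.0, "ty": 0.0}
    for t, y in zip(times, values):
        p["11"] += 1.0
        p["1t"] += t
        p["tt"] += t * t
        p["1y"] += y
        p["ty"] += t * y
    return p

def combine(a, b):
    """Re-combine two sets of partial scalar products by addition."""
    return {key: a[key] + b[key] for key in a}

def fit_line(p):
    """Solve the 2x2 normal equations for intercept and slope."""
    det = p["11"] * p["tt"] - p["1t"] ** 2
    slope = (p["11"] * p["ty"] - p["1t"] * p["1y"]) / det
    intercept = (p["1y"] - slope * p["1t"]) / p["11"]
    return intercept, slope

def signal(t):
    """Synthetic dependent variable (hypothetical): a linear trend."""
    return 3.0 + 0.25 * t

times = [float(k) for k in range(20)]
# Two partial runs with continuous time, and the uninterrupted run for comparison.
first = scalar_products(times[:10], [signal(t) for t in times[:10]])
second = scalar_products(times[10:], [signal(t) for t in times[10:]])
full = scalar_products(times, [signal(t) for t in times])
combined = combine(first, second)
```

Here `fit_line(combined)` and `fit_line(full)` give the same coefficients, because `combined` and `full` hold the same sums.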

### Configuration interface, internal XIOS configuration, and model time

It is proposed that the user configuration of the multiple linear regression diagnostics (definition of regressors, selection of fields for analysis, and output frequency) be provided entirely in the form of additional XIOS configuration.

A field definition for each regressor would define the arithmetic expression as a function of the model time (`diamlr_time`) and, if required, use placeholders for parameters of tidal constituents. As an example, a configuration section similar to

    <field id="diamlr_time" grid_ref="diamlr_grid" />
    <field id="diamlr_r001" field_ref="diamlr_time" expr="diamlr_time^0.0" enabled=".TRUE." />
    <field id="diamlr_r002" field_ref="diamlr_time" expr="diamlr_time^1.0" enabled=".TRUE." />
    <field id="diamlr_r003" field_ref="diamlr_time" expr="diamlr_time^2.0" enabled=".TRUE." />
    <field id="diamlr_r004" field_ref="diamlr_time" expr="sin( 1.992384990861e-7 * diamlr_time )" enabled=".TRUE." />
    <field id="diamlr_r005" field_ref="diamlr_time" expr="cos( 1.992384990861e-7 * diamlr_time )" enabled=".TRUE." />
    <field id="diamlr_r006" field_ref="diamlr_time" expr="__TDE_M2_amplitude__ * sin( __TDE_M2_omega__ * diamlr_time + __TDE_M2_phase__ )" enabled=".TRUE." />
    <field id="diamlr_r007" field_ref="diamlr_time" expr="__TDE_M2_amplitude__ * cos( __TDE_M2_omega__ * diamlr_time + __TDE_M2_phase__ )" enabled=".TRUE." />

would define seven regressors: three regressors required to fit a polynomial of degree 2; two orthogonal, sinusoidal regressors required to fit the seasonal cycle; and two orthogonal, sinusoidal regressors of the M2 tidal-constituent frequency with parameters provided by the tidal-forcing implementation in NEMO through substitution of placeholders during model initialisation.
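As a plausibility check on the seasonal-cycle regressors, the short Python sketch below converts the angular frequency used in `diamlr_r004` and `diamlr_r005` into a period. The result is a period of 365 days if `diamlr_time` is expressed in seconds. That unit is an inference from the numerical value and is not stated explicitly in this proposal. The function names are hypothetical. The M2 regressors are left out because their parameters are placeholders that NEMO would substitute.

```python
import math

# Angular frequency copied from the expressions of diamlr_r004 and diamlr_r005.
OMEGA_SEASONAL = 1.992384990861e-7  # assumed unit: rad/s

def seasonal_period_days(omega=OMEGA_SEASONAL):
    """Period in days of a sinusoid with angular frequency omega (rad/s)."""
    return 2.0 * math.pi / omega / 86400.0

def polynomial_and_seasonal_regressors(t):
    """Values of diamlr_r001 to diamlr_r005 at model time t (assumed seconds).

    Note: Python evaluates 0.0 ** 0.0 as 1.0, so the constant regressor
    is 1 at t = 0 in this sketch.
    """
    return [
        t ** 0.0,
        t ** 1.0,
        t ** 2.0,
        math.sin(OMEGA_SEASONAL * t),
        math.cos(OMEGA_SEASONAL * t),
    ]

def sin_cos_scalar_product(n_samples=8760, dt=3600.0):
    """<sin|cos> accumulated over hourly samples spanning one 365-day year."""
    return sum(
        math.sin(OMEGA_SEASONAL * k * dt) * math.cos(OMEGA_SEASONAL * k * dt)
        for k in range(n_samples)
    )
```

Over a whole number of periods the scalar product of the sine and cosine regressors is zero up to rounding, which is the sense in which the two regressors are orthogonal.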

The fields selected for the analysis could be listed as dedicated fields that make reference to fields available for model output. As an example, a configuration similar to

    <field id="diamlr_f001" field_ref="ssh" enabled=".TRUE." />
    <field id="diamlr_f002" field_ref="toce" enabled=".TRUE." />

would select the sea-surface height and potential temperature fields for regression analysis.

Overall, the frequency of intermediate data output would be set using a configuration line similar to

    <file_group id="diamlr_files" output_freq="1y" enabled=".TRUE." />

which would result in the output of annual scalar products.

In order to process and output the full set of scalar products required for the least-squares linear regression analysis, however, a more complex XIOS configuration is required. This complex configuration can be derived from the simple user configuration outlined above. Therefore, during the initialisation of the relevant XIOS context in NEMO, the regression-specific user configuration can be read and the full configuration can in turn be generated using the XIOS API. Currently, XIOS contexts are defined and **closed** in subroutine `iom_init` of module `iom` (source:/NEMO/trunk/src/OCE/IOM/iom.F90@10817); the new implementation would modify module `iom` so that closing the XIOS context can optionally be deferred, enabling other subroutines to add to or modify the XIOS configuration.
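To give an idea of the size of the derived configuration, the following Python sketch enumerates the scalar products that would have to be accumulated for a given user configuration. It only lists pairs of identifiers. How the corresponding XIOS fields would be named or created through the XIOS API is not specified here, and the function name is hypothetical. The sketch relies on the symmetry <x_{m}|x_{n}> = <x_{n}|x_{m}>, so each pair of regressors is needed only once.

```python
from itertools import combinations_with_replacement

def required_scalar_products(regressor_ids, field_ids):
    """List the pairs whose scalar products the regression analysis needs.

    Returns (xx, xy): xx holds the unordered regressor pairs for <x_m|x_n>,
    xy holds one (regressor, field) pair per <x_m|y> product.
    """
    xx = list(combinations_with_replacement(regressor_ids, 2))
    xy = [(reg, fld) for fld in field_ids for reg in regressor_ids]
    return xx, xy

# Identifiers taken from the user-configuration examples above.
regressors = ["diamlr_r%03d" % m for m in range(1, 8)]
fields = ["diamlr_f001", "diamlr_f002"]
xx_pairs, xy_pairs = required_scalar_products(regressors, fields)
```

For the seven regressors and two fields of the examples, this gives 28 regressor-regressor products and 14 regressor-field products, compared with the nine field definitions in the user configuration.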

The substitution of tidal-constituent parameters in regressor expressions would make use of recent developments of the tidal forcing mechanism (see ticket #2194), and therefore the development branch at source:/NEMO/branches/2019/dev_r10742_ENHANCE-12_SimonM-Tides/ would be merged into the development branch for the new linear regression analysis diagnostics.

In order to enable seamless analysis across restart boundaries, regressor expressions would need to be formulated as functions of a continuous model time that extends across model restarts. Therefore, the new implementation would send the time elapsed since the start of the simulation (available as variable `adatrj`) to XIOS as field `diamlr_time`.