ENHANCE05_SimonMHarmonic_Analysis
Last edition: Wikinfo(changed_ts)? by Wikinfo(changed_by)?
The PI is responsible to closely follow the progress of the action, and especially to contact NEMO project manager if the delay on preview (or review) are longer than the 2 weeks expected.
 Summary

Preview
 Current implementation of tidal harmonicanalysis diagnostics and available enhancements
 Proposed replacement of the tidal harmonicanalysis diagnostics by a generic formulation for multiple linear regression analysis
 The use of XIOS for regression diagnostics
 Configuration interface, internal XIOS configuration, and model time
 Finalisation of the regression analysis
 Proposed replacement of module dia25h
 Previewer's comments
 Reply to previewer's comments
 Tests
 Documentation
 Further development
 Review
Summary
The harmonicanalysis diagnostics available in the current reference code is limited to twodimensional fields (surface only), is activated via a preprocessor key, uses unconventional namelist parameter names, uses a mixture of dynamic and static allocation for large arrays, and appears to be computationally inefficient. Further, while being based on multiple linear regression, the current implementation does not provide for regressions on harmonic components other than tidal constituents.
This action will replace the current tidal harmonicanalysis diagnostics with a generic implementation for multiple linear regression analysis that can be utilised for both tidal harmonic and nontidal regression analyses. This implementation will provide harmonicanalysis diagnostics enhancements previously tested in a pre4.0beta NEMO version by N. Bruneau: the analysis of threedimensional fields, analysis across model restarts, and improved computational efficiency.
In contrast to both the existing harmonic analysis diagnostics in the reference NEMO code and the enhanced pre4.0beta version by N. Bruneau, the new implementation will make extensive use of XIOS and an offline tool. This approach should make it possible to simplify the regression analysisrelated Fortran module in the core NEMO code, to relocate the regressionanalysis configuration to XIOS configuration files, and to enable the selection of any model field handled by XIOS for analysis.
See ticket #2175.
Preview
Current implementation of tidal harmonicanalysis diagnostics and available enhancements
Tidal harmonic analysis of model fields based on multiple linear leastsquares regression is available in the current trunk version of NEMO in module diaharm (source:/NEMO/trunk/src/OCE/DIA/diaharm.F90@10835). It is restricted to the analysis of seasurface height and barotropic velocity fields, to a hardcoded maximum number of time slices within uninterrupted model runs, and its memory footprint during the whole analysis period can be very large. The current implementation has aspects that are deprecated (use of a preprocessor key, nonconventional namelist variable names, and statical allocation of potentially oversized arrays) and appears to be computationally inefficient. Further, it lacks soughtafter features, such as the analysis of a wider range of model fields (including threedimensional fields) and the possibility to span the analysis time period across restarted model runs, which have become available in more efficient alternative implementations of the same analysis method by N. Bruneau (pers. comm.) and E. O'Dea (https://forge.ipsl.jussieu.fr/nemo/browser/branches/UKMO/dev_5518_tide_analysis_restart) in previous NEMO versions. In addition, the harmonic analysis diagnostics of module diaharm remains deactivated, and thus its compilability untested, during the standard tests of the SETTE test suite.
Proposed replacement of the tidal harmonicanalysis diagnostics by a generic formulation for multiple linear regression analysis
Rather than adapting and upgrading an existing implementation of harmonicanalysis diagnostics (the current implementation in the trunk version or an existing alternative implementation) for the latest reference version of NEMO, the development of generic multiple linear leastsquares regression analysis diagnostics for NEMO as outlined below in the form of a new module diamlr (source:/NEMO/trunk/src/OCE/DIA/diamlr.F90), additional XIOS configuration, and an offline tool is proposed. This new diagnostics would readily allow for tidal harmonic analysis, and module diaharm (source:/NEMO/trunk/src/OCE/DIA/diaharm.F90) could be removed. In addition to tidal harmonic analysis the new development would facilitate nontidal applications, such as the identification of the seasonal cycle or linear trends in model fields.
The use of XIOS for regression diagnostics
In general, leastsquares linear regression can be formulated in terms of scalar products of all possible pairings between the dependent variable, y>, and the regressors, x_{m}> (the index identifies the regressor), where each vector component represents the corresponding value at each of the time steps included in the analysis. In particular, during the model run, it suffices to accumulate the scalar products <x_{m}y> and <x_{m}x_{n}> by summing up the respective products for each time step; at the end of the analysis interval, which does not need to be known in advance, the regression analysis can be finalised using precomputed scalar products.
The computation of the regressors formulated as functions of time, the computation of the scalar products, and the output of the scalar products in model runs with enabled linear regression analysis can be delegated to the I/O server XIOS. This would have three main benefits:
 as the fields of dependent variables selected for regression analysis are typically already available in XIOS processes for regular model output, the NEMO processes would no longer have to access the large fields selected for analysis directly;
 the potentially large additional amounts of temporary storage required for the computation of the scalar products would be provided by XIOS processes, and so the memory footprint of the NEMO processes would hardly be affected when enabling linear regression analysis; and
 output files of partial scalar products for partial model runs can readily be generated by XIOS and recombined later (through addition) in order to prepare sets of scalar products for various analysis intervals, including analysis intervals that span across model restarts.
Configuration interface, internal XIOS configuration, and model time
The user configuration of the multiple linear regression diagnostics (definition of regressors, selection of fields for analysis, and output frequency) is proposed to be made feasible entirely in the form of additional XIOS configuration.
A field definition for each regressor would define the arithmetic expression as a function of the model time (diamlr_time) and, if required, use placeholders for parameters of tidal constituents. As an example, a configuration section similar to
<field id="diamlr_time" grid_ref="diamlr_grid" /> <field id="diamlr_r001" field_ref="diamlr_time" expr="diamlr_time^0.0" enabled=".TRUE." /> <field id="diamlr_r002" field_ref="diamlr_time" expr="diamlr_time^1.0" enabled=".TRUE." /> <field id="diamlr_r003" field_ref="diamlr_time" expr="diamlr_time^2.0" enabled=".TRUE." /> <field id="diamlr_r004" field_ref="diamlr_time" expr="sin( 1.992384990861e7 * diamlr_time )" enabled=".TRUE." /> <field id="diamlr_r005" field_ref="diamlr_time" expr="cos( 1.992384990861e7 * diamlr_time )" enabled=".TRUE." /> <field id="diamlr_r006" field_ref="diamlr_time" expr="__TDE_M2_amplitude__ * sin( __TDE_M2_omega__ * diamlr_time + __TDE_M2_phase__ )" enabled=".TRUE." /> <field id="diamlr_r007" field_ref="diamlr_time" expr="__TDE_M2_amplitude__ * cos( __TDE_M2_omega__ * diamlr_time + __TDE_M2_phase__ )" enabled=".TRUE." />
would define seven regressors: three regressors required to fit a polynomial of degree 2; two orthogonal, sinusoidal regressors required to fit the seasonal cycle; and two orthogonal, sinusoidal regressors of the M2 tidalconstituent frequency with parameters provided by the tidalforcing implementation in NEMO through substitution of placeholders during model initialisation.
The fields selected for the analysis could be listed as dedicated fields that make reference to fields available for model output. As an example, a configuration similar to
<field id="diamlr_f001" field_ref="ssh" enabled=".TRUE." /> <field id="diamlr_f002" field_ref="toce" enabled=".TRUE." />
would select the seasurface height and potential temperature fields for regression analysis.
Overall, the frequency of intermediate data output would be set using a configuration line similar to
<file_group id="diamlr_files" output_freq="1y" enabled=".TRUE." />,
which would result in the output of annual scalar products.
In order to process and output the full set of scalar products required for the leastsquares linear regression analysis, however, a more complex XIOS configuration is required. This complex configuration of XIOS can be derived from the simple user configuration outlined above. Therefore, during the initialisation of the relevant XIOS context in NEMO, the regressionspecific user configuration can be read out and in turn the full configuration can be generated using the XIOS API (currently, XIOS contexts are defined and closed in subroutine iom_init of module iom, source:/NEMO/trunk/src/OCE/IOM/iom.F90@10817; the new implementation would modify module iom so that closing the XIOS context can optionally be deferred in order to enable other subroutines to add or modify the XIOS configuration).
The substitution of tidalconstituent parameters in regressor expressions would make use of recent developments of the tidal forcing mechanism (see ticket #2194), and therefore the development branch at source:/NEMO/branches/2019/dev_r10742_ENHANCE12_SimonMTides/ would be merged into the development branch for the new linear regression analysis diagnostics.
In order to enable seamless analysis across restart boundaries, regressor expressions would require to be formulated as a function of a continuous model time that extends across model restarts. Therefore, the new implementation would send the time from the start of the simulation (available as variable adatrj) to XIOS as field diamlr_time.
Finalisation of the regression analysis
The development of an external tool for the finalisation of the multiple linear regression analysis using a set of intermediate scalar products as described above is proposed. This new tool would be added at source:/utils/tools/DIAMLR_TOOLS/.
Proposed replacement of module dia25h
Related to tidal harmonic analysis is the approximative M2detiding of model output fields provided by module dia25h (source:/NEMO/trunk/src/OCE/DIA/dia25h.F90@10641). This functionality roughly approximates the tidal period of the semidiurnal tidal constituent M2 as 12.5 hours, it provides the detiding of a fixed set of fields, it requires relatively short time steps whose duration in seconds is an integer divisor of 3600 seconds, and its memory footprint is relatively large.
In order to address these issues, the removal of module dia25h and its replacement by a XIOSbased alternative mechanism implemented in a new module diam2detiding (source:/NEMO/trunk/src/OCE/DIA/diam2detiding.F90) is proposed. While the current implementation in module dia25h averages the centred values of 25 1hour intervals that are centred around the middle of each day (i.e., the values at the top of each hour including midnight at both ends of each 24hour averaging period), the revised mechanism would identify up to 29 intervals of equal duration that cover 24.842 hours, twice the period of the tidal constituent M2 of 12.421 hours (https://en.wikipedia.org/wiki/Earth_tide#Tidal_constituents), with centre points falling within one complete 24hour interval centred around midday (it can be shown that 29 is the largest number of periodic samples that achieve this). For each full 24hour period starting at midnight (not including midnight on one side of this period), a weight would be generated at each time step so that the sum of a model time series multiplied by these weights would compute the average of the centre points of the intervals described above. This time series of weights could be sent to XIOS; with this, XIOS could be configured to provide approximative M2detiding of any model field that is available for output.
Previewer's comments
The use of XIOS is particularly ingenious to perform both the tidal harmonic analysis and the detiding (i.e. the existing 25h moving average). The development addresses one of the main caveats of the existing harmonic analysis, i.e. the impossibility to perform analysis over more than two model restart time (though this could have been implemented in the online version too).
 Since it can be tedious for some users to step into XIOS logic, I recommend to pay attention to the documentation of the new tidal harmonic method and give practical, simple, examples. The AMM12 reference configuration should embark the new method as a guideline.
 I suggest that the existing 25 h average being kept in the code, at the expense of duplicating two quasiidentical features for a while. The reason is that online detiding may be soon useful to implement an Internal Wave drag parameterization in the code (this parameterization should work only on the tidal component of the flow). Again practical example should be given in AMM12.
 Would it be possible to use the results of an existing tidal analysis to perform the detiding through XIOS ? That could be an even more efficient alternative to the 25 h filtering. I'm afraid that harmonics would have to be read in NEMO ?
Reply to previewer's comments
As recommended by the previewer, the proposed multiple linear regression analysis facility is proposed to be well documented in the NEMO manual, supported by examples. Fruther, much of the XIOS configuration required for tidal harmonic analysis can be included in the default configuration file source:/NEMO/trunk/cfgs/SHARED/field_def_nemoice.xml, so that the additional XIOS configuration required to enable output of the intermediate data remains relatively simple. As suggested, an example analysis is proposed to be enabled in the reference configuration AMM12; this has the additional benefit of routine testing of a part of the proposed analysis implementation as part of the SETTE test suite.
As suggested by the previewer, the existing module dia25h will not be removed as part of this action. The existing and proposed new detiding mechanisms are independent of each other, therefore the new variant can be implemented in a new module as planned. Further, the previewer alludes to potential alternative developments relating to detiding. While alternative detiding mechanisms can be explored outside the scope of the present workplan action, the new module could potentially provide an ideal place to implement such developments in the future. Therefore, the new module is proposed to be called less specifically diadetiding rather than diam2detiding.
I would like to thank the previewer for previewing the proposed new implementations for tidal diagnostics in NEMO, as well as for useful recommendations and suggestions.
Tests
Development branch after synchronisation with NEMO/trunk@12072 and before the merge with wiki:2019WP/ENHANCE12_SimonMTides developments
SETTE results for NEMO/branches/2019/dev_r11879_ENHANCE05_SimonMHarmonic_Analysis@12097:
restart  repro  agrif check  NEMO/trunk@12072  

GYRE_PISCES  OK  OK    identical 
ORCA2_ICE_PISCES  OK  OK    identical 
ORCA2_OFF_PISCES  OK  OK    identical 
AMM12  OK  OK    identical 
ORCA2_SAS_ICE  OK  OK    identical 
AGRIF_DEMO  OK  OK  OK  identical 
SPITZ12  OK  OK    identical 
ISOMIP  OK  OK    identical 
OVERFLOW  OK  OK     
LOCK EXCHANGE  OK       
VORTEX  OK  OK    identical 
ICE AGRIF  OK  OK    identical 
Development branch after synchronisation with NEMO/trunk@12072 and after the merge with wiki:2019WP/ENHANCE12_SimonMTides developments
SETTE results for NEMO/branches/2019/dev_r11879_ENHANCE05_SimonMHarmonic_Analysis@12122:
restart  repro  agrif check  comparison with NEMO/branches/2019/dev_r10742_ENHANCE12_SimonMTides@12096  NEMO/trunk@12072  

GYRE_PISCES  OK  OK    identical  identical 
ORCA2_ICE_PISCES  OK  OK    identical  identical 
ORCA2_OFF_PISCES  OK  OK    identical  identical 
AMM12  OK  OK    identical  DIFFERENT* 
ORCA2_SAS_ICE  OK  OK    identical  identical 
AGRIF_DEMO  OK  OK  OK  identical  identical 
SPITZ12  OK  OK    identical  DIFFERENT* 
ISOMIP  OK  OK    identical  identical 
OVERFLOW  OK  OK       
LOCK EXCHANGE  OK         
VORTEX  OK  OK    identical  identical 
ICE AGRIF  OK  OK    identical  identical 
*) The differences between the development and trunk variants of both the AMM12 and SPITZ12 test output are an expected result of the ENHANCE12_SimonMTides development.
Documentation
The documentation of the generic formulation for multiple linear regression analysis proposed in the preview is planned to be made as part of the 2020WP action 2020WP/ENHANCE11_smueller_DIAMLR.
Further development
Further development of the tool for the finalisation of the regression analysis (source:/utils/tools/DIAMLR) is planned to be made as part of the 2020WP action 2020WP/ENHANCE11_smueller_DIAMLR.