# Changeset 11316 for NEMO/trunk/doc/latex/NEMO/subfiles/chap_OBS.tex

Timestamp: 2019-07-19T19:20:02+02:00
Message: #2297 Updates to OBS and ASM documentation
\label{chap:OBS}

Authors: D. Lea, M. Martin, K. Mogensen, A. Vidard, A. Weaver, A. Ryan, ...   % do we keep that ?

\minitoc

\vfill
\begin{figure}[b]
\subsubsection*{Changes record}
\begin{tabular}{l||l|m{0.65\linewidth}}
Release   & Author        & Modifications \\
{\em 4.0} & {\em D. J. Lea} & {\em \NEMO 4.0 updates}  \\
{\em 3.6} & {\em M. Martin, A. Ryan} & {\em Add averaging operator, standalone obs oper} \\
{\em 3.4} & {\em D. J. Lea, M. Martin, ...} & {\em Initial version}  \\
{\em --\texttt{"}--} & {\em ... K. Mogensen, A. Vidard, A. Weaver} & {\em ---\texttt{"}---}  \\
\end{tabular}
\end{figure}

\newpage

The observation and model comparison code, the observation operator (OBS), reads in observation files (profile temperature and salinity, sea surface temperature, sea level anomaly, sea ice concentration, and velocity) and calculates an interpolated model equivalent value at the observation location and nearest model time step. The resulting data are saved in a ``feedback'' file (or files). The code was originally developed for use with the NEMOVAR data assimilation code, but can be used for validation or verification of the model or with any other data assimilation system. The OBS code is called from \mdl{nemogcm} for model initialisation and to calculate the model equivalent values for observations on the 0th time step.
The code is then called again after each time step from \mdl{step}. The code is only activated if the \ngn{namobs} namelist logical \np{ln\_diaobs} is set to true.

For all data types a 2D horizontal interpolator or averager is needed to interpolate or average the model fields to the observation locations. For {\em in situ} profiles, a 1D vertical interpolator is needed in addition to provide model fields at the observation depths. This now works in a generalised vertical coordinate system.

Some profile observation types (\eg tropical moored buoys) are made available as daily averaged quantities. The observation operator code can calculate equivalent night-time average model SST fields by setting the namelist value \np{ln\_sstnight} to true. Otherwise (by default) the model value from the nearest time step to the observation time is used.

The code is controlled by the namelist \ngn{namobs}. See the following sections for more details on setting up the namelist.

In \autoref{sec:OBS_example} a test example of the observation operator code is introduced, including where to obtain data and how to set up the namelist. In \autoref{sec:OBS_details} some more technical details of the different observation types used are introduced, and we also show a more complete namelist. In \autoref{sec:OBS_theory} some of the theoretical aspects of the observation operator are described, including interpolation methods and running on multiple processors.
In \autoref{sec:OBS_sao} the standalone observation operator code is described. In \autoref{sec:OBS_obsutils} we describe some utilities to help work with the files produced by the OBS code.

% ================================================================
\label{sec:OBS_example}

In this section an example of running the observation operator code is described, using profile observation data which can be freely downloaded. It shows how to adapt an existing run and build of \NEMO to run the observation operator. Note also that the observation operator and the assimilation increments code are run in the \np{ORCA2\_ICE\_OBS} SETTE test.

\begin{enumerate}
\item Download some EN4 data from \href{http://www.metoffice.gov.uk/hadobs}{www.metoffice.gov.uk/hadobs}. Choose observations which are valid for the period of your test run, because the observation operator compares the model and observations for a matching date and time.
\item Compile the OBSTOOLS code in the \np{tools} directory using:
\begin{cmds}
./maketools -n OBSTOOLS -m [ARCH]
\end{cmds}
replacing \np{[ARCH]} with the build architecture file for your machine. Note the tools are checked out from a separate repository under \np{utils/tools}.
\item Convert the EN4 data into feedback format:
\begin{cmds}
enact2fb.exe profiles_01.nc EN.4.1.1.f.profiles.g10.YYYYMM.nc
\end{cmds}
\item Include the following in the \NEMO namelist to run the observation operator on this data:
\end{enumerate}

This can be expensive, particularly for large numbers of observations; setting \np{ln\_grid\_search\_lookup} allows the use of a lookup table, which is saved into a \np{cn\_gridsearch} file (or files). This will need to be generated the first time if it does not exist in the run directory. However, once produced it will significantly speed up future grid searches. Setting \np{ln\_grid\_global} means that the code distributes the observations evenly between processors. Alternatively, each processor will work with observations located within the model subdomain (see \autoref{subsec:OBS_parallel}).

A number of utilities are now provided to plot the feedback files, and to convert and recombine the files. These are explained in more detail in \autoref{sec:OBS_obsutils}. Utilities to convert other input data formats into the feedback format are also described in \autoref{sec:OBS_obsutils}.

\section{Technical details (feedback type observation file headers)}
%-------------------------------------------------------------------------------------------------------------
The observation operator code uses the feedback observation file format for all data types. All the observation files must be in NetCDF format.
Some example headers (produced using \mbox{\textit{ncdump~-h}}) for profile data, sea level anomaly and sea surface temperature are in the following subsections.

\subsection{Profile feedback file}
\begin{clines}
\end{clines}

\subsection{Sea level anomaly feedback file}
\begin{clines}
\end{clines}

To use Sea Level Anomaly (SLA) data, the mean dynamic topography (MDT) must be provided in a separate file defined on the model grid, called \ifile{slaReferenceLevel}. The MDT is required in order to produce the model equivalent sea level anomaly from the model sea surface height.
\end{clines}

\subsection{Sea surface temperature feedback file}
\begin{clines}

In those cases the model counterpart should be calculated by averaging the model grid points over the same size as the footprint. \NEMO therefore has the capability to specify either an interpolation or an averaging (for surface observation types only). The main namelist option associated with the interpolation/averaging is \np{nn\_2dint}.
\item \np{nn\_2dint}\forcode{ = 4}: Polynomial interpolation
\item \np{nn\_2dint}\forcode{ = 5}: Radial footprint averaging with diameter specified in the namelist as \np{rn\_[var]\_avglamscl} in degrees or metres (set using \np{ln\_[var]\_fp\_indegs})
\item \np{nn\_2dint}\forcode{ = 6}: Rectangular footprint averaging with E/W and N/S size specified in the namelist as \np{rn\_[var]\_avglamscl} and \np{rn\_[var]\_avgphiscl} in degrees or metres (set using \np{ln\_[var]\_fp\_indegs})
\end{itemize}

Replace \np{[var]} in the last two options with the observation type (sla, sst, sss or sic) for which the averaging is to be performed (see namelist example above). The \np{nn\_2dint} default option can be overridden for surface observation types using namelist values \np{nn\_2dint\_[var]}, where \np{[var]} is the observation type. Below is some more detail on the various options for interpolation and averaging available in \NEMO.

\subsubsection{Horizontal interpolation}

Consider an observation point ${\mathrm P}$ with longitude and latitude (${\lambda_{}}_{\mathrm P}$, $\phi_{\mathrm P}$) and the four nearest neighbouring model grid points ${\mathrm A}$, ${\mathrm B}$, ${\mathrm C}$ and ${\mathrm D}$ with longitude and latitude ($\lambda_{\mathrm A}$, $\phi_{\mathrm A}$), ($\lambda_{\mathrm B}$, $\phi_{\mathrm B}$) etc.
All horizontal interpolation methods implemented in \NEMO estimate the value of a model variable $x$ at point $P$ as a weighted linear combination of the values of the model variables at the grid points ${\mathrm A}$, ${\mathrm B}$ etc.:
\begin{align*}
  {x_{}}_{\mathrm P} = \frac{1}{w}
  \left( {w_{}}_{\mathrm A} {x_{}}_{\mathrm A} + {w_{}}_{\mathrm B} {x_{}}_{\mathrm B}
  + {w_{}}_{\mathrm C} {x_{}}_{\mathrm C} + {w_{}}_{\mathrm D} {x_{}}_{\mathrm D} \right)
\end{align*}
where ${w_{}}_{\mathrm A}$, ${w_{}}_{\mathrm B}$ etc. are the respective weights for the model field at points ${\mathrm A}$, ${\mathrm B}$ etc., and $w = {w_{}}_{\mathrm A} + {w_{}}_{\mathrm B} + {w_{}}_{\mathrm C} + {w_{}}_{\mathrm D}$.

For example, the weight given to the field ${x_{}}_{\mathrm A}$ is specified as the product of the distances from ${\mathrm P}$ to the other points:
\begin{alignat*}{2}
  {w_{}}_{\mathrm A} = s({\mathrm P}, {\mathrm B}) \, s({\mathrm P}, {\mathrm C}) \, s({\mathrm P}, {\mathrm D})
\end{alignat*}
where
\begin{alignat*}{2}
  s\left({\mathrm P}, {\mathrm M} \right) & = & \hspace{0.25em} \cos^{-1} \! \left\{
  \sin {\phi_{}}_{\mathrm P} \sin {\phi_{}}_{\mathrm M}
  + \cos {\phi_{}}_{\mathrm P} \cos {\phi_{}}_{\mathrm M} \cos ({\lambda_{}}_{\mathrm M} - {\lambda_{}}_{\mathrm P})
  \right\}
\end{alignat*}
and $M$ corresponds to $B$, $C$ or $D$.
A more stable form of the great-circle distance formula for small distances ($x$ near 1) involves the arcsine function (\eg see p.~101 of \citet{daley.barker_bk01}):
\begin{alignat*}{2}
  s\left( {\mathrm P}, {\mathrm M} \right) = \sin^{-1} \! \left\{ \sqrt{ 1 - x^2 } \right\}
\end{alignat*}
where
\begin{alignat*}{2}
  x = {a_{}}_{\mathrm M} {a_{}}_{\mathrm P} + {b_{}}_{\mathrm M} {b_{}}_{\mathrm P} + {c_{}}_{\mathrm M} {c_{}}_{\mathrm P}
\end{alignat*}
and
\begin{alignat*}{3}
  & {a_{}}_{\mathrm M} & = && \quad \sin {\phi_{}}_{\mathrm M}, \\
  & {a_{}}_{\mathrm P} & = && \quad \sin {\phi_{}}_{\mathrm P}, \\
  & {b_{}}_{\mathrm M} & = && \quad \cos {\phi_{}}_{\mathrm M} \cos {\lambda_{}}_{\mathrm M}, \\
  & {b_{}}_{\mathrm P} & = && \quad \cos {\phi_{}}_{\mathrm P} \cos {\lambda_{}}_{\mathrm P}, \\
  & {c_{}}_{\mathrm M} & = && \quad \cos {\phi_{}}_{\mathrm M} \sin {\lambda_{}}_{\mathrm M}, \\
  & {c_{}}_{\mathrm P} & = && \quad \cos {\phi_{}}_{\mathrm P} \sin {\lambda_{}}_{\mathrm P}.
\end{alignat*}
\item[2.]
{\bfseries Great-circle distance-weighted interpolation with small angle approximation.} Similar to the previous interpolation but with the distance $s$ computed as
\begin{alignat*}{2}
  s\left( {\mathrm P}, {\mathrm M} \right) & = & \sqrt{ \left( {\phi_{}}_{\mathrm M} - {\phi_{}}_{\mathrm P} \right)^{2}
  + \left( {\lambda_{}}_{\mathrm M} - {\lambda_{}}_{\mathrm P} \right)^{2} \cos^{2} {\phi_{}}_{\mathrm M} }
\end{alignat*}
where $M$ corresponds to $A$, $B$, $C$ or $D$.

a cell with coordinates (0,0), (1,0), (0,1) and (1,1). This method is based on the \href{https://github.com/SCRIP-Project/SCRIP}{SCRIP interpolation package}.
\end{enumerate}

\item The standard grid-searching code is used to find the nearest model grid point to the observation location (see next subsection).
\item The maximum number of grid points required for that observation in each local grid domain is calculated. Some of these points may later turn out to have zero weight, depending on the shape of the footprint.
\item The longitudes and latitudes of the grid points surrounding the nearest model grid box are extracted using existing MPI routines.
\item The weights for each grid point associated with each observation are calculated, either for radial or rectangular footprints. Examples of the weights calculated for an observation with rectangular and radial footprints are shown in \autoref{fig:obsavgrec} and~\autoref{fig:obsavgrad}.
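The great-circle distance-weighted interpolation described earlier in this section can be illustrated with a short sketch (plain Python rather than the actual \NEMO Fortran; the function names are ours, and the arccos form of $s$ is used, so it is a conceptual illustration only, not the model implementation):

```python
import math

def gc_distance(lon_p, lat_p, lon_m, lat_m):
    """Great-circle angular distance s(P, M) in radians (arccos form)."""
    lam_p, phi_p = math.radians(lon_p), math.radians(lat_p)
    lam_m, phi_m = math.radians(lon_m), math.radians(lat_m)
    c = (math.sin(phi_p) * math.sin(phi_m)
         + math.cos(phi_p) * math.cos(phi_m) * math.cos(lam_m - lam_p))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def gc_weights(obs, corners):
    """Weight of each corner = product of distances from P to the OTHER corners."""
    weights = []
    for i in range(len(corners)):
        w = 1.0
        for j, (lon_m, lat_m) in enumerate(corners):
            if j != i:
                w *= gc_distance(obs[0], obs[1], lon_m, lat_m)
        weights.append(w)
    return weights

def interpolate(obs, corners, values):
    """Weighted linear combination x_P = (1/w) * sum(w_i * x_i)."""
    w = gc_weights(obs, corners)
    return sum(wi * xi for wi, xi in zip(w, values)) / sum(w)
```

Note that an observation coinciding with a corner recovers that corner's value exactly, because every other corner's weight contains a zero distance factor.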
%>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Weights associated with each model grid box (blue lines and numbers) for an observation at -170.5\deg{E}, 56.0\deg{N} with a radial footprint with diameter 1\deg.
}
}
\end{center}
\end{figure}

\subsection{Grid search}

For many grids used by the \NEMO model, such as the ORCA family, the horizontal grid coordinates $i$ and $j$ are not simple functions of latitude and longitude. Therefore, it is not always straightforward to determine the grid points surrounding any given observational position. Before the interpolation can be performed, a search algorithm is then required to determine the corner points of the quadrilateral cell in which the observation is located. This is the most difficult and time-consuming part of the 2D interpolation procedure. A robust test for determining if an observation falls within a given quadrilateral cell is as follows. Let ${\mathrm P}({\lambda_{}}_{\mathrm P} ,{\phi_{}}_{\mathrm P} )$ denote the observation point, and let ${\mathrm A}({\lambda_{}}_{\mathrm A} ,{\phi_{}}_{\mathrm A} )$, ${\mathrm B}({\lambda_{}}_{\mathrm B} ,{\phi_{}}_{\mathrm B} )$, ${\mathrm C}({\lambda_{}}_{\mathrm C} ,{\phi_{}}_{\mathrm C} )$ and ${\mathrm D}({\lambda_{}}_{\mathrm D} ,{\phi_{}}_{\mathrm D} )$ denote the bottom left, bottom right, top left and top right corner points of the cell, respectively.
To determine if P is inside the cell, we verify that the cross-products
\begin{align*}
  \begin{array}{llll}
    ({\mathrm B} - {\mathrm A}) \times ({\mathrm P} - {\mathrm A}), \quad &
    ({\mathrm D} - {\mathrm B}) \times ({\mathrm P} - {\mathrm B}), \quad &
    ({\mathrm C} - {\mathrm D}) \times ({\mathrm P} - {\mathrm D}), \quad &
    ({\mathrm A} - {\mathrm C}) \times ({\mathrm P} - {\mathrm C})
  \end{array}
\end{align*}
all have the same sign, \ie that P lies on the same side of every edge as the cell boundary is traversed in a fixed direction. To speed up this search, the approximate locations of the model grid cells can first be stored in a lookup table to be searched for on a regular grid. For each observation position, the closest point on the regular grid to this position is computed, and the $i$ and $j$ ranges of this point searched to determine the precise four points surrounding the observation.

\subsection{Parallel aspects of horizontal interpolation}

For horizontal interpolation, there is the basic problem that the observations are unevenly distributed on the globe. In \NEMO the model grid is divided into subgrids (or domains), where each subgrid is executed on a single processing element with explicit message passing for exchange of information along the domain boundaries when running on a massively parallel processor (MPP) system. For observations there is no natural distribution, since the observations are not equally distributed on the globe. Two options have been made available: 1) geographical distribution, in which each processor works on the observations located within the domain of the grid-point parallelization; and 2) an even distribution of the observations among processors (\np{ln\_grid\_global}). \autoref{fig:obslocal} shows an example of the distribution of the {\em in situ} data on processors, with a different colour for each observation on a given processor, for a 4 $\times$ 2 decomposition with ORCA2. The grid-point domain decomposition is clearly visible on the plot. The advantage of this approach is that all information needed for horizontal interpolation is available without any MPP communication.
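The inside-cell test from the grid-search subsection above can be sketched in planar form (an illustrative Python version with names of our choosing, not the \NEMO implementation; the corners must be supplied in order around the cell boundary, \eg A, B, D, C):

```python
def cross_z(a, b, p):
    """z-component of (b - a) x (p - a); positive if p is left of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def in_cell(p, cell):
    """True if p lies inside (or on the boundary of) the quadrilateral `cell`,
    whose corners are listed in boundary order.  p is inside when the
    cross-products for all four edges have the same sign."""
    n = len(cell)
    signs = [cross_z(cell[i], cell[(i + 1) % n], p) for i in range(n)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)
```

Accepting either all-positive or all-negative signs makes the test independent of whether the boundary is traversed clockwise or anticlockwise.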
This is under the assumption that we are dealing with point observations and only using a $2 \times 2$ grid-point stencil for the interpolation (\eg bilinear interpolation). For higher order interpolation schemes this is no longer valid. At the bottom boundary, this is done using the land-ocean mask.

For profile observation types we do both vertical and horizontal interpolation. \NEMO has a generalised vertical coordinate system, which means the vertical level depths can vary with location. Therefore, it is necessary first to perform vertical interpolation of the model value to the observation depths for each of the four surrounding grid points. After this, the model values at these points at the observation depth are horizontally interpolated to the observation location.

\newpage

% ================================================================
% Standalone observation operator documentation
% ================================================================
\section{Standalone observation operator}
\label{sec:OBS_sao}

\subsection{Concept}

The observation operator maps model variables to observation space. This is normally done while the model is running, \ie
online; however, it is possible to apply this mapping offline, without running the model, using the \textbf{standalone observation operator} (SAO). The process is divided into an initialisation phase, an interpolation phase and an output phase. During the interpolation phase the SAO populates the model arrays by reading saved model fields from disk. The interpolation and the output phases use the same OBS code described in the preceding sections.

There are two ways of exploiting the standalone capacity. The first is to mimic the behaviour of the online system by supplying model fields at regular intervals between the start and the end of the run. This approach results in a single model counterpart per observation. This kind of usage produces feedback files in the same file format as the online observation operator. The second is to take advantage of the ability to run offline by calculating multiple model counterparts for each observation. In this case it is possible to consider all forecasts verifying at the same time. By forecast, we mean any method which produces an estimate of physical reality which is not an observed value. In the case of class 4 files this means forecasts, analyses, persisted analyses and climatological values verifying at the same time. Although the class 4 file format does not account for multiple ensemble members or multiple experiments per observation, it is possible to include these components in the same or multiple files.
%--------------------------------------------------------------------------------------------------------
% sao.exe
%--------------------------------------------------------------------------------------------------------

\subsection{Using the standalone observation operator}

\subsubsection{Building}

In addition to \emph{OPA\_SRC} the SAO requires the inclusion of the \emph{SAO\_SRC} directory. \emph{SAO\_SRC} contains a replacement \mdl{nemo} and \mdl{nemogcm} which overwrite the resultant \textbf{nemo.exe}. Note this is a similar approach to that taken by the standalone surface scheme \emph{SAS\_SRC} and the offline TOP model \emph{OFF\_SRC}.

%--------------------------------------------------------------------------------------------------------
% Running
%--------------------------------------------------------------------------------------------------------

\subsubsection{Running}

The simplest way to use the executable is to edit and append the \textbf{sao.nml} namelist to a full \NEMO namelist and then to run the executable as if it were nemo.exe.
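For example (a sketch only; the model namelist file name, here \textbf{namelist\_cfg}, depends on your run directory setup):

\begin{cmds}
# append the SAO settings to the full model namelist
cat sao.nml >> namelist_cfg
# run the SAO executable in place of the model
./nemo.exe
\end{cmds}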
%--------------------------------------------------------------------------------------------------------
% Configuration section
%--------------------------------------------------------------------------------------------------------

\subsection{Configuring the standalone observation operator}

The observation files and settings understood by \ngn{namobs} have been outlined in the online observation operator section. In addition there is a further namelist, \ngn{namsao}, which is used to set the input model fields for the SAO.

\subsubsection{Single field}

In the SAO the model arrays are populated at appropriate time steps via input files. At present, \textbf{tsn} and \textbf{sshn} are populated by the default read routines. These routines will be expanded upon in future versions to allow the specification of any model variable. As such, input files must be global versions of the model domain with \textbf{votemper}, \textbf{vosaline} and optionally \textbf{sshn} present.

For each field read there must be an entry in the \ngn{namsao} namelist specifying the name of the file to read and the index along the \emph{time\_counter}. For example, to read the second time counter from a single file, the namelist would be:

\begin{forlines}
!----------------------------------------------------------------------
!       
namsao Standalone obs_oper namelist
!----------------------------------------------------------------------
!   sao_files    specifies the files containing the model counterpart
!   nn_sao_idx   specifies the time_counter index within the model file
&namsao
   sao_files = "foo.nc"
   nn_sao_idx = 2
/
\end{forlines}

\subsubsection{Multiple fields per run}

Model field iteration is controlled via \textbf{nn\_sao\_freq}, which specifies the number of model steps at which the next field gets read. For example, if 12 hourly fields are to be interpolated in a setup where 288 steps equals 24 hours:

\begin{forlines}
!----------------------------------------------------------------------
!       namsao Standalone obs_oper namelist
!----------------------------------------------------------------------
!   sao_files    specifies the files containing the model counterpart
!   nn_sao_idx   specifies the time_counter index within the model file
!   
nn_sao_freq  specifies number of time steps between read operations
&namsao
   sao_files = "foo.nc" "foo.nc"
   nn_sao_idx = 1 2
   nn_sao_freq = 144
/
\end{forlines}

A collection of fields taken from a number of files at different indices can be combined at a particular frequency in time to generate a pseudo model evolution. As long as all that is needed is a single model counterpart at a regular interval, then \ngn{namsao} is all that needs to be edited. However, a far more interesting approach can be taken in which multiple forecasts, analyses, persisted analyses and climatologies are considered against the same set of observations. For this a slightly more complicated approach is needed. It is referred to as \emph{Class 4}, since it is the fourth metric defined by the GODAE intercomparison project.

%--------------------------------------------------------------------------------------------------------
% Class 4 file section
%--------------------------------------------------------------------------------------------------------

\subsubsection{Multiple model counterparts per observation a.k.a. Class 4}

Class 4 files are a generalisation of feedback files allowing multiple model counterparts per observation. For a single observation, as well as previous forecasts verifying at the same time, there are also analyses, persisted analyses and climatologies.

The above namelist performs two basic functions. It organises the fields given in \textbf{namooo} into groups so that observations can be matched up multiple times. It also controls the metadata and the output variable of the class 4 file when a write routine is called.

\textbf{Note: ln\_cl4} must be set to \forcode{.true.} in \textbf{namobs} to use class 4 outputs.

\subsubsection{Class 4 naming convention}

The standard class 4 file naming convention is as follows.
\noindent \linebreak
\textbf{\$\{prefix\}\_\$\{yyyymmdd\}\_\$\{sys\}\_\$\{cfg\}\_\$\{vn\}\_\$\{kind\}\_\$\{nproc\}}.nc
\noindent \linebreak
Much of the namelist is devoted to specifying this convention. The following namelist settings control the elements of the output file names. Each should be specified as a single string of character data.

\begin{description}
\item[cl4\_prefix] Prefix for class 4 files, \eg class4
\item[cl4\_date] YYYYMMDD validity date
\item[cl4\_sys] The name of the class 4 model system, \eg FOAM
\item[cl4\_cfg] The name of the class 4 model configuration, \eg orca025
\item[cl4\_vn] The name of the class 4 model version, \eg 12.0
\end{description}

\noindent The kind is specified by the observation type internally to the obs oper. The processor number is specified internally in \NEMO.

\subsubsection{Class 4 file global attributes}

These are the global attributes necessary to fulfil the class 4 file definition. They are also useful pieces of information when collaborating with external partners.

\begin{description}
\item[cl4\_contact] Contact email for class 4 files.
\item[cl4\_inst] The name of the producer's institution.
\item[cl4\_cfg] The name of the class 4 model configuration, \eg orca025
\item[cl4\_vn] The name of the class 4 model version, \eg 12.0
\end{description}

\noindent The obs\_type, creation date and validity time are specified internally to the obs oper.

\subsubsection{Class 4 model counterpart configuration}

As seen previously, it is possible to perform a single sweep of the obs oper and specify a collection of model fields equally spaced along that sweep. In the class 4 case the single sweep is replaced with multiple sweeps, and a certain amount of bookkeeping is needed to ensure each model counterpart makes its way to the correct piece of memory in the output files.

\noindent \linebreak
In terms of bookkeeping, the SAO needs to know how many full sweeps need to be performed.
This is specified via the \textbf{cl4\_match\_len} variable, which gives
the total number of model counterparts per observation.
For example, 3 forecasts plus 3 persistence fields plus an analysis field
would give 7 counterparts per observation.

\begin{forlines}
cl4_match_len = 7
\end{forlines}

Then, to correctly allocate a class 4 file, the forecast axis must be defined.
This is controlled via \textbf{cl4\_fcst\_len}, which in our above example would be 3.

\begin{forlines}
cl4_fcst_len = 3
\end{forlines}

Then, for each model field, it is necessary to designate the class 4 variable and
the index along the forecast dimension in which the model counterpart should be stored in the output file,
as well as a value for that lead time in hours, which will be useful when interpreting the data afterwards.

\begin{forlines}
cl4_vars = "forecast" "forecast" "forecast" "persistence" "persistence"
           "persistence" "best_estimate"
cl4_fcst_idx = 1 2 3 1 2 3 1
cl4_leadtime = 12 36 60
\end{forlines}

In terms of files and indices of fields inside each file,
the class 4 approach makes use of the \textbf{namooo} namelist.
If our fields are in separate files with a single field per file,
our example inputs will be specified as follows.

\begin{forlines}
ooo_files = "F.1.nc" "F.2.nc" "F.3.nc" "P.1.nc" "P.2.nc" "P.3.nc" "A.1.nc"
nn_ooo_idx = 1 1 1 1 1 1 1
\end{forlines}

When we combine all of the naming conventions, global attributes and
i/o instructions, the class 4 namelist becomes:

\begin{forlines}
!----------------------------------------------------------------------
!       namooo Offline obs_oper namelist
!----------------------------------------------------------------------
!   ooo_files    specifies the files containing the model counterpart
!   nn_ooo_idx   specifies the time_counter index within the model file
!   nn_ooo_freq  specifies the number of time steps between read operations
&namooo
   ooo_files = "F.1.nc" "F.2.nc" "F.3.nc" "P.1.nc" "P.2.nc" "P.3.nc" "A.1.nc"
   nn_ooo_idx = 1 1 1 1 1 1 1
/
!----------------------------------------------------------------------
!       namcl4 Offline obs_oper class 4 namelist
!----------------------------------------------------------------------
!
!  Naming convention
!  -----------------
!  cl4_prefix    specifies the output file prefix
!  cl4_date      specifies the output file validity date
!  cl4_sys       specifies the model counterpart system
!  cl4_cfg       specifies the model counterpart configuration
!  cl4_vn        specifies the model counterpart version
!  cl4_inst      specifies the model counterpart institute
!  cl4_contact   specifies the file producer's contact details
!
!  I/O specification
!  -----------------
!  cl4_vars      specifies the names of the output file netcdf variables
!  cl4_fcst_idx  specifies the output file forecast index
!  cl4_fcst_len  specifies the forecast axis length
!  cl4_match_len specifies the number of unique matches per observation
!  cl4_leadtime  specifies the forecast axis lead time
!
&namcl4
   cl4_match_len = 7
   cl4_fcst_len = 3
   cl4_fcst_idx = 1 2 3 1 2 3 1
   cl4_vars = "forecast" "forecast" "forecast" "persistence" "persistence"
              "persistence" "best_estimate"
   cl4_leadtime = 12 36 60
   cl4_prefix = "class4"
   cl4_date = "20130101"
   cl4_vn = "12.0"
   cl4_sys = "FOAM"
   cl4_cfg = "AMM7"
   cl4_contact = "example@example.com"
   cl4_inst = "UK Met Office"
/
\end{forlines}

\subsubsection{Climatology interpolation}

The climatological counterpart is generated at the start of the run by
restarting the model from climatology through appropriate use of \textbf{namtsd}.
To override the offline observation operator read routine and
to take advantage of the restart settings,
specify the first entry in \textbf{cl4\_vars} as "climatology".
This will then pipe the restart from climatology into the output class 4 file.
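To make the naming convention concrete, the sketch below shows how the \textbf{namcl4} fields in the example namelist above combine into an output file name. The helper function is purely illustrative and is not part of \NEMO; the kind ("profile" here) and the processor number are chosen for the example, since both are set internally, and the zero-padding of the processor number is an assumption.

```python
# Illustrative only: assemble a class 4 file name following the pattern
#   ${prefix}_${yyyymmdd}_${sys}_${cfg}_${vn}_${kind}_${nproc}.nc
# This helper is NOT part of NEMO; kind and nproc are normally set
# internally by the obs oper and NEMO, and the two-digit padding of
# nproc is assumed here for the example.
def class4_file_name(prefix, date, sys, cfg, vn, kind, nproc):
    return "%s_%s_%s_%s_%s_%s_%02d.nc" % (prefix, date, sys, cfg, vn, kind, nproc)

# Using the values from the example namelist above:
print(class4_file_name("class4", "20130101", "FOAM", "AMM7", "12.0", "profile", 1))
```

With the example settings this yields a name of the form class4_20130101_FOAM_AMM7_12.0_profile_01.nc, one file per kind and per processor.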
As in every other class 4 matchup, the input file, input index and output index must be specified.
These can be replaced with dummy data, since they are not used,
but they must be present to cycle through the matchups correctly.

\subsection{Advanced usage}

In certain cases it may be desirable to combine multiple model fields per observation window
with multiple matchups per observation.
This can be achieved by specifying \textbf{nn\_ooo\_freq} as well as the class 4 settings.
Care must be taken in generating the ooo\_files list such that
the files are arranged into consecutive blocks of single matchups.
For example, 2 forecast fields of 12 hourly data would result in
4 separate read operations but only 2 write operations, 1 per forecast.

\begin{forlines}
ooo_files = "F1.nc" "F1.nc" "F2.nc" "F2.nc" ...
cl4_fcst_idx = 1 2
\end{forlines}

The above notation reveals the internal split between matchup iterators and file iterators.
This technique has not been used before, so experimentation is needed before results can be trusted.
Note that an alternative route to class 4 files is to run the SAO several times,
comparing multiple forecasts, analyses, persisted analyses and climatologies with
the same set of observations, and then to combine the resulting feedback files into
one class 4 file using an additional utility (not currently in the \NEMO repository).

\newpage

\label{sec:OBS_obsutils}
For convenience some tools for viewing and processing of observation and feedback files are
provided in the \NEMO repository.
These tools include OBSTOOLS, a collection of \fortran programs which are helpful in dealing with feedback files.
They perform such tasks as observation file conversion, printing of file contents and
some basic statistical analysis of feedback files.
The other main tool is an IDL program called dataplot, which uses a graphical interface to
visualise observations and feedback files.
OBSTOOLS and dataplot are described in more detail below.

\subsection{Obstools}

A series of \fortran utilities called OBSTOOLS is provided with \NEMO.
These are helpful in handling observation files and
the feedback file output from the observation operator.
A brief description of some of the utilities follows.

\subsubsection{corio2fb}

The program corio2fb converts profile observation files from the Coriolis format to
the standard feedback format.
It is called in the following way:

\begin{cmds}
corio2fb.exe outputfile inputfile1 inputfile2 ...

The program enact2fb converts profile observation files from the ENACT format to
the standard feedback format.
It is called in the following way:

\begin{cmds}
enact2fb.exe outputfile inputfile1 inputfile2 ...
The program fbcomb combines multiple feedback files produced by individual processors in
an MPI run of \NEMO into a single feedback file.
It is called in the following way:

\begin{cmds}
fbcomb.exe outputfile inputfile1 inputfile2 ...

The program fbmatchup will match observations from two feedback files.
It is called in the following way:

\begin{cmds}
fbmatchup.exe outputfile inputfile1 varname1 inputfile2 varname2 ...

The program fbprint will print the contents of a feedback file or files to standard output.
Selected information can be output using optional arguments.
It is called in the following way:

\begin{cmds}
fbprint.exe [options] inputfile

  -B            select observations based on QC flags
  -u            unsorted
  -s ID         select station ID
  -t TYPE       select observation type
  -v NUM1-NUM2  select variable range to print by number (default all)
  -a NUM1-NUM2  select additional variable range to print by number (default all)
  -e NUM1-NUM2  select extra variable range to print by number (default all)
  -d            output date range

The program fbsel will select or subsample observations.
It is called in the following way:

\begin{cmds}
fbsel.exe

The program fbstat will output summary statistics in different global areas into a number of files.
It is called in the following way:

\begin{cmds}
fbstat.exe [-nmlev]

The program fbthin will thin the data to 1 degree resolution.
The code could easily be modified to thin to a different resolution.
It is called in the following way:

\begin{cmds}
fbthin.exe inputfile outputfile

The program sla2fb will convert an AVISO SLA format file to feedback format.
It is called in the following way:

\begin{cmds}
sla2fb.exe [-s type] outputfile inputfile1 inputfile2 ...

The program vel2fb will convert TAO/PIRATA/RAMA currents files to feedback format.
It is called in the following way:

\begin{cmds}
vel2fb.exe outputfile inputfile1 inputfile2 ...

An IDL program called dataplot is included which uses a graphical interface to
visualise observations and feedback files.
Note that a similar package (also called dataplot) has recently been developed in Python,
which does some of the same things that the IDL dataplot does.
Please contact the authors of this chapter if you are interested in this.

It is possible to zoom in, plot individual profiles and calculate some basic statistics.
To plot some data run IDL and then:

\footnotesize
\begin{minted}{idl}
IDL> dataplot, "filename"

for example multiple feedback files from different processors or from different days,
the easiest method is to use the spawn command to generate a list of files which
can then be passed to dataplot.

\footnotesize
\begin{minted}{idl}
IDL> spawn, 'ls profb*.nc', files

The plotting colour range can be changed by clicking on the colour bar.
The title of the plot gives some basic information about the date range and depth range shown,
the extreme values, and the mean and RMS values.
It is possible to zoom in using a drag-box.
You may also zoom in or out using the mouse wheel.

observation minus background value.
The next group of radio buttons selects the map projection.
This can either be a regular longitude-latitude grid, or a north or south polar stereographic projection.
The next group of radio buttons will plot bad observations, switch to salinity and
plot density for profile observations.
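The mean and RMS values quoted in the dataplot title (and in the fbstat summaries) are plain statistics of the observation-minus-model differences. The following sketch shows the computation; it is an illustration, not the actual IDL or \fortran code, and the function name and input layout (matched lists of observation values and model counterparts, such as the paired observation and model-equivalent variables of a feedback file) are assumptions for the example.

```python
import math

def obs_minus_model_stats(obs, model):
    """Mean and RMS of observation-minus-model differences.

    obs, model: matched sequences of observation values and model
    counterparts (illustrative; not part of the OBSTOOLS code).
    """
    diffs = [o - m for o, m in zip(obs, model)]
    n = len(diffs)
    mean = sum(diffs) / n
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    return mean, rms

# Made-up values for illustration:
mean, rms = obs_minus_model_stats([1.0, 2.0, 3.0], [0.5, 2.5, 3.0])
```

A near-zero mean with a non-zero RMS, as in this example, indicates compensating positive and negative departures rather than a systematic bias.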