[10419] | 1 | \documentclass[../main/NEMO_manual]{subfiles} |
---|
| 2 | |
---|
[6997] | 3 | \begin{document} |
---|
[707] | 4 | % ================================================================ |
---|
[10419] | 5 | % Chapter --- Miscellaneous Topics |
---|
[707] | 6 | % ================================================================ |
---|
[2282] | 7 | \chapter{Miscellaneous Topics} |
---|
[9407] | 8 | \label{chap:MISC} |
---|
[10419] | 9 | |
---|
[707] | 10 | \minitoc |
---|
| 11 | |
---|
[2282] | 12 | \newpage |
---|
| 13 | |
---|
[707] | 14 | % ================================================================ |
---|
| 15 | % Representation of Unresolved Straits |
---|
| 16 | % ================================================================ |
---|
[9393] | 17 | \section{Representation of unresolved straits} |
---|
[9407] | 18 | \label{sec:MISC_strait} |
---|
[707] | 19 | |
---|
[10368] | 20 | In climate modeling, it often occurs that a crucial connection between water masses is broken because |
---|
| 21 | the grid mesh is too coarse to resolve narrow straits. |
---|
| 22 | For example, coarse grid spacing typically closes off the Mediterranean from the Atlantic at |
---|
| 23 | the Strait of Gibraltar. |
---|
| 24 | In this case, it is important for climate models to include the effects of salty water entering the Atlantic from |
---|
| 25 | the Mediterranean. |
---|
| 26 | Likewise, it is important for the Mediterranean to replenish its supply of water from the Atlantic to |
---|
| 27 | balance the net evaporation occurring over the Mediterranean region. |
---|
| 28 | This problem occurs even in eddy permitting simulations. |
---|
| 29 | For example, in ORCA 1/4\deg several straits of the Indonesian archipelago (Ombai, Lombok...) |
---|
| 30 | are much narrower than even a single ocean grid-point. |
---|
[707] | 31 | |
---|
[10368] | 32 | We describe briefly here the three methods that can be used in \NEMO to handle such improperly resolved straits. |
---|
| 33 | The first two consist of opening the strait by hand while ensuring that the mass exchanges through |
---|
| 34 | the strait are not too large by either artificially reducing the surface of the strait grid-cells or, |
---|
| 35 | locally increasing the lateral friction. |
---|
| 36 | In the third one, the strait is closed but exchanges of mass, heat and salt across the land are allowed. |
---|
| 37 | Note that such modifications are so specific to a given configuration that no attempt has been made to |
---|
| 38 | set them in a generic way. |
---|
| 39 | However, examples of how they can be set up are given in the ORCA 2\deg and 0.5\deg configurations. |
---|
| 40 | For example, for details of implementation in ORCA2, search: \texttt{IF( cp\_cfg == "orca" .AND. jp\_cfg == 2 )} |
---|
[707] | 41 | |
---|
| 42 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 43 | % Hand made geometry changes |
---|
| 44 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 45 | \subsection{Hand made geometry changes} |
---|
[9407] | 46 | \label{subsec:MISC_strait_hand} |
---|
[707] | 47 | |
---|
[10368] | 48 | $\bullet$ reduced scale factor in the cross-strait direction to a value in better agreement with |
---|
| 49 | the true mean width of the strait (\autoref{fig:MISC_strait_hand}). |
---|
[2282] | 50 | This technique is sometimes called ``partially open face'' or ``partially closed cells''. |
---|
[10368] | 51 | The key point here is to reduce only the faces of the $T$-cell |
---|
| 52 | (\textit{i.e.} change the value of the horizontal scale factors at $u$- or $v$-points) but not the volume of the $T$-cell. |
---|
| 53 | Indeed, reducing the volume of the strait $T$-cell can easily produce a numerical instability at |
---|
| 54 | that grid point that would require a reduction of the model time step. |
---|
| 55 | The changes associated with strait management are done in \mdl{domhgr}, |
---|
[2282] | 56 | just after the definition or reading of the horizontal scale factors. |
---|
[707] | 57 | |
---|
[10368] | 58 | $\bullet$ increase of the viscous boundary layer thickness by local increase of the fmask value at the coast |
---|
| 59 | (\autoref{fig:MISC_strait_hand}). |
---|
| 60 | This is done in \mdl{dommsk} together with the setting of the coastal value of fmask (see \autoref{sec:LBC_coast}). |
---|
[994] | 61 | |
---|
[707] | 62 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[10368] | 63 | \begin{figure}[!tbp] |
---|
| 64 | \begin{center} |
---|
| 65 | \includegraphics[width=0.80\textwidth]{Fig_Gibraltar} |
---|
| 66 | \includegraphics[width=0.80\textwidth]{Fig_Gibraltar2} |
---|
[10419] | 67 | \caption{ |
---|
| 68 | \protect\label{fig:MISC_strait_hand} |
---|
[10368] | 69 | Example of the Gibraltar strait defined in a $1^{\circ} \times 1^{\circ}$ mesh. |
---|
| 70 | \textit{Top}: using partially open cells. |
---|
| 71 | The meridional scale factor at $v$-point is reduced on both sides of the strait to account for |
---|
| 72 | the real width of the strait (about 20 km). |
---|
| 73 | Note that the scale factors of the strait $T$-point remain unchanged. |
---|
| 74 | \textit{Bottom}: using viscous boundary layers. |
---|
| 75 | The four fmask parameters along the strait coastlines are set to a value larger than 4, |
---|
| 76 | \textit{i.e.} the ``strong'' no-slip case (see \autoref{fig:LBC_shlat}), creating a large viscous boundary layer that |
---|
[10419] | 77 | allows a reduced transport through the strait. |
---|
| 78 | } |
---|
[10368] | 79 | \end{center} |
---|
| 80 | \end{figure} |
---|
[707] | 81 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 82 | |
---|
| 83 | |
---|
| 84 | % ================================================================ |
---|
| 85 | % Closed seas |
---|
| 86 | % ================================================================ |
---|
[9363] | 87 | \section{Closed seas (\protect\mdl{closea})} |
---|
[9407] | 88 | \label{sec:MISC_closea} |
---|
[707] | 89 | |
---|
[2282] | 90 | \colorbox{yellow}{Add here a short description of the way closed seas are managed} |
---|
[707] | 91 | |
---|
[2282] | 92 | |
---|
[707] | 93 | % ================================================================ |
---|
[9019] | 94 | % Sub-Domain Functionality |
---|
[707] | 95 | % ================================================================ |
---|
[9393] | 96 | \section{Sub-domain functionality} |
---|
[9407] | 97 | \label{sec:MISC_zoom} |
---|
[707] | 98 | |
---|
[9393] | 99 | \subsection{Simple subsetting of input files via NetCDF attributes} |
---|
[5118] | 100 | |
---|
[10368] | 101 | The extended grids for use with the under-shelf ice cavities will result in redundant rows around Antarctica if |
---|
| 102 | the ice cavities are not active. |
---|
| 103 | A simple mechanism for subsetting input files associated with the extended domains has been implemented to |
---|
| 104 | avoid the need to maintain different sets of input fields for use with or without active ice cavities. |
---|
| 105 | The existing 'zoom' options are overly complex for this task and marked for deletion anyway. |
---|
| 106 | This alternative subsetting operates for the j-direction only and works by optionally looking for and |
---|
| 107 | using a global file attribute (named: \np{open\_ocean\_jstart}) to determine the starting j-row for input. |
---|
| 108 | The use of this option is best explained with an example: |
---|
| 109 | consider an ORCA1 configuration using the extended grid bathymetry and coordinate files: |
---|
[5118] | 110 | \vspace{-10pt} |
---|
[9392] | 111 | \ifile{eORCA1\_bathymetry\_v2} |
---|
| 112 | \ifile{eORCA1\_coordinates} |
---|
[10368] | 113 | \noindent These files define a horizontal domain of 362x332. |
---|
| 114 | Assuming the first row with open ocean wet points in the non-isf bathymetry for this set is row 42 |
---|
| 115 | (Fortran indexing) then the formally correct setting for \np{open\_ocean\_jstart} is 41. |
---|
| 116 | Using this value as the first row to be read will result in a 362x292 domain which is the same size as |
---|
| 117 | the original ORCA1 domain. |
---|
| 118 | Thus the extended coordinates and bathymetry files can be used with all the original input files for ORCA1 if |
---|
| 119 | the ice cavities are not active (\np{ln\_isfcav = .false.}). |
---|
| 120 | Full instructions for achieving this are: |
---|
[5118] | 121 | |
---|
| 122 | \noindent Add the new attribute to any input files requiring a j-row offset, i.e.: |
---|
| 123 | \vspace{-10pt} |
---|
[9388] | 124 | \begin{cmds} |
---|
[5118] | 125 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_coordinates.nc |
---|
| 126 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_bathymetry_v2.nc |
---|
[9388] | 127 | \end{cmds} |
---|
[5118] | 128 | |
---|
| 129 | \noindent Add the logical switch to \ngn{namcfg} in the configuration namelist and set true: |
---|
| 130 | %--------------------------------------------namcfg-------------------------------------------------------- |
---|
[10146] | 131 | |
---|
| 132 | \nlst{namcfg} |
---|
[5118] | 133 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 134 | |
---|
[10368] | 135 | \noindent Note that the j-size of the global domain is (extended j-size $-$ \np{open\_ocean\_jstart} $+$ 1), and |
---|
| 136 | this must currently match the size of all datasets other than bathymetry and coordinates. |
---|
| 137 | However, the option can be extended to any global 2D or 3D NetCDF input field by adding the |
---|
[5118] | 138 | \vspace{-10pt} |
---|
[9388] | 139 | \begin{forlines} |
---|
[5118] | 140 | lrowattr=ln_use_jattr |
---|
[9388] | 141 | \end{forlines} |
---|
[10368] | 142 | optional argument to the appropriate \np{iom\_get} call and the \np{open\_ocean\_jstart} attribute to |
---|
| 143 | the corresponding input files. |
---|
| 144 | It remains the user's responsibility to set the \np{jpjdta} and \np{jpjglo} values in |
---|
| 145 | the \np{namelist\_cfg} file according to their needs. |
---|
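
As a quick check of the size rule stated above, the eORCA1 numbers quoted in the text can be substituted directly.
The symbols $N_j^{\mathrm{glo}}$ and $N_j^{\mathrm{ext}}$ are illustrative shorthand introduced here for the global and extended j-sizes; they are not \NEMO variable names.
\[
  N_j^{\mathrm{glo}} \;=\; N_j^{\mathrm{ext}} - \mathtt{open\_ocean\_jstart} + 1 \;=\; 332 - 41 + 1 \;=\; 292 ,
\]
which agrees with the 362x292 size of the original ORCA1 domain given above.
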
[5118] | 146 | |
---|
[707] | 147 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[10368] | 148 | \begin{figure}[!ht] |
---|
| 149 | \begin{center} |
---|
| 150 | \includegraphics[width=0.90\textwidth]{Fig_LBC_zoom} |
---|
[10419] | 151 | \caption{ |
---|
| 152 | \protect\label{fig:LBC_zoom} |
---|
[10368] | 153 | Position of a model domain compared to the data input domain when the zoom functionality is used. |
---|
| 154 | } |
---|
| 155 | \end{center} |
---|
| 156 | \end{figure} |
---|
[707] | 157 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 158 | |
---|
| 159 | |
---|
| 160 | % ================================================================ |
---|
[2541] | 161 | % Accuracy and Reproducibility |
---|
| 162 | % ================================================================ |
---|
[9393] | 163 | \section{Accuracy and reproducibility (\protect\mdl{lib\_fortran})} |
---|
[9407] | 164 | \label{sec:MISC_fortran} |
---|
[2541] | 165 | |
---|
[9363] | 166 | \subsection{Issues with intrinsic SIGN function (\protect\key{nosignedzero})} |
---|
[9407] | 167 | \label{subsec:MISC_sign} |
---|
[2541] | 168 | |
---|
[10368] | 169 | SIGN(A, B) is the \textsc{Fortran} intrinsic function that delivers the magnitude of A with the sign of B. |
---|
| 170 | For example, SIGN(-3.0,2.0) has the value 3.0. |
---|
| 171 | The problematic case is when the second argument is zero, because, on platforms that support IEEE arithmetic, |
---|
| 172 | zero is actually a signed number. |
---|
[2541] | 173 | There is a positive zero and a negative zero. |
---|
| 174 | |
---|
[10368] | 175 | In \textsc{Fortran}~90, the processor was required always to deliver a positive result for SIGN(A, B) if B was zero. |
---|
| 176 | Nevertheless, in \textsc{Fortran}~95, the processor is allowed to do the correct thing and deliver ABS(A) when |
---|
| 177 | B is a positive zero and -ABS(A) when B is a negative zero. |
---|
| 178 | This change in the specification becomes apparent only when B is of type real, and is zero, |
---|
| 179 | and the processor is capable of distinguishing between positive and negative zero, |
---|
| 180 | and B is negative real zero. |
---|
| 181 | Then SIGN delivers a negative result where, under \textsc{Fortran}~90 rules, it used to return a positive result. |
---|
| 182 | This change may be especially sensitive for the ice model, |
---|
| 183 | so we override the intrinsic function with our own function, which simply performs: \\ |
---|
[2541] | 184 | \verb? IF( B >= 0.e0 ) THEN ; SIGN(A,B) = ABS(A) ? \\ |
---|
| 185 | \verb? ELSE ; SIGN(A,B) =-ABS(A) ? \\ |
---|
| 186 | \verb? ENDIF ? \\ |
---|
[10368] | 187 | This feature can be found in the \mdl{lib\_fortran} module and is effective when \key{nosignedzero} is defined. |
---|
| 188 | We use a CPP key because overriding an intrinsic function can cause performance issues with |
---|
| 189 | some computers/compilers. |
---|
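
The difference between the two behaviours is easiest to see at negative zero.
The following is a minimal sketch in Python rather than \textsc{Fortran}, chosen only so that it can be run as is; it is not the \mdl{lib\_fortran} code.
The names \texttt{sign\_f95} and \texttt{sign\_nosignedzero} are hypothetical, and \texttt{sign\_f95} assumes a processor that does distinguish negative zero.

```python
import math


def sign_f95(a, b):
    """SIGN as permitted by Fortran 95 on an IEEE platform (assumed here):
    the sign bit of b is honoured, so b = -0.0 yields a negative result."""
    return math.copysign(abs(a), b)


def sign_nosignedzero(a, b):
    """Replacement described in the text: the test b >= 0 is true for both
    +0.0 and -0.0, so any zero is treated as positive."""
    return abs(a) if b >= 0.0 else -abs(a)


# Compare the two variants for positive, zero, negative-zero and negative b.
for b in (2.0, 0.0, -0.0, -2.0):
    print(b, sign_f95(-3.0, b), sign_nosignedzero(-3.0, b))
```

The two functions agree for every value of B except negative zero, where the first returns $-3.0$ and the second returns $3.0$, \textit{i.e.} the \textsc{Fortran}~90 result.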
[2541] | 190 | |
---|
| 191 | |
---|
| 192 | \subsection{MPP reproducibility} |
---|
[9407] | 193 | \label{subsec:MISC_glosum} |
---|
[2541] | 194 | |
---|
[10368] | 195 | The numerical reproducibility of simulations on distributed memory parallel computers is a critical issue. |
---|
| 196 | In particular, within NEMO global summation of distributed arrays is most susceptible to rounding errors, |
---|
| 197 | and their propagation and accumulation cause uncertainty in final simulation reproducibility on |
---|
| 198 | different numbers of processors. |
---|
| 199 | To avoid this, following the review of different techniques by \citet{He_Ding_JSC01}, |
---|
| 200 | we use a so-called self-compensated summation method. |
---|
| 201 | The idea is to estimate the roundoff error, store it in a buffer, and then add it back in the next addition. |
---|
[2541] | 202 | |
---|
[10368] | 203 | Suppose we need to calculate $b = a_1 + a_2 + a_3$. |
---|
| 204 | The following algorithm allows the sum to be split in two |
---|
| 205 | ($sum_1 = a_{1} + a_{2}$ and $b = sum_2 = sum_1 + a_3$) with exactly the same rounding errors as |
---|
| 206 | the sum performed all at once. |
---|
[2541] | 207 | \begin{align*} |
---|
| 208 | sum_1 \ \ &= a_1 + a_2 \\ |
---|
| 209 | error_1 &= a_2 + ( a_1 - sum_1 ) \\ |
---|
| 210 | sum_2 \ \ &= sum_1 + a_3 + error_1 \\ |
---|
| 211 | error_2 &= a_3 + error_1 + ( sum_1 - sum_2 ) \\ |
---|
| 212 | b \qquad \ &= sum_2 \\ |
---|
| 213 | \end{align*} |
---|
[7646] | 214 | An example of this feature can be found in the \mdl{lib\_fortran} module. |
---|
[10368] | 215 | It is systematically used in the glob\_sum function (summation over the entire basin excluding duplicated rows and |
---|
| 216 | columns due to cyclic or north fold boundary conditions as well as overlapping MPP areas). |
---|
| 217 | The self-compensated summation method should be used in all summations in the i- and/or j-direction. |
---|
| 218 | See the \mdl{closea} module for an example. |
---|
[7646] | 219 | Note also that this implementation may be sensitive to the optimization level. |
---|
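
The recurrence above generalises to any number of terms.
The following is a minimal sketch in Python, not the \mdl{lib\_fortran} implementation: the function names are hypothetical, and grouping $a_k + error_{k-1}$ before adding it to the running sum is an assumption about how the formulas above are evaluated.

```python
def naive_sum(values):
    """Plain left-to-right summation, shown for comparison."""
    total = 0.0
    for a in values:
        total = total + a
    return total


def self_compensated_sum(values):
    """Running sum that carries the estimated roundoff of each addition
    into the next one, following the sum_k / error_k recurrence."""
    total = 0.0   # sum_{k-1}
    error = 0.0   # error_{k-1}
    for a in values:
        corrected = a + error                    # a_k + error_{k-1}
        new_total = total + corrected            # sum_k
        error = corrected + (total - new_total)  # error_k
        total = new_total
    return total, error


# One large term followed by ten terms too small to change it one at a time.
values = [1.0] + [1e-16] * 10
print(naive_sum(values))
print(self_compensated_sum(values)[0])
```

In this example the plain loop returns exactly 1.0, because each increment of $10^{-16}$ is smaller than half the spacing between 1.0 and the next representable double, whereas the compensated sum accumulates the small terms and returns a value above 1.0.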
[2541] | 220 | |
---|
[3294] | 221 | \subsection{MPP scalability} |
---|
[9407] | 222 | \label{subsec:MISC_mppsca} |
---|
[2541] | 223 | |
---|
[10368] | 224 | The default method of communicating values across the north-fold in distributed memory applications (\key{mpp\_mpi}) |
---|
| 225 | uses an \textsc{MPI\_ALLGATHER} function to exchange values from each processing region in |
---|
| 226 | the northern row with every other processing region in the northern row. |
---|
| 227 | This enables a global width array containing the top 4 rows to be collated on every northern row processor and then |
---|
| 228 | folded with a simple algorithm. |
---|
| 229 | Although conceptually simple, this "All to All" communication will hamper performance scalability for |
---|
| 230 | large numbers of northern row processors. |
---|
| 231 | From version 3.4 onwards an alternative method is available which only performs direct "Peer to Peer" communications |
---|
| 232 | between each processor and its immediate "neighbours" across the fold line. |
---|
| 233 | This is achieved by using the default \textsc{MPI\_ALLGATHER} method during initialisation to |
---|
| 234 | help identify the "active" neighbours. |
---|
| 235 | Stored lists of these neighbours are then used in all subsequent north-fold exchanges to |
---|
| 236 | restrict exchanges to those between associated regions. |
---|
| 237 | The collated global width array for each region is thus only partially filled but is guaranteed to |
---|
| 238 | be set at all the locations actually required by each individual region for the fold operation. |
---|
| 239 | This alternative method should give identical results to the default \textsc{ALLGATHER} method and |
---|
| 240 | is recommended for large values of \np{jpni}. |
---|
| 241 | The new method is activated by setting \np{ln\_nnogather} to true (\ngn{nammpp}). |
---|
| 242 | The reproducibility of results using the two methods should be confirmed for each new, |
---|
| 243 | non-reference configuration. |
---|
[3294] | 244 | |
---|
[2541] | 245 | % ================================================================ |
---|
[707] | 246 | % Model optimisation, Control Print and Benchmark |
---|
| 247 | % ================================================================ |
---|
[9393] | 248 | \section{Model optimisation, control print and benchmark} |
---|
[9407] | 249 | \label{sec:MISC_opt} |
---|
[1225] | 250 | %--------------------------------------------namctl------------------------------------------------------- |
---|
[10146] | 251 | |
---|
| 252 | \nlst{namctl} |
---|
[707] | 253 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 254 | |
---|
[2541] | 255 | \gmcomment{why not make these bullets into subsections?} |
---|
[4147] | 256 | Options are defined through the \ngn{namctl} namelist variables. |
---|
[707] | 257 | |
---|
[2349] | 258 | $\bullet$ Vector optimisation: |
---|
[707] | 259 | |
---|
[10368] | 260 | \key{vectopt\_loop} enables the internal loops to collapse. |
---|
| 261 | This is a very efficient way to increase the length of vector calculations and thus |
---|
[2282] | 262 | to speed up the model on vector computers. |
---|
[707] | 263 | |
---|
[994] | 264 | % Add here also one word on NPROMA technique that has been found useless, since compiler have made significant progress during the last decade. |
---|
[707] | 265 | |
---|
[994] | 266 | % Add also one word on NEC specific optimisation (Novercheck option for example) |
---|
[707] | 267 | |
---|
[994] | 268 | $\bullet$ Control print %: describe here 4 things: |
---|
[707] | 269 | |
---|
[10368] | 270 | 1- \np{ln\_ctl}: compute and print the trends averaged over the interior domain in all TRA, DYN, LDF and |
---|
| 271 | ZDF modules. |
---|
| 272 | This option is very helpful when diagnosing the origin of an undesired change in model results. |
---|
[707] | 273 | |
---|
[10368] | 274 | 2- also \np{ln\_ctl} but using the nictl and njctl namelist parameters to check the source of differences between |
---|
| 275 | mono- and multi-processor runs. |
---|
[707] | 276 | |
---|
[6289] | 277 | %%gm to be removed both here and in the code |
---|
[10368] | 278 | 3- last digit comparison (\np{nn\_bit\_cmp}). |
---|
| 279 | In an MPP simulation, the computation of a sum over the whole domain is performed as the summation over |
---|
| 280 | all processors of each of their sums over their interior domains. |
---|
| 281 | This double sum never gives exactly the same result as a single sum over the whole domain, |
---|
| 282 | due to truncation differences. |
---|
| 283 | The "bit comparison" option has been introduced in order to be able to check that |
---|
| 284 | mono-processor and multi-processor runs give exactly the same results. |
---|
| 285 | % THIS is to be updated with the mpp_sum_glo introduced in v3.3 |
---|
[2376] | 286 | % nn_bit_cmp today only check that the nn_cla = 0 (no cross land advection) |
---|
[6289] | 287 | %%gm end |
---|
[707] | 288 | |
---|
[10368] | 289 | $\bullet$ Benchmark (\np{nn\_bench}). |
---|
| 290 | This option defines a benchmark run based on a GYRE configuration (see \autoref{sec:CFG_gyre}) in which |
---|
| 291 | the resolution remains the same whatever the domain size. |
---|
| 292 | This allows a very large model domain to be used, just by changing the domain size (\jp{jpiglo}, \jp{jpjglo}) and |
---|
| 293 | without adjusting either the time-step or the physical parameterisations. |
---|
[707] | 294 | |
---|
| 295 | % ================================================================ |
---|
[10419] | 296 | \biblio |
---|
| 297 | |
---|
[6997] | 298 | \end{document} |
---|