[707] | 1 | % ================================================================ |
---|
[6289] | 2 | % Chapter ——— Miscellaneous Topics |
---|
[707] | 3 | % ================================================================ |
---|
[2282] | 4 | \chapter{Miscellaneous Topics} |
---|
[707] | 5 | \label{MISC} |
---|
| 6 | \minitoc |
---|
| 7 | |
---|
[2282] | 8 | \newpage |
---|
| 9 | $\ $\newline % force a new ligne |
---|
| 10 | |
---|
[707] | 11 | % ================================================================ |
---|
| 12 | % Representation of Unresolved Straits |
---|
| 13 | % ================================================================ |
---|
| 14 | \section{Representation of Unresolved Straits} |
---|
| 15 | \label{MISC_strait} |
---|
| 16 | |
---|
[2282] | 17 | In climate modeling, it often occurs that a crucial connections between water masses |
---|
| 18 | is broken as the grid mesh is too coarse to resolve narrow straits. For example, coarse |
---|
| 19 | grid spacing typically closes off the Mediterranean from the Atlantic at the Strait of |
---|
| 20 | Gibraltar. In this case, it is important for climate models to include the effects of salty |
---|
| 21 | water entering the Atlantic from the Mediterranean. Likewise, it is important for the |
---|
| 22 | Mediterranean to replenish its supply of water from the Atlantic to balance the net |
---|
| 23 | evaporation occurring over the Mediterranean region. This problem occurs even in |
---|
[2349] | 24 | eddy permitting simulations. For example, in ORCA 1/4\deg several straits of the Indonesian |
---|
[2282] | 25 | archipelago (Ombai, Lombok...) are much narrow than even a single ocean grid-point. |
---|
[707] | 26 | |
---|
[2282] | 27 | We describe briefly here the three methods that can be used in \NEMO to handle |
---|
| 28 | such improperly resolved straits. The first two consist of opening the strait by hand |
---|
| 29 | while ensuring that the mass exchanges through the strait are not too large by |
---|
| 30 | either artificially reducing the surface of the strait grid-cells or, locally increasing |
---|
| 31 | the lateral friction. In the third one, the strait is closed but exchanges of mass, |
---|
| 32 | heat and salt across the land are allowed. |
---|
| 33 | Note that such modifications are so specific to a given configuration that no attempt |
---|
| 34 | has been made to set them in a generic way. However, examples of how |
---|
[4147] | 35 | they can be set up is given in the ORCA 2\deg and 0.5\deg configurations. For example, |
---|
[6289] | 36 | for details of implementation in ORCA2, search: |
---|
| 37 | \texttt{ IF( cp\_cfg == "orca" .AND. jp\_cfg == 2 ) } |
---|
[707] | 38 | |
---|
| 39 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 40 | % Hand made geometry changes |
---|
| 41 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 42 | \subsection{Hand made geometry changes} |
---|
| 43 | \label{MISC_strait_hand} |
---|
| 44 | |
---|
[2282] | 45 | $\bullet$ reduced scale factor in the cross-strait direction to a value in better agreement |
---|
| 46 | with the true mean width of the strait. (Fig.~\ref{Fig_MISC_strait_hand}). |
---|
| 47 | This technique is sometime called "partially open face" or "partially closed cells". |
---|
| 48 | The key issue here is only to reduce the faces of $T$-cell ($i.e.$ change the value |
---|
| 49 | of the horizontal scale factors at $u$- or $v$-point) but not the volume of the $T$-cell. |
---|
| 50 | Indeed, reducing the volume of strait $T$-cell can easily produce a numerical |
---|
| 51 | instability at that grid point that would require a reduction of the model time step. |
---|
| 52 | The changes associated with strait management are done in \mdl{domhgr}, |
---|
| 53 | just after the definition or reading of the horizontal scale factors. |
---|
[707] | 54 | |
---|
[2282] | 55 | $\bullet$ increase of the viscous boundary layer thickness by local increase of the |
---|
| 56 | fmask value at the coast (Fig.~\ref{Fig_MISC_strait_hand}). This is done in |
---|
| 57 | \mdl{dommsk} together with the setting of the coastal value of fmask |
---|
| 58 | (see Section \ref{LBC_coast}) |
---|
[994] | 59 | |
---|
[707] | 60 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[2376] | 61 | \begin{figure}[!tbp] \begin{center} |
---|
[998] | 62 | \includegraphics[width=0.80\textwidth]{./TexFiles/Figures/Fig_Gibraltar.pdf} |
---|
| 63 | \includegraphics[width=0.80\textwidth]{./TexFiles/Figures/Fig_Gibraltar2.pdf} |
---|
[2376] | 64 | \caption{ \label{Fig_MISC_strait_hand} |
---|
| 65 | Example of the Gibraltar strait defined in a $1\deg \times 1\deg$ mesh. |
---|
[994] | 66 | \textit{Top}: using partially open cells. The meridional scale factor at $v$-point |
---|
| 67 | is reduced on both sides of the strait to account for the real width of the strait |
---|
[1224] | 68 | (about 20 km). Note that the scale factors of the strait $T$-point remains unchanged. |
---|
| 69 | \textit{Bottom}: using viscous boundary layers. The four fmask parameters |
---|
[994] | 70 | along the strait coastlines are set to a value larger than 4, $i.e.$ "strong" no-slip |
---|
| 71 | case (see Fig.\ref{Fig_LBC_shlat}) creating a large viscous boundary layer |
---|
| 72 | that allows a reduced transport through the strait.} |
---|
[707] | 73 | \end{center} \end{figure} |
---|
| 74 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 75 | |
---|
| 76 | |
---|
| 77 | % ================================================================ |
---|
| 78 | % Closed seas |
---|
| 79 | % ================================================================ |
---|
[2282] | 80 | \section{Closed seas (\mdl{closea})} |
---|
[707] | 81 | \label{MISC_closea} |
---|
| 82 | |
---|
[2282] | 83 | \colorbox{yellow}{Add here a short description of the way closed seas are managed} |
---|
[707] | 84 | |
---|
[2282] | 85 | |
---|
[707] | 86 | % ================================================================ |
---|
| 87 | % Sub-Domain Functionality (\textit{nizoom, njzoom}, namelist parameters) |
---|
| 88 | % ================================================================ |
---|
[4147] | 89 | \section{Sub-Domain Functionality (\np{jpizoom}, \np{jpjzoom})} |
---|
[707] | 90 | \label{MISC_zoom} |
---|
| 91 | |
---|
[994] | 92 | The sub-domain functionality, also improperly called the zoom option |
---|
| 93 | (improperly because it is not associated with a change in model resolution) |
---|
| 94 | is a quite simple function that allows a simulation over a sub-domain of an |
---|
| 95 | already defined configuration ($i.e.$ without defining a new mesh, initial |
---|
| 96 | state and forcings). This option can be useful for testing the user settings |
---|
| 97 | of surface boundary conditions, or the initial ocean state of a huge ocean |
---|
| 98 | model configuration while having a small computer memory requirement. |
---|
| 99 | It can also be used to easily test specific physics in a sub-domain (for example, |
---|
[2282] | 100 | see \citep{Madec_al_JPO96} for a test of the coupling used in the global ocean |
---|
[994] | 101 | version of OPA between sea-ice and ocean model over the Arctic or Antarctic |
---|
| 102 | ocean, using a sub-domain). In the standard model, this option does not |
---|
| 103 | include any specific treatment for the ocean boundaries of the sub-domain: |
---|
| 104 | they are considered as artificial vertical walls. Nevertheless, it is quite easy |
---|
| 105 | to add a restoring term toward a climatology in the vicinity of such boundaries |
---|
| 106 | (see \S\ref{TRA_dmp}). |
---|
[707] | 107 | |
---|
| 108 | In order to easily define a sub-domain over which the computation can be |
---|
| 109 | performed, the dimension of all input arrays (ocean mesh, bathymetry, |
---|
[4147] | 110 | forcing, initial state, ...) are defined as \np{jpidta}, \np{jpjdta} and \np{jpkdta} |
---|
| 111 | ( in \ngn{namcfg} namelist), while the computational domain is defined through |
---|
| 112 | \np{jpiglo}, \np{jpjglo} and \jp{jpk} (\ngn{namcfg} namelist). When running the |
---|
| 113 | model over the whole domain, the user sets \np{jpiglo}=\np{jpidta} \np{jpjglo}=\np{jpjdta} |
---|
[994] | 114 | and \jp{jpk}=\jp{jpkdta}. When running the model over a sub-domain, the user |
---|
[4147] | 115 | has to provide the size of the sub-domain, (\np{jpiglo}, \np{jpjglo}, \np{jpkglo}), |
---|
| 116 | and the indices of the south western corner as \np{jpizoom} and \np{jpjzoom} in |
---|
| 117 | the \ngn{namcfg} namelist (Fig.~\ref{Fig_LBC_zoom}). |
---|
[707] | 118 | |
---|
[994] | 119 | Note that a third set of dimensions exist, \jp{jpi}, \jp{jpj} and \jp{jpk} which is |
---|
[4147] | 120 | actually used to perform the computation. It is set by default to \jp{jpi}=\np{jpjglo} |
---|
| 121 | and \jp{jpj}=\np{jpjglo}, except for massively parallel computing where the |
---|
[994] | 122 | computational domain is laid out on local processor memories following a 2D |
---|
| 123 | horizontal splitting. % (see {\S}IV.2-c) ref to the section to be updated |
---|
[707] | 124 | |
---|
[5118] | 125 | \subsection{Simple subsetting of input files via netCDF attributes} |
---|
| 126 | |
---|
| 127 | The extended grids for use with the under-shelf ice cavities will result in redundant rows |
---|
| 128 | around Antarctica if the ice cavities are not active. A simple mechanism for subsetting |
---|
| 129 | input files associated with the extended domains has been implemented to avoid the need to |
---|
| 130 | maintain different sets of input fields for use with or without active ice cavities. The |
---|
| 131 | existing 'zoom' options are overly complex for this task and marked for deletion anyway. |
---|
| 132 | This alternative subsetting operates for the j-direction only and works by optionally |
---|
| 133 | looking for and using a global file attribute (named: \np{open\_ocean\_jstart}) to |
---|
| 134 | determine the starting j-row for input. The use of this option is best explained with an |
---|
| 135 | example: Consider an ORCA1 configuration using the extended grid bathymetry and coordinate |
---|
| 136 | files: |
---|
| 137 | \vspace{-10pt} |
---|
| 138 | \begin{alltt} |
---|
| 139 | \tiny |
---|
| 140 | \begin{verbatim} |
---|
| 141 | eORCA1_bathymetry_v2.nc |
---|
| 142 | eORCA1_coordinates.nc |
---|
| 143 | \end{verbatim} |
---|
| 144 | \end{alltt} |
---|
| 145 | \noindent These files define a horizontal domain of 362x332. Assuming the first row with |
---|
| 146 | open ocean wet points in the non-isf bathymetry for this set is row 42 (Fortran indexing) |
---|
| 147 | then the formally correct setting for \np{open\_ocean\_jstart} is 41. Using this value as the |
---|
| 148 | first row to be read will result in a 362x292 domain which is the same size as the original |
---|
| 149 | ORCA1 domain. Thus the extended coordinates and bathymetry files can be used with all the |
---|
| 150 | original input files for ORCA1 if the ice cavities are not active (\np{ln\_isfcav = |
---|
| 151 | .false.}). Full instructions for achieving this are: |
---|
| 152 | |
---|
| 153 | \noindent Add the new attribute to any input files requiring a j-row offset, i.e: |
---|
| 154 | \vspace{-10pt} |
---|
| 155 | \begin{alltt} |
---|
| 156 | \tiny |
---|
| 157 | \begin{verbatim} |
---|
| 158 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_coordinates.nc |
---|
| 159 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_bathymetry_v2.nc |
---|
| 160 | \end{verbatim} |
---|
| 161 | \end{alltt} |
---|
| 162 | |
---|
| 163 | \noindent Add the logical switch to \ngn{namcfg} in the configuration namelist and set true: |
---|
| 164 | %--------------------------------------------namcfg-------------------------------------------------------- |
---|
| 165 | \namdisplay{namcfg_orca1} |
---|
| 166 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 167 | |
---|
| 168 | \noindent Note the j-size of the global domain is the (extended j-size minus |
---|
| 169 | \np{open\_ocean\_jstart} + 1 ) and this must match the size of all datasets other than |
---|
| 170 | bathymetry and coordinates currently. However the option can be extended to any global, 2D |
---|
| 171 | and 3D, netcdf, input field by adding the: |
---|
| 172 | \vspace{-10pt} |
---|
| 173 | \begin{alltt} |
---|
| 174 | \tiny |
---|
| 175 | \begin{verbatim} |
---|
| 176 | lrowattr=ln_use_jattr |
---|
| 177 | \end{verbatim} |
---|
| 178 | \end{alltt} |
---|
| 179 | optional argument to the appropriate \np{iom\_get} call and the \np{open\_ocean\_jstart} attribute to the corresponding input files. It remains the users responsibility to set \np{jpjdta} and \np{jpjglo} values in the \np{namelist\_cfg} file according to their needs. |
---|
| 180 | |
---|
[707] | 181 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[2376] | 182 | \begin{figure}[!ht] \begin{center} |
---|
[998] | 183 | \includegraphics[width=0.90\textwidth]{./TexFiles/Figures/Fig_LBC_zoom.pdf} |
---|
[2376] | 184 | \caption{ \label{Fig_LBC_zoom} |
---|
| 185 | Position of a model domain compared to the data input domain when the zoom functionality is used.} |
---|
[707] | 186 | \end{center} \end{figure} |
---|
| 187 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 188 | |
---|
| 189 | |
---|
| 190 | % ================================================================ |
---|
[2541] | 191 | % Accuracy and Reproducibility |
---|
| 192 | % ================================================================ |
---|
| 193 | \section{Accuracy and Reproducibility (\mdl{lib\_fortran})} |
---|
| 194 | \label{MISC_fortran} |
---|
| 195 | |
---|
| 196 | \subsection{Issues with intrinsinc SIGN function (\key{nosignedzero})} |
---|
| 197 | \label{MISC_sign} |
---|
| 198 | |
---|
| 199 | The SIGN(A, B) is the \textsc {Fortran} intrinsic function delivers the magnitude |
---|
| 200 | of A with the sign of B. For example, SIGN(-3.0,2.0) has the value 3.0. |
---|
| 201 | The problematic case is when the second argument is zero, because, on platforms |
---|
| 202 | that support IEEE arithmetic, zero is actually a signed number. |
---|
| 203 | There is a positive zero and a negative zero. |
---|
| 204 | |
---|
| 205 | In \textsc{Fortran}~90, the processor was required always to deliver a positive result for SIGN(A, B) |
---|
| 206 | if B was zero. Nevertheless, in \textsc{Fortran}~95, the processor is allowed to do the correct thing |
---|
| 207 | and deliver ABS(A) when B is a positive zero and -ABS(A) when B is a negative zero. |
---|
| 208 | This change in the specification becomes apparent only when B is of type real, and is zero, |
---|
| 209 | and the processor is capable of distinguishing between positive and negative zero, |
---|
| 210 | and B is negative real zero. Then SIGN delivers a negative result where, under \textsc{Fortran}~90 |
---|
| 211 | rules, it used to return a positive result. |
---|
| 212 | This change may be especially sensitive for the ice model, so we overwrite the intrinsinc |
---|
| 213 | function with our own function simply performing : \\ |
---|
| 214 | \verb? IF( B >= 0.e0 ) THEN ; SIGN(A,B) = ABS(A) ? \\ |
---|
| 215 | \verb? ELSE ; SIGN(A,B) =-ABS(A) ? \\ |
---|
| 216 | \verb? ENDIF ? \\ |
---|
| 217 | This feature can be found in \mdl{lib\_fortran} module and is effective when \key{nosignedzero} |
---|
| 218 | is defined. We use a CPP key as the overwritting of a intrinsic function can present |
---|
| 219 | performance issues with some computers/compilers. |
---|
| 220 | |
---|
| 221 | |
---|
| 222 | \subsection{MPP reproducibility} |
---|
| 223 | \label{MISC_glosum} |
---|
| 224 | |
---|
| 225 | The numerical reproducibility of simulations on distributed memory parallel computers |
---|
| 226 | is a critical issue. In particular, within NEMO global summation of distributed arrays |
---|
| 227 | is most susceptible to rounding errors, and their propagation and accumulation cause |
---|
| 228 | uncertainty in final simulation reproducibility on different numbers of processors. |
---|
| 229 | To avoid so, based on \citet{He_Ding_JSC01} review of different technics, |
---|
| 230 | we use a so called self-compensated summation method. The idea is to estimate |
---|
| 231 | the roundoff error, store it in a buffer, and then add it back in the next addition. |
---|
| 232 | |
---|
| 233 | Suppose we need to calculate $b = a_1 + a_2 + a_3$. The following algorithm |
---|
| 234 | will allow to split the sum in two ($sum_1 = a_{1} + a_{2}$ and $b = sum_2 = sum_1 + a_3$) |
---|
| 235 | with exactly the same rounding errors as the sum performed all at once. |
---|
| 236 | \begin{align*} |
---|
| 237 | sum_1 \ \ &= a_1 + a_2 \\ |
---|
| 238 | error_1 &= a_2 + ( a_1 - sum_1 ) \\ |
---|
| 239 | sum_2 \ \ &= sum_1 + a_3 + error_1 \\ |
---|
| 240 | error_2 &= a_3 + error_1 + ( sum_1 - sum_2 ) \\ |
---|
| 241 | b \qquad \ &= sum_2 \\ |
---|
| 242 | \end{align*} |
---|
[6713] | 243 | An example of this feature can be found in \mdl{lib\_fortran} module. |
---|
| 244 | It is systematicallt used in glob\_sum function (summation over the entire basin excluding |
---|
[2541] | 245 | duplicated rows and columns due to cyclic or north fold boundary condition as well as |
---|
[6713] | 246 | overlap MPP areas). The self-compensated summation method should be used in all summation |
---|
| 247 | in i- and/or j-direction. See closea.F90 module for an example. |
---|
| 248 | Note also that this implementation may be sensitive to the optimization level. |
---|
[2541] | 249 | |
---|
[3294] | 250 | \subsection{MPP scalability} |
---|
| 251 | \label{MISC_mppsca} |
---|
[2541] | 252 | |
---|
[3294] | 253 | The default method of communicating values across the north-fold in distributed memory applications |
---|
| 254 | (\key{mpp\_mpi}) uses a \textsc{MPI\_ALLGATHER} function to exchange values from each processing |
---|
| 255 | region in the northern row with every other processing region in the northern row. This enables a |
---|
| 256 | global width array containing the top 4 rows to be collated on every northern row processor and then |
---|
| 257 | folded with a simple algorithm. Although conceptually simple, this "All to All" communication will |
---|
| 258 | hamper performance scalability for large numbers of northern row processors. From version 3.4 |
---|
| 259 | onwards an alternative method is available which only performs direct "Peer to Peer" communications |
---|
| 260 | between each processor and its immediate "neighbours" across the fold line. This is achieved by |
---|
| 261 | using the default \textsc{MPI\_ALLGATHER} method during initialisation to help identify the "active" |
---|
| 262 | neighbours. Stored lists of these neighbours are then used in all subsequent north-fold exchanges to |
---|
| 263 | restrict exchanges to those between associated regions. The collated global width array for each |
---|
| 264 | region is thus only partially filled but is guaranteed to be set at all the locations actually |
---|
| 265 | required by each individual for the fold operation. This alternative method should give identical |
---|
| 266 | results to the default \textsc{ALLGATHER} method and is recommended for large values of \np{jpni}. |
---|
| 267 | The new method is activated by setting \np{ln\_nnogather} to be true ({\bf nammpp}). The |
---|
| 268 | reproducibility of results using the two methods should be confirmed for each new, non-reference |
---|
| 269 | configuration. |
---|
| 270 | |
---|
[2541] | 271 | % ================================================================ |
---|
[707] | 272 | % Model optimisation, Control Print and Benchmark |
---|
| 273 | % ================================================================ |
---|
[994] | 274 | \section{Model Optimisation, Control Print and Benchmark} |
---|
[707] | 275 | \label{MISC_opt} |
---|
[1225] | 276 | %--------------------------------------------namctl------------------------------------------------------- |
---|
| 277 | \namdisplay{namctl} |
---|
[707] | 278 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 279 | |
---|
[2541] | 280 | \gmcomment{why not make these bullets into subsections?} |
---|
[4147] | 281 | Options are defined through the \ngn{namctl} namelist variables. |
---|
[707] | 282 | |
---|
[2349] | 283 | $\bullet$ Vector optimisation: |
---|
[707] | 284 | |
---|
[2282] | 285 | \key{vectopt\_loop} enables the internal loops to collapse. This is very |
---|
| 286 | a very efficient way to increase the length of vector calculations and thus |
---|
| 287 | to speed up the model on vector computers. |
---|
[707] | 288 | |
---|
[994] | 289 | % Add here also one word on NPROMA technique that has been found useless, since compiler have made significant progress during the last decade. |
---|
[707] | 290 | |
---|
[994] | 291 | % Add also one word on NEC specific optimisation (Novercheck option for example) |
---|
[707] | 292 | |
---|
[994] | 293 | $\bullet$ Control print %: describe here 4 things: |
---|
[707] | 294 | |
---|
[994] | 295 | 1- \np{ln\_ctl} : compute and print the trends averaged over the interior domain |
---|
| 296 | in all TRA, DYN, LDF and ZDF modules. This option is very helpful when |
---|
| 297 | diagnosing the origin of an undesired change in model results. |
---|
[707] | 298 | |
---|
[994] | 299 | 2- also \np{ln\_ctl} but using the nictl and njctl namelist parameters to check |
---|
| 300 | the source of differences between mono and multi processor runs. |
---|
[707] | 301 | |
---|
[6289] | 302 | %%gm to be removed both here and in the code |
---|
| 303 | 3- last digit comparison (\np{nn\_bit\_cmp}). In an MPP simulation, the computation of |
---|
[994] | 304 | a sum over the whole domain is performed as the summation over all processors of |
---|
| 305 | each of their sums over their interior domains. This double sum never gives exactly |
---|
| 306 | the same result as a single sum over the whole domain, due to truncation differences. |
---|
| 307 | The "bit comparison" option has been introduced in order to be able to check that |
---|
| 308 | mono-processor and multi-processor runs give exactly the same results. |
---|
[2376] | 309 | %THIS is to be updated with the mpp_sum_glo introduced in v3.3 |
---|
| 310 | % nn_bit_cmp today only check that the nn_cla = 0 (no cross land advection) |
---|
[6289] | 311 | %%gm end |
---|
[707] | 312 | |
---|
[2282] | 313 | $\bullet$ Benchmark (\np{nn\_bench}). This option defines a benchmark run based on |
---|
[2376] | 314 | a GYRE configuration (see \S\ref{CFG_gyre}) in which the resolution remains the same |
---|
| 315 | whatever the domain size. This allows a very large model domain to be used, just by |
---|
| 316 | changing the domain size (\jp{jpiglo}, \jp{jpjglo}) and without adjusting either the time-step |
---|
| 317 | or the physical parameterisations. |
---|
[707] | 318 | |
---|
| 319 | % ================================================================ |
---|
| 320 | |
---|
[2282] | 321 | |
---|
[707] | 322 | |
---|
| 323 | |
---|
[2282] | 324 | |
---|