[707] | 1 | % ================================================================ |
---|
[4147] | 2 | % Chapter � Miscellaneous Topics |
---|
[707] | 3 | % ================================================================ |
---|
[2282] | 4 | \chapter{Miscellaneous Topics} |
---|
[707] | 5 | \label{MISC} |
---|
| 6 | \minitoc |
---|
| 7 | |
---|
[2282] | 8 | \newpage |
---|
| 9 | $\ $\newline % force a new ligne |
---|
| 10 | |
---|
[707] | 11 | % ================================================================ |
---|
| 12 | % Representation of Unresolved Straits |
---|
| 13 | % ================================================================ |
---|
| 14 | \section{Representation of Unresolved Straits} |
---|
| 15 | \label{MISC_strait} |
---|
| 16 | |
---|
[2282] | 17 | In climate modeling, it often occurs that a crucial connections between water masses |
---|
| 18 | is broken as the grid mesh is too coarse to resolve narrow straits. For example, coarse |
---|
| 19 | grid spacing typically closes off the Mediterranean from the Atlantic at the Strait of |
---|
| 20 | Gibraltar. In this case, it is important for climate models to include the effects of salty |
---|
| 21 | water entering the Atlantic from the Mediterranean. Likewise, it is important for the |
---|
| 22 | Mediterranean to replenish its supply of water from the Atlantic to balance the net |
---|
| 23 | evaporation occurring over the Mediterranean region. This problem occurs even in |
---|
[2349] | 24 | eddy permitting simulations. For example, in ORCA 1/4\deg several straits of the Indonesian |
---|
[2282] | 25 | archipelago (Ombai, Lombok...) are much narrow than even a single ocean grid-point. |
---|
[707] | 26 | |
---|
[2282] | 27 | We describe briefly here the three methods that can be used in \NEMO to handle |
---|
| 28 | such improperly resolved straits. The first two consist of opening the strait by hand |
---|
| 29 | while ensuring that the mass exchanges through the strait are not too large by |
---|
| 30 | either artificially reducing the surface of the strait grid-cells or, locally increasing |
---|
| 31 | the lateral friction. In the third one, the strait is closed but exchanges of mass, |
---|
| 32 | heat and salt across the land are allowed. |
---|
| 33 | Note that such modifications are so specific to a given configuration that no attempt |
---|
| 34 | has been made to set them in a generic way. However, examples of how |
---|
[4147] | 35 | they can be set up is given in the ORCA 2\deg and 0.5\deg configurations. For example, |
---|
[6436] | 36 | for details of implementation in ORCA2, search: |
---|
| 37 | \texttt{ IF( cp\_cfg == "orca" .AND. jp\_cfg == 2 ) } |
---|
[707] | 38 | |
---|
| 39 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 40 | % Hand made geometry changes |
---|
| 41 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 42 | \subsection{Hand made geometry changes} |
---|
| 43 | \label{MISC_strait_hand} |
---|
| 44 | |
---|
[2282] | 45 | $\bullet$ reduced scale factor in the cross-strait direction to a value in better agreement |
---|
| 46 | with the true mean width of the strait. (Fig.~\ref{Fig_MISC_strait_hand}). |
---|
| 47 | This technique is sometime called "partially open face" or "partially closed cells". |
---|
| 48 | The key issue here is only to reduce the faces of $T$-cell ($i.e.$ change the value |
---|
| 49 | of the horizontal scale factors at $u$- or $v$-point) but not the volume of the $T$-cell. |
---|
| 50 | Indeed, reducing the volume of strait $T$-cell can easily produce a numerical |
---|
| 51 | instability at that grid point that would require a reduction of the model time step. |
---|
| 52 | The changes associated with strait management are done in \mdl{domhgr}, |
---|
| 53 | just after the definition or reading of the horizontal scale factors. |
---|
[707] | 54 | |
---|
[2282] | 55 | $\bullet$ increase of the viscous boundary layer thickness by local increase of the |
---|
| 56 | fmask value at the coast (Fig.~\ref{Fig_MISC_strait_hand}). This is done in |
---|
| 57 | \mdl{dommsk} together with the setting of the coastal value of fmask |
---|
| 58 | (see Section \ref{LBC_coast}) |
---|
[994] | 59 | |
---|
[707] | 60 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[2376] | 61 | \begin{figure}[!tbp] \begin{center} |
---|
[998] | 62 | \includegraphics[width=0.80\textwidth]{./TexFiles/Figures/Fig_Gibraltar.pdf} |
---|
| 63 | \includegraphics[width=0.80\textwidth]{./TexFiles/Figures/Fig_Gibraltar2.pdf} |
---|
[2376] | 64 | \caption{ \label{Fig_MISC_strait_hand} |
---|
| 65 | Example of the Gibraltar strait defined in a $1\deg \times 1\deg$ mesh. |
---|
[994] | 66 | \textit{Top}: using partially open cells. The meridional scale factor at $v$-point |
---|
| 67 | is reduced on both sides of the strait to account for the real width of the strait |
---|
[1224] | 68 | (about 20 km). Note that the scale factors of the strait $T$-point remains unchanged. |
---|
| 69 | \textit{Bottom}: using viscous boundary layers. The four fmask parameters |
---|
[994] | 70 | along the strait coastlines are set to a value larger than 4, $i.e.$ "strong" no-slip |
---|
| 71 | case (see Fig.\ref{Fig_LBC_shlat}) creating a large viscous boundary layer |
---|
| 72 | that allows a reduced transport through the strait.} |
---|
[707] | 73 | \end{center} \end{figure} |
---|
| 74 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 75 | |
---|
| 76 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 77 | % Cross Land Advection |
---|
| 78 | % ------------------------------------------------------------------------------------------------------------- |
---|
[2282] | 79 | \subsection{Cross Land Advection (\mdl{tracla})} |
---|
[707] | 80 | \label{MISC_strait_cla} |
---|
[1225] | 81 | %--------------------------------------------namcla-------------------------------------------------------- |
---|
| 82 | \namdisplay{namcla} |
---|
[707] | 83 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 84 | |
---|
[4147] | 85 | Options are defined through the \ngn{namcla} namelist variables. |
---|
[6436] | 86 | This option is an obsolescent feature that will be removed in version 3.7 and followings. |
---|
[707] | 87 | |
---|
[2282] | 88 | %The problem is resolved here by allowing the mixing of tracers and mass/volume between non-adjacent water columns at nominated regions within the model. Momentum is not mixed. The scheme conserves total tracer content, and total volume (the latter in $z*$- or $s*$-coordinate), and maintains compatibility between the tracer and mass/volume budgets. |
---|
| 89 | |
---|
[707] | 90 | % ================================================================ |
---|
| 91 | % Closed seas |
---|
| 92 | % ================================================================ |
---|
[2282] | 93 | \section{Closed seas (\mdl{closea})} |
---|
[707] | 94 | \label{MISC_closea} |
---|
| 95 | |
---|
[2282] | 96 | \colorbox{yellow}{Add here a short description of the way closed seas are managed} |
---|
[707] | 97 | |
---|
[2282] | 98 | |
---|
[707] | 99 | % ================================================================ |
---|
| 100 | % Sub-Domain Functionality (\textit{nizoom, njzoom}, namelist parameters) |
---|
| 101 | % ================================================================ |
---|
[4147] | 102 | \section{Sub-Domain Functionality (\np{jpizoom}, \np{jpjzoom})} |
---|
[707] | 103 | \label{MISC_zoom} |
---|
| 104 | |
---|
[994] | 105 | The sub-domain functionality, also improperly called the zoom option |
---|
| 106 | (improperly because it is not associated with a change in model resolution) |
---|
| 107 | is a quite simple function that allows a simulation over a sub-domain of an |
---|
| 108 | already defined configuration ($i.e.$ without defining a new mesh, initial |
---|
| 109 | state and forcings). This option can be useful for testing the user settings |
---|
| 110 | of surface boundary conditions, or the initial ocean state of a huge ocean |
---|
| 111 | model configuration while having a small computer memory requirement. |
---|
| 112 | It can also be used to easily test specific physics in a sub-domain (for example, |
---|
[2282] | 113 | see \citep{Madec_al_JPO96} for a test of the coupling used in the global ocean |
---|
[994] | 114 | version of OPA between sea-ice and ocean model over the Arctic or Antarctic |
---|
| 115 | ocean, using a sub-domain). In the standard model, this option does not |
---|
| 116 | include any specific treatment for the ocean boundaries of the sub-domain: |
---|
| 117 | they are considered as artificial vertical walls. Nevertheless, it is quite easy |
---|
| 118 | to add a restoring term toward a climatology in the vicinity of such boundaries |
---|
| 119 | (see \S\ref{TRA_dmp}). |
---|
[707] | 120 | |
---|
| 121 | In order to easily define a sub-domain over which the computation can be |
---|
| 122 | performed, the dimension of all input arrays (ocean mesh, bathymetry, |
---|
[4147] | 123 | forcing, initial state, ...) are defined as \np{jpidta}, \np{jpjdta} and \np{jpkdta} |
---|
| 124 | ( in \ngn{namcfg} namelist), while the computational domain is defined through |
---|
| 125 | \np{jpiglo}, \np{jpjglo} and \jp{jpk} (\ngn{namcfg} namelist). When running the |
---|
| 126 | model over the whole domain, the user sets \np{jpiglo}=\np{jpidta} \np{jpjglo}=\np{jpjdta} |
---|
[994] | 127 | and \jp{jpk}=\jp{jpkdta}. When running the model over a sub-domain, the user |
---|
[4147] | 128 | has to provide the size of the sub-domain, (\np{jpiglo}, \np{jpjglo}, \np{jpkglo}), |
---|
| 129 | and the indices of the south western corner as \np{jpizoom} and \np{jpjzoom} in |
---|
| 130 | the \ngn{namcfg} namelist (Fig.~\ref{Fig_LBC_zoom}). |
---|
[707] | 131 | |
---|
[994] | 132 | Note that a third set of dimensions exist, \jp{jpi}, \jp{jpj} and \jp{jpk} which is |
---|
[4147] | 133 | actually used to perform the computation. It is set by default to \jp{jpi}=\np{jpjglo} |
---|
| 134 | and \jp{jpj}=\np{jpjglo}, except for massively parallel computing where the |
---|
[994] | 135 | computational domain is laid out on local processor memories following a 2D |
---|
| 136 | horizontal splitting. % (see {\S}IV.2-c) ref to the section to be updated |
---|
[707] | 137 | |
---|
[5118] | 138 | \subsection{Simple subsetting of input files via netCDF attributes} |
---|
| 139 | |
---|
| 140 | The extended grids for use with the under-shelf ice cavities will result in redundant rows |
---|
| 141 | around Antarctica if the ice cavities are not active. A simple mechanism for subsetting |
---|
| 142 | input files associated with the extended domains has been implemented to avoid the need to |
---|
| 143 | maintain different sets of input fields for use with or without active ice cavities. The |
---|
| 144 | existing 'zoom' options are overly complex for this task and marked for deletion anyway. |
---|
| 145 | This alternative subsetting operates for the j-direction only and works by optionally |
---|
| 146 | looking for and using a global file attribute (named: \np{open\_ocean\_jstart}) to |
---|
| 147 | determine the starting j-row for input. The use of this option is best explained with an |
---|
| 148 | example: Consider an ORCA1 configuration using the extended grid bathymetry and coordinate |
---|
| 149 | files: |
---|
| 150 | \vspace{-10pt} |
---|
| 151 | \begin{alltt} |
---|
| 152 | \tiny |
---|
| 153 | \begin{verbatim} |
---|
| 154 | eORCA1_bathymetry_v2.nc |
---|
| 155 | eORCA1_coordinates.nc |
---|
| 156 | \end{verbatim} |
---|
| 157 | \end{alltt} |
---|
| 158 | \noindent These files define a horizontal domain of 362x332. Assuming the first row with |
---|
| 159 | open ocean wet points in the non-isf bathymetry for this set is row 42 (Fortran indexing) |
---|
| 160 | then the formally correct setting for \np{open\_ocean\_jstart} is 41. Using this value as the |
---|
| 161 | first row to be read will result in a 362x292 domain which is the same size as the original |
---|
| 162 | ORCA1 domain. Thus the extended coordinates and bathymetry files can be used with all the |
---|
| 163 | original input files for ORCA1 if the ice cavities are not active (\np{ln\_isfcav = |
---|
| 164 | .false.}). Full instructions for achieving this are: |
---|
| 165 | |
---|
| 166 | \noindent Add the new attribute to any input files requiring a j-row offset, i.e: |
---|
| 167 | \vspace{-10pt} |
---|
| 168 | \begin{alltt} |
---|
| 169 | \tiny |
---|
| 170 | \begin{verbatim} |
---|
| 171 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_coordinates.nc |
---|
| 172 | ncatted -a open_ocean_jstart,global,a,d,41 eORCA1_bathymetry_v2.nc |
---|
| 173 | \end{verbatim} |
---|
| 174 | \end{alltt} |
---|
| 175 | |
---|
| 176 | \noindent Add the logical switch to \ngn{namcfg} in the configuration namelist and set true: |
---|
| 177 | %--------------------------------------------namcfg-------------------------------------------------------- |
---|
| 178 | \namdisplay{namcfg_orca1} |
---|
| 179 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 180 | |
---|
| 181 | \noindent Note the j-size of the global domain is the (extended j-size minus |
---|
| 182 | \np{open\_ocean\_jstart} + 1 ) and this must match the size of all datasets other than |
---|
| 183 | bathymetry and coordinates currently. However the option can be extended to any global, 2D |
---|
| 184 | and 3D, netcdf, input field by adding the: |
---|
| 185 | \vspace{-10pt} |
---|
| 186 | \begin{alltt} |
---|
| 187 | \tiny |
---|
| 188 | \begin{verbatim} |
---|
| 189 | lrowattr=ln_use_jattr |
---|
| 190 | \end{verbatim} |
---|
| 191 | \end{alltt} |
---|
| 192 | optional argument to the appropriate \np{iom\_get} call and the \np{open\_ocean\_jstart} attribute to the corresponding input files. It remains the users responsibility to set \np{jpjdta} and \np{jpjglo} values in the \np{namelist\_cfg} file according to their needs. |
---|
| 193 | |
---|
[707] | 194 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
[2376] | 195 | \begin{figure}[!ht] \begin{center} |
---|
[998] | 196 | \includegraphics[width=0.90\textwidth]{./TexFiles/Figures/Fig_LBC_zoom.pdf} |
---|
[2376] | 197 | \caption{ \label{Fig_LBC_zoom} |
---|
| 198 | Position of a model domain compared to the data input domain when the zoom functionality is used.} |
---|
[707] | 199 | \end{center} \end{figure} |
---|
| 200 | %>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
---|
| 201 | |
---|
| 202 | |
---|
| 203 | % ================================================================ |
---|
| 204 | % Accelerating the Convergence |
---|
| 205 | % ================================================================ |
---|
| 206 | \section{Accelerating the Convergence (\np{nn\_acc} = 1)} |
---|
| 207 | \label{MISC_acc} |
---|
| 208 | %--------------------------------------------namdom------------------------------------------------------- |
---|
| 209 | \namdisplay{namdom} |
---|
| 210 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 211 | |
---|
[2282] | 212 | Searching an equilibrium state with an global ocean model requires a very long time |
---|
| 213 | integration period (a few thousand years for a global model). Due to the size of |
---|
| 214 | the time step required for numerical stability (less than a few hours), |
---|
| 215 | this usually requires a large elapsed time. In order to overcome this problem, |
---|
| 216 | \citet{Bryan1984} introduces a technique that is intended to accelerate |
---|
[994] | 217 | the spin up to equilibrium. It uses a larger time step in |
---|
[2282] | 218 | the tracer evolution equations than in the momentum evolution |
---|
[707] | 219 | equations. It does not affect the equilibrium solution but modifies the |
---|
| 220 | trajectory to reach it. |
---|
| 221 | |
---|
[4147] | 222 | Options are defined through the \ngn{namdom} namelist variables. |
---|
[994] | 223 | The acceleration of convergence option is used when \np{nn\_acc}=1. In that case, |
---|
[2282] | 224 | $\rdt=rn\_rdt$ is the time step of dynamics while $\widetilde{\rdt}=rdttra$ is the |
---|
| 225 | tracer time-step. the former is set from the \np{rn\_rdt} namelist parameter while the latter |
---|
| 226 | is computed using a hyperbolic tangent profile and the following namelist parameters : |
---|
| 227 | \np{rn\_rdtmin}, \np{rn\_rdtmax} and \np{rn\_rdth}. Those three parameters correspond |
---|
| 228 | to the surface value the deep ocean value and the depth at which the transition occurs, respectively. |
---|
| 229 | The set of prognostic equations to solve becomes: |
---|
[707] | 230 | \begin{equation} \label{Eq_acc} |
---|
| 231 | \begin{split} |
---|
| 232 | \frac{\partial \textbf{U}_h }{\partial t} |
---|
[2282] | 233 | &\equiv \frac{\textbf{U}_h ^{t+1}-\textbf{U}_h^{t-1} }{2\rdt} = \ldots \\ |
---|
| 234 | \frac{\partial T}{\partial t} &\equiv \frac{T^{t+1}-T^{t-1}}{2 \widetilde{\rdt}} = \ldots \\ |
---|
| 235 | \frac{\partial S}{\partial t} &\equiv \frac{S^{t+1} -S^{t-1}}{2 \widetilde{\rdt}} = \ldots \\ |
---|
[707] | 236 | \end{split} |
---|
| 237 | \end{equation} |
---|
| 238 | |
---|
[994] | 239 | \citet{Bryan1984} has examined the consequences of this distorted physics. |
---|
| 240 | Free waves have a slower phase speed, their meridional structure is slightly |
---|
[707] | 241 | modified, and the growth rate of baroclinically unstable waves is reduced |
---|
[994] | 242 | but with a wider range of instability. This technique is efficient for |
---|
| 243 | searching for an equilibrium state in coarse resolution models. However its |
---|
[707] | 244 | application is not suitable for many oceanic problems: it cannot be used for |
---|
| 245 | transient or time evolving problems (in particular, it is very questionable |
---|
[994] | 246 | to use this technique when there is a seasonal cycle in the forcing fields), |
---|
[707] | 247 | and it cannot be used in high-resolution models where baroclinically |
---|
| 248 | unstable processes are important. Moreover, the vertical variation of |
---|
[2282] | 249 | $\widetilde{ \rdt}$ implies that the heat and salt contents are no longer |
---|
[707] | 250 | conserved due to the vertical coupling of the ocean level through both |
---|
[2282] | 251 | advection and diffusion. Therefore \np{rn\_rdtmin} = \np{rn\_rdtmax} should be |
---|
| 252 | a more clever choice. |
---|
[707] | 253 | |
---|
[2541] | 254 | |
---|
[707] | 255 | % ================================================================ |
---|
[2541] | 256 | % Accuracy and Reproducibility |
---|
| 257 | % ================================================================ |
---|
| 258 | \section{Accuracy and Reproducibility (\mdl{lib\_fortran})} |
---|
| 259 | \label{MISC_fortran} |
---|
| 260 | |
---|
| 261 | \subsection{Issues with intrinsinc SIGN function (\key{nosignedzero})} |
---|
| 262 | \label{MISC_sign} |
---|
| 263 | |
---|
| 264 | The SIGN(A, B) is the \textsc {Fortran} intrinsic function delivers the magnitude |
---|
| 265 | of A with the sign of B. For example, SIGN(-3.0,2.0) has the value 3.0. |
---|
| 266 | The problematic case is when the second argument is zero, because, on platforms |
---|
| 267 | that support IEEE arithmetic, zero is actually a signed number. |
---|
| 268 | There is a positive zero and a negative zero. |
---|
| 269 | |
---|
| 270 | In \textsc{Fortran}~90, the processor was required always to deliver a positive result for SIGN(A, B) |
---|
| 271 | if B was zero. Nevertheless, in \textsc{Fortran}~95, the processor is allowed to do the correct thing |
---|
| 272 | and deliver ABS(A) when B is a positive zero and -ABS(A) when B is a negative zero. |
---|
| 273 | This change in the specification becomes apparent only when B is of type real, and is zero, |
---|
| 274 | and the processor is capable of distinguishing between positive and negative zero, |
---|
| 275 | and B is negative real zero. Then SIGN delivers a negative result where, under \textsc{Fortran}~90 |
---|
| 276 | rules, it used to return a positive result. |
---|
| 277 | This change may be especially sensitive for the ice model, so we overwrite the intrinsinc |
---|
| 278 | function with our own function simply performing : \\ |
---|
| 279 | \verb? IF( B >= 0.e0 ) THEN ; SIGN(A,B) = ABS(A) ? \\ |
---|
| 280 | \verb? ELSE ; SIGN(A,B) =-ABS(A) ? \\ |
---|
| 281 | \verb? ENDIF ? \\ |
---|
| 282 | This feature can be found in \mdl{lib\_fortran} module and is effective when \key{nosignedzero} |
---|
| 283 | is defined. We use a CPP key as the overwritting of a intrinsic function can present |
---|
| 284 | performance issues with some computers/compilers. |
---|
| 285 | |
---|
| 286 | |
---|
| 287 | \subsection{MPP reproducibility} |
---|
| 288 | \label{MISC_glosum} |
---|
| 289 | |
---|
| 290 | The numerical reproducibility of simulations on distributed memory parallel computers |
---|
| 291 | is a critical issue. In particular, within NEMO global summation of distributed arrays |
---|
| 292 | is most susceptible to rounding errors, and their propagation and accumulation cause |
---|
| 293 | uncertainty in final simulation reproducibility on different numbers of processors. |
---|
| 294 | To avoid so, based on \citet{He_Ding_JSC01} review of different technics, |
---|
| 295 | we use a so called self-compensated summation method. The idea is to estimate |
---|
| 296 | the roundoff error, store it in a buffer, and then add it back in the next addition. |
---|
| 297 | |
---|
| 298 | Suppose we need to calculate $b = a_1 + a_2 + a_3$. The following algorithm |
---|
| 299 | will allow to split the sum in two ($sum_1 = a_{1} + a_{2}$ and $b = sum_2 = sum_1 + a_3$) |
---|
| 300 | with exactly the same rounding errors as the sum performed all at once. |
---|
| 301 | \begin{align*} |
---|
| 302 | sum_1 \ \ &= a_1 + a_2 \\ |
---|
| 303 | error_1 &= a_2 + ( a_1 - sum_1 ) \\ |
---|
| 304 | sum_2 \ \ &= sum_1 + a_3 + error_1 \\ |
---|
| 305 | error_2 &= a_3 + error_1 + ( sum_1 - sum_2 ) \\ |
---|
| 306 | b \qquad \ &= sum_2 \\ |
---|
| 307 | \end{align*} |
---|
| 308 | This feature can be found in \mdl{lib\_fortran} module and is effective when \key{mpp\_rep}. |
---|
| 309 | In that case, all calls to glob\_sum function (summation over the entire basin excluding |
---|
| 310 | duplicated rows and columns due to cyclic or north fold boundary condition as well as |
---|
| 311 | overlap MPP areas). |
---|
| 312 | Note this implementation may be sensitive to the optimization level. |
---|
| 313 | |
---|
[3294] | 314 | \subsection{MPP scalability} |
---|
| 315 | \label{MISC_mppsca} |
---|
[2541] | 316 | |
---|
[3294] | 317 | The default method of communicating values across the north-fold in distributed memory applications |
---|
| 318 | (\key{mpp\_mpi}) uses a \textsc{MPI\_ALLGATHER} function to exchange values from each processing |
---|
| 319 | region in the northern row with every other processing region in the northern row. This enables a |
---|
| 320 | global width array containing the top 4 rows to be collated on every northern row processor and then |
---|
| 321 | folded with a simple algorithm. Although conceptually simple, this "All to All" communication will |
---|
| 322 | hamper performance scalability for large numbers of northern row processors. From version 3.4 |
---|
| 323 | onwards an alternative method is available which only performs direct "Peer to Peer" communications |
---|
| 324 | between each processor and its immediate "neighbours" across the fold line. This is achieved by |
---|
| 325 | using the default \textsc{MPI\_ALLGATHER} method during initialisation to help identify the "active" |
---|
| 326 | neighbours. Stored lists of these neighbours are then used in all subsequent north-fold exchanges to |
---|
| 327 | restrict exchanges to those between associated regions. The collated global width array for each |
---|
| 328 | region is thus only partially filled but is guaranteed to be set at all the locations actually |
---|
| 329 | required by each individual for the fold operation. This alternative method should give identical |
---|
| 330 | results to the default \textsc{ALLGATHER} method and is recommended for large values of \np{jpni}. |
---|
| 331 | The new method is activated by setting \np{ln\_nnogather} to be true ({\bf nammpp}). The |
---|
| 332 | reproducibility of results using the two methods should be confirmed for each new, non-reference |
---|
| 333 | configuration. |
---|
| 334 | |
---|
[2541] | 335 | % ================================================================ |
---|
[707] | 336 | % Model optimisation, Control Print and Benchmark |
---|
| 337 | % ================================================================ |
---|
[994] | 338 | \section{Model Optimisation, Control Print and Benchmark} |
---|
[707] | 339 | \label{MISC_opt} |
---|
[1225] | 340 | %--------------------------------------------namctl------------------------------------------------------- |
---|
| 341 | \namdisplay{namctl} |
---|
[707] | 342 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 343 | |
---|
[2541] | 344 | \gmcomment{why not make these bullets into subsections?} |
---|
[4147] | 345 | Options are defined through the \ngn{namctl} namelist variables. |
---|
[707] | 346 | |
---|
[2349] | 347 | $\bullet$ Vector optimisation: |
---|
[707] | 348 | |
---|
[2282] | 349 | \key{vectopt\_loop} enables the internal loops to collapse. This is very |
---|
| 350 | a very efficient way to increase the length of vector calculations and thus |
---|
| 351 | to speed up the model on vector computers. |
---|
[707] | 352 | |
---|
[994] | 353 | % Add here also one word on NPROMA technique that has been found useless, since compiler have made significant progress during the last decade. |
---|
[707] | 354 | |
---|
[994] | 355 | % Add also one word on NEC specific optimisation (Novercheck option for example) |
---|
[707] | 356 | |
---|
[994] | 357 | $\bullet$ Control print %: describe here 4 things: |
---|
[707] | 358 | |
---|
[994] | 359 | 1- \np{ln\_ctl} : compute and print the trends averaged over the interior domain |
---|
| 360 | in all TRA, DYN, LDF and ZDF modules. This option is very helpful when |
---|
| 361 | diagnosing the origin of an undesired change in model results. |
---|
[707] | 362 | |
---|
[994] | 363 | 2- also \np{ln\_ctl} but using the nictl and njctl namelist parameters to check |
---|
| 364 | the source of differences between mono and multi processor runs. |
---|
[707] | 365 | |
---|
[994] | 366 | 3- \key{esopa} (to be rename key\_nemo) : which is another option for model |
---|
| 367 | management. When defined, this key forces the activation of all options and |
---|
| 368 | CPP keys. For example, all tracer and momentum advection schemes are called! |
---|
[2282] | 369 | Therefore the model results have no physical meaning. |
---|
[994] | 370 | However, this option forces both the compiler and the model to run through |
---|
| 371 | all the \textsc{Fortran} lines of the model. This allows the user to check for obvious |
---|
| 372 | compilation or execution errors with all CPP options, and errors in namelist options. |
---|
[707] | 373 | |
---|
[2282] | 374 | 4- last digit comparison (\np{nn\_bit\_cmp}). In an MPP simulation, the computation of |
---|
[994] | 375 | a sum over the whole domain is performed as the summation over all processors of |
---|
| 376 | each of their sums over their interior domains. This double sum never gives exactly |
---|
| 377 | the same result as a single sum over the whole domain, due to truncation differences. |
---|
| 378 | The "bit comparison" option has been introduced in order to be able to check that |
---|
| 379 | mono-processor and multi-processor runs give exactly the same results. |
---|
[2376] | 380 | %THIS is to be updated with the mpp_sum_glo introduced in v3.3 |
---|
| 381 | % nn_bit_cmp today only check that the nn_cla = 0 (no cross land advection) |
---|
[707] | 382 | |
---|
[2282] | 383 | $\bullet$ Benchmark (\np{nn\_bench}). This option defines a benchmark run based on |
---|
[2376] | 384 | a GYRE configuration (see \S\ref{CFG_gyre}) in which the resolution remains the same |
---|
| 385 | whatever the domain size. This allows a very large model domain to be used, just by |
---|
| 386 | changing the domain size (\jp{jpiglo}, \jp{jpjglo}) and without adjusting either the time-step |
---|
| 387 | or the physical parameterisations. |
---|
[707] | 388 | |
---|
| 389 | |
---|
| 390 | % ================================================================ |
---|
| 391 | % Elliptic solvers (SOL) |
---|
| 392 | % ================================================================ |
---|
[2282] | 393 | \section{Elliptic solvers (SOL)} |
---|
[707] | 394 | \label{MISC_sol} |
---|
| 395 | %--------------------------------------------namdom------------------------------------------------------- |
---|
| 396 | \namdisplay{namsol} |
---|
| 397 | %-------------------------------------------------------------------------------------------------------------- |
---|
| 398 | |
---|
[2282] | 399 | When the filtered sea surface height option is used, the surface pressure gradient is |
---|
| 400 | computed in \mdl{dynspg\_flt}. The force added in the momentum equation is solved implicitely. |
---|
| 401 | It is thus solution of an elliptic equation \eqref{Eq_PE_flt} for which two solvers are available: |
---|
| 402 | a Successive-Over-Relaxation scheme (SOR) and a preconditioned conjugate gradient |
---|
| 403 | scheme(PCG) \citep{Madec_al_OM88, Madec_PhD90}. The solver is selected trough the |
---|
[4147] | 404 | the value of \np{nn\_solv} \ngn{namsol} namelist variable. |
---|
[2282] | 405 | |
---|
[994] | 406 | The PCG is a very efficient method for solving elliptic equations on vector computers. |
---|
| 407 | It is a fast and rather easy method to use; which are attractive features for a large |
---|
| 408 | number of ocean situations (variable bottom topography, complex coastal geometry, |
---|
[2541] | 409 | variable grid spacing, open or cyclic boundaries, etc ...). It does not require |
---|
[994] | 410 | a search for an optimal parameter as in the SOR method. However, the SOR has |
---|
| 411 | been retained because it is a linear solver, which is a very useful property when |
---|
[2282] | 412 | using the adjoint model of \NEMO. |
---|
[707] | 413 | |
---|
[2282] | 414 | At each time step, the time derivative of the sea surface height at time step $t+1$ |
---|
| 415 | (or equivalently the divergence of the \textit{after} barotropic transport) that appears |
---|
| 416 | in the filtering forced is the solution of the elliptic equation obtained from the horizontal |
---|
| 417 | divergence of the vertical summation of \eqref{Eq_PE_flt}. |
---|
| 418 | Introducing the following coefficients: |
---|
| 419 | \begin{equation} \label{Eq_sol_matrix} |
---|
[707] | 420 | \begin{aligned} |
---|
[2282] | 421 | &c_{i,j}^{NS} &&= {2 \rdt }^2 \; \frac{H_v (i,j) \; e_{1v} (i,j)}{e_{2v}(i,j)} \\ |
---|
| 422 | &c_{i,j}^{EW} &&= {2 \rdt }^2 \; \frac{H_u (i,j) \; e_{2u} (i,j)}{e_{1u}(i,j)} \\ |
---|
| 423 | &b_{i,j} &&= \delta_i \left[ e_{2u}M_u \right] - \delta_j \left[ e_{1v}M_v \right]\ , \\ |
---|
[707] | 424 | \end{aligned} |
---|
| 425 | \end{equation} |
---|
[2541] | 426 | the resulting five-point finite difference equation is given by: |
---|
[2282] | 427 | \begin{equation} \label{Eq_solmat} |
---|
| 428 | \begin{split} |
---|
| 429 | c_{i+1,j}^{NS} D_{i+1,j} + \; c_{i,j+1}^{EW} D_{i,j+1} |
---|
| 430 | + c_{i,j} ^{NS} D_{i-1,j} + \; c_{i,j} ^{EW} D_{i,j-1} & \\ |
---|
| 431 | - \left(c_{i+1,j}^{NS} + c_{i,j+1}^{EW} + c_{i,j}^{NS} + c_{i,j}^{EW} \right) D_{i,j} &= b_{i,j} |
---|
| 432 | \end{split} |
---|
| 433 | \end{equation} |
---|
[996] | 434 | \eqref{Eq_solmat} is a linear symmetric system of equations. All the elements of |
---|
| 435 | the corresponding matrix \textbf{A} vanish except those of five diagonals. With |
---|
[707] | 436 | the natural ordering of the grid points (i.e. from west to east and from |
---|
| 437 | south to north), the structure of \textbf{A} is block-tridiagonal with |
---|
| 438 | tridiagonal or diagonal blocks. \textbf{A} is a positive-definite symmetric |
---|
| 439 | matrix of size $(jpi \cdot jpj)^2$, and \textbf{B}, the right hand side of |
---|
| 440 | \eqref{Eq_solmat}, is a vector. |
---|
| 441 | |
---|
[2282] | 442 | Note that in the linear free surface case, the depth that appears in \eqref{Eq_sol_matrix} |
---|
| 443 | does not vary with time, and thus the matrix can be computed once for all. In non-linear free surface |
---|
| 444 | (\key{vvl} defined) the matrix have to be updated at each time step. |
---|
| 445 | |
---|
[707] | 446 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 447 | % Successive Over Relaxation |
---|
| 448 | % ------------------------------------------------------------------------------------------------------------- |
---|
[2282] | 449 | \subsection{Successive Over Relaxation (\np{nn\_solv}=2, \mdl{solsor})} |
---|
[707] | 450 | \label{MISC_solsor} |
---|
| 451 | |
---|
[2282] | 452 | Let us introduce the four cardinal coefficients: |
---|
| 453 | \begin{align*} |
---|
| 454 | a_{i,j}^S &= c_{i,j }^{NS}/d_{i,j} &\qquad a_{i,j}^W &= c_{i,j}^{EW}/d_{i,j} \\ |
---|
| 455 | a_{i,j}^E &= c_{i,j+1}^{EW}/d_{i,j} &\qquad a_{i,j}^N &= c_{i+1,j}^{NS}/d_{i,j} |
---|
| 456 | \end{align*} |
---|
| 457 | where $d_{i,j} = c_{i,j}^{NS}+ c_{i+1,j}^{NS} + c_{i,j}^{EW} + c_{i,j+1}^{EW}$ |
---|
| 458 | (i.e. the diagonal of the matrix). \eqref{Eq_solmat} can be rewritten as: |
---|
| 459 | \begin{equation} \label{Eq_solmat_p} |
---|
| 460 | \begin{split} |
---|
| 461 | a_{i,j}^{N} D_{i+1,j} +\,a_{i,j}^{E} D_{i,j+1} +\, a_{i,j}^{S} D_{i-1,j} +\,a_{i,j}^{W} D_{i,j-1} - D_{i,j} = \tilde{b}_{i,j} |
---|
| 462 | \end{split} |
---|
| 463 | \end{equation} |
---|
| 464 | with $\tilde b_{i,j} = b_{i,j}/d_{i,j}$. \eqref{Eq_solmat_p} is the equation actually solved |
---|
| 465 | with the SOR method. This method used is an iterative one. Its algorithm can be |
---|
| 466 | summarised as follows (see \citet{Haltiner1980} for a further discussion): |
---|
[707] | 467 | |
---|
[994] | 468 | initialisation (evaluate a first guess from previous time step computations) |
---|
[707] | 469 | \begin{equation} |
---|
[2282] | 470 | D_{i,j}^0 = 2 \, D_{i,j}^t - D_{i,j}^{t-1} |
---|
[707] | 471 | \end{equation} |
---|
[994] | 472 | iteration $n$, from $n=0$ until convergence, do : |
---|
[2282] | 473 | \begin{equation} \label{Eq_sor_algo} |
---|
| 474 | \begin{split} |
---|
| 475 | R_{i,j}^n = &a_{i,j}^{N} D_{i+1,j}^n +\,a_{i,j}^{E} D_{i,j+1} ^n |
---|
| 476 | +\, a_{i,j}^{S} D_{i-1,j} ^{n+1}+\,a_{i,j}^{W} D_{i,j-1} ^{n+1} |
---|
| 477 | - D_{i,j}^n - \tilde{b}_{i,j} \\ |
---|
| 478 | D_{i,j} ^{n+1} = &D_{i,j} ^{n} + \omega \;R_{i,j}^n |
---|
| 479 | \end{split} |
---|
[707] | 480 | \end{equation} |
---|
[994] | 481 | where \textit{$\omega $ }satisfies $1\leq \omega \leq 2$. An optimal value exists for |
---|
| 482 | \textit{$\omega$} which significantly accelerates the convergence, but it has to be |
---|
| 483 | adjusted empirically for each model domain (except for a uniform grid where an |
---|
| 484 | analytical expression for \textit{$\omega$} can be found \citep{Richtmyer1967}). |
---|
[2282] | 485 | The value of $\omega$ is set using \np{rn\_sor}, a \textbf{namelist} parameter. |
---|
[994] | 486 | The convergence test is of the form: |
---|
[707] | 487 | \begin{equation} |
---|
[2282] | 488 | \delta = \frac{\sum\limits_{i,j}{R_{i,j}^n}{R_{i,j}^n}} |
---|
| 489 | {\sum\limits_{i,j}{ \tilde{b}_{i,j}^n}{\tilde{b}_{i,j}^n}} \leq \epsilon |
---|
[707] | 490 | \end{equation} |
---|
[994] | 491 | where $\epsilon$ is the absolute precision that is required. It is recommended |
---|
[2282] | 492 | that a value smaller or equal to $10^{-6}$ is used for $\epsilon$ since larger |
---|
| 493 | values may lead to numerically induced basin scale barotropic oscillations. |
---|
| 494 | The precision is specified by setting \np{rn\_eps} (\textbf{namelist} parameter). |
---|
| 495 | In addition, two other tests are used to halt the iterative algorithm. They involve |
---|
| 496 | the number of iterations and the modulus of the right hand side. If the former |
---|
| 497 | exceeds a specified value, \np{nn\_max} (\textbf{namelist} parameter), |
---|
| 498 | or the latter is greater than $10^{15}$, the whole model computation is stopped |
---|
| 499 | and the last computed time step fields are saved in a abort.nc NetCDF file. |
---|
| 500 | In both cases, this usually indicates that there is something wrong in the model |
---|
| 501 | configuration (an error in the mesh, the initial state, the input forcing, |
---|
[994] | 502 | or the magnitude of the time step or of the mixing coefficients). A typical value of |
---|
[2282] | 503 | $nn\_max$ is a few hundred when $\epsilon = 10^{-6}$, increasing to a few |
---|
[994] | 504 | thousand when $\epsilon = 10^{-12}$. |
---|
| 505 | The vectorization of the SOR algorithm is not straightforward. The scheme |
---|
| 506 | contains two linear recurrences on $i$ and $j$. This inhibits the vectorisation. |
---|
[2282] | 507 | \eqref{Eq_sor_algo} can be been rewritten as: |
---|
| 508 | \begin{equation} |
---|
| 509 | \begin{split} |
---|
[707] | 510 | R_{i,j}^n |
---|
[2282] | 511 | = &a_{i,j}^{N} D_{i+1,j}^n +\,a_{i,j}^{E} D_{i,j+1} ^n |
---|
| 512 | +\,a_{i,j}^{S} D_{i-1,j} ^{n}+\,_{i,j}^{W} D_{i,j-1} ^{n} - D_{i,j}^n - \tilde{b}_{i,j} \\ |
---|
| 513 | R_{i,j}^n = &R_{i,j}^n - \omega \;a_{i,j}^{S}\; R_{i,j-1}^n \\ |
---|
| 514 | R_{i,j}^n = &R_{i,j}^n - \omega \;a_{i,j}^{W}\; R_{i-1,j}^n |
---|
| 515 | \end{split} |
---|
[707] | 516 | \end{equation} |
---|
[2282] | 517 | This technique slightly increases the number of iteration required to reach the convergence, |
---|
| 518 | but this is largely compensated by the gain obtained by the suppression of the recurrences. |
---|
[707] | 519 | |
---|
[2282] | 520 | Another technique have been chosen, the so-called red-black SOR. It consist in solving successively |
---|
| 521 | \eqref{Eq_sor_algo} for odd and even grid points. It also slightly reduced the convergence rate |
---|
| 522 | but allows the vectorisation. In addition, and this is the reason why it has been chosen, it is able to handle the north fold boundary condition used in ORCA configuration ($i.e.$ tri-polar global ocean mesh). |
---|
[707] | 523 | |
---|
[2282] | 524 | The SOR method is very flexible and can be used under a wide range of conditions, |
---|
| 525 | including irregular boundaries, interior boundary points, etc. Proofs of convergence, etc. |
---|
| 526 | may be found in the standard numerical methods texts for partial differential equations. |
---|
[707] | 527 | |
---|
| 528 | % ------------------------------------------------------------------------------------------------------------- |
---|
| 529 | % Preconditioned Conjugate Gradient |
---|
| 530 | % ------------------------------------------------------------------------------------------------------------- |
---|
[2282] | 531 | \subsection{Preconditioned Conjugate Gradient (\np{nn\_solv}=1, \mdl{solpcg}) } |
---|
[707] | 532 | \label{MISC_solpcg} |
---|
| 533 | |
---|
| 534 | \textbf{A} is a definite positive symmetric matrix, thus solving the linear |
---|
[994] | 535 | system \eqref{Eq_solmat} is equivalent to the minimisation of a quadratic |
---|
[707] | 536 | functional: |
---|
| 537 | \begin{equation*} |
---|
| 538 | \textbf{Ax} = \textbf{b} \leftrightarrow \textbf{x} =\text{inf}_{y} \,\phi (\textbf{y}) |
---|
| 539 | \quad , \qquad |
---|
| 540 | \phi (\textbf{y}) = 1/2 \langle \textbf{Ay},\textbf{y}\rangle - \langle \textbf{b},\textbf{y} \rangle |
---|
| 541 | \end{equation*} |
---|
| 542 | where $\langle , \rangle$ is the canonical dot product. The idea of the |
---|
[994] | 543 | conjugate gradient method is to search for the solution in the following |
---|
| 544 | iterative way: assuming that $\textbf{x}^n$ has been obtained, $\textbf{x}^{n+1}$ |
---|
| 545 | is found from $\textbf {x}^{n+1}={\textbf {x}}^n+\alpha^n{\textbf {d}}^n$ which satisfies: |
---|
[707] | 546 | \begin{equation*} |
---|
[994] | 547 | {\textbf{ x}}^{n+1}=\text{inf} _{{\textbf{ y}}={\textbf{ x}}^n+\alpha^n \,{\textbf{ d}}^n} \,\phi ({\textbf{ y}})\;\;\Leftrightarrow \;\;\frac{d\phi }{d\alpha}=0 |
---|
[707] | 548 | \end{equation*} |
---|
[1224] | 549 | and expressing $\phi (\textbf{y})$ as a function of \textit{$\alpha $}, we obtain the |
---|
| 550 | value that minimises the functional: |
---|
[707] | 551 | \begin{equation*} |
---|
| 552 | \alpha ^n = \langle{ \textbf{r}^n , \textbf{r}^n} \rangle / \langle {\textbf{ A d}^n, \textbf{d}^n} \rangle |
---|
| 553 | \end{equation*} |
---|
[994] | 554 | where $\textbf{r}^n = \textbf{b}-\textbf{A x}^n = \textbf{A} (\textbf{x}-\textbf{x}^n)$ |
---|
| 555 | is the error at rank $n$. The descent vector $\textbf{d}^n$ s chosen to be dependent |
---|
| 556 | on the error: $\textbf{d}^n = \textbf{r}^n + \beta^n \,\textbf{d}^{n-1}$. $\beta ^n$ |
---|
| 557 | is searched such that the descent vectors form an orthogonal basis for the dot |
---|
| 558 | product linked to \textbf{A}. Expressing the condition |
---|
[707] | 559 | $\langle \textbf{A d}^n, \textbf{d}^{n-1} \rangle = 0$ the value of $\beta ^n$ is found: |
---|
[1224] | 560 | $\beta ^n = \langle{ \textbf{r}^n , \textbf{r}^n} \rangle / \langle {\textbf{r}^{n-1}, \textbf{r}^{n-1}} \rangle$. |
---|
| 561 | As a result, the errors $ \textbf{r}^n$ form an orthogonal |
---|
[994] | 562 | base for the canonic dot product while the descent vectors $\textbf{d}^n$ form |
---|
[707] | 563 | an orthogonal base for the dot product linked to \textbf{A}. The resulting |
---|
| 564 | algorithm is thus the following one: |
---|
| 565 | |
---|
| 566 | initialisation : |
---|
| 567 | \begin{equation*} |
---|
[2282] | 568 | \begin{split} |
---|
| 569 | \textbf{x}^0 &= D_{i,j}^0 = 2 D_{i,j}^t - D_{i,j}^{t-1} \quad, \text{the initial guess } \\ |
---|
| 570 | \textbf{r}^0 &= \textbf{d}^0 = \textbf{b} - \textbf{A x}^0 \\ |
---|
| 571 | \gamma_0 &= \langle{ \textbf{r}^0 , \textbf{r}^0} \rangle |
---|
| 572 | \end{split} |
---|
[707] | 573 | \end{equation*} |
---|
| 574 | |
---|
| 575 | iteration $n,$ from $n=0$ until convergence, do : |
---|
| 576 | \begin{equation} |
---|
| 577 | \begin{split} |
---|
| 578 | \text{z}^n& = \textbf{A d}^n \\ |
---|
[2282] | 579 | \alpha_n &= \gamma_n / \langle{ \textbf{z}^n , \textbf{d}^n} \rangle \\ |
---|
[707] | 580 | \textbf{x}^{n+1} &= \textbf{x}^n + \alpha_n \,\textbf{d}^n \\ |
---|
| 581 | \textbf{r}^{n+1} &= \textbf{r}^n - \alpha_n \,\textbf{z}^n \\ |
---|
| 582 | \gamma_{n+1} &= \langle{ \textbf{r}^{n+1} , \textbf{r}^{n+1}} \rangle \\ |
---|
| 583 | \beta_{n+1} &= \gamma_{n+1}/\gamma_{n} \\ |
---|
| 584 | \textbf{d}^{n+1} &= \textbf{r}^{n+1} + \beta_{n+1}\; \textbf{d}^{n}\\ |
---|
| 585 | \end{split} |
---|
| 586 | \end{equation} |
---|
| 587 | |
---|
| 588 | |
---|
| 589 | The convergence test is: |
---|
| 590 | \begin{equation} |
---|
| 591 | \delta = \gamma_{n}\; / \langle{ \textbf{b} , \textbf{b}} \rangle \leq \epsilon |
---|
| 592 | \end{equation} |
---|
| 593 | where $\epsilon $ is the absolute precision that is required. As for the SOR algorithm, |
---|
[2282] | 594 | the whole model computation is stopped when the number of iterations, \np{nn\_max}, or |
---|
[994] | 595 | the modulus of the right hand side of the convergence equation exceeds a |
---|
| 596 | specified value (see \S\ref{MISC_solsor} for a further discussion). The required |
---|
| 597 | precision and the maximum number of iterations allowed are specified by setting |
---|
[2282] | 598 | \np{rn\_eps} and \np{nn\_max} (\textbf{namelist} parameters). |
---|
[707] | 599 | |
---|
[994] | 600 | It can be demonstrated that the above algorithm is optimal, provides the exact |
---|
[707] | 601 | solution in a number of iterations equal to the size of the matrix, and that |
---|
[994] | 602 | the convergence rate is faster as the matrix is closer to the identity matrix, |
---|
[2282] | 603 | $i.e.$ its eigenvalues are closer to 1. Therefore, it is more efficient to solve |
---|
| 604 | a better conditioned system which has the same solution. For that purpose, |
---|
| 605 | we introduce a preconditioning matrix \textbf{Q} which is an approximation |
---|
| 606 | of \textbf{A} but much easier to invert than \textbf{A}, and solve the system: |
---|
[994] | 607 | \begin{equation} \label{Eq_pmat} |
---|
[707] | 608 | \textbf{Q}^{-1} \textbf{A x} = \textbf{Q}^{-1} \textbf{b} |
---|
| 609 | \end{equation} |
---|
| 610 | |
---|
[994] | 611 | The same algorithm can be used to solve \eqref{Eq_pmat} if instead of the |
---|
| 612 | canonical dot product the following one is used: |
---|
[1224] | 613 | ${\langle{ \textbf{a} , \textbf{b}} \rangle}_Q = \langle{ \textbf{a} , \textbf{Q b}} \rangle$, and |
---|
| 614 | if $\textbf{\~{b}} = \textbf{Q}^{-1}\;\textbf{b}$ and $\textbf{\~{A}} = \textbf{Q}^{-1}\;\textbf{A}$ |
---|
[2282] | 615 | are substituted to \textbf{b} and \textbf{A} \citep{Madec_al_OM88}. |
---|
[1224] | 616 | In \NEMO, \textbf{Q} is chosen as the diagonal of \textbf{ A}, i.e. the simplest form for |
---|
| 617 | \textbf{Q} so that it can be easily inverted. In this case, the discrete formulation of |
---|
[994] | 618 | \eqref{Eq_pmat} is in fact given by \eqref{Eq_solmat_p} and thus the matrix and |
---|
| 619 | right hand side are computed independently from the solver used. |
---|
[707] | 620 | |
---|
| 621 | % ================================================================ |
---|
| 622 | |
---|
| 623 | |
---|
[2282] | 624 | |
---|
[707] | 625 | |
---|
| 626 | |
---|
[2364] | 627 | |
---|
| 628 | |
---|
| 629 | |
---|
| 630 | |
---|
| 631 | |
---|
| 632 | |
---|
| 633 | |
---|