Changeset 994 for trunk/DOC/TexFiles/Chapters/Chap_MISC.tex
- Timestamp:
- 2008-05-28T11:01:09+02:00 (16 years ago)
- Location:
- trunk/DOC/TexFiles
- Files:
-
- 1 copied
- 1 moved
Legend:
- Unmodified
- Added
- Removed
-
trunk/DOC/TexFiles/Chapters/Chap_MISC.tex
r817 r994 13 13 14 14 % In climate modeling, it is often necessary to allow water masses that are separated by land to exchange properties. This situation arises in models when the grid mesh is too coarse to resolve narrow passageways that in reality provide crucial connections between water masses. For example, coarse grid spacing typically closes off the Mediterranean from the Atlantic at the Straits of Gibraltar. In this case, it is important for climate models to include the effects of salty water entering the Atlantic from the Mediterranean. Likewise, it is important for the Mediterranean to replenish its supply of water from the Atlantic to balance the net evaporation occurring over the Mediterranean region. 15 % also same problem as soon as important straits are not properly resolve by the mesh. For example in ORCA 1/4\r{}, this occors in several straits of the Indonesian archipelagos(Ombai, Lombok...).16 % We describe here the three methods that can be used in NEMO to handle such unproperly resolved straits.17 %The communication consists of mixing tracers and mass/volume between non-adjacent water columns. Momentum is not mixed. The scheme conserves total tracer content, and total volume (the latter in $z*$- or $s*$-coordinate), and maintains compatibility between the tracer and mass/volume budgets.18 19 % Note: such changes are so specific to a given configuration that no attempt ha ve been made to set them in a generic way. Nevertheless,example of how they can be set up is given in the ORCA2 configuration (\key{ORCA_R2}).15 % Same problem occurs as soon as important straits are not properly resolved by the mesh. For example it arises in ORCA 1/4\r{} in several straits of the Indonesian archipelago (Ombai, Lombok...). 16 % We describe here the three methods that can be used in \NEMO to handle such improperly resolved straits. 17 %The problem is resolved here by allowing the mixing of tracers and mass/volume between non-adjacent water columns at nominated regions within the model. Momentum is not mixed. The scheme conserves total tracer content, and total volume (the latter in $z*$- or $s*$-coordinate), and maintains compatibility between the tracer and mass/volume budgets. 18 19 % Note: such changes are so specific to a given configuration that no attempt has been made to set them in a generic way. Nevertheless, an example of how they can be set up is given in the ORCA2 configuration (\key{ORCA_R2}). 20 20 21 21 % ------------------------------------------------------------------------------------------------------------- … … 27 27 $\bullet$ reduced scale factor, also called partially open face (Hallberg, personnal communication 2006) 28 28 %also called partially open cell or partial closed cells 29 $\bullet$ increase of the viscous boundary layer by local increase of the fmask value at the coast 29 30 $\bullet$ increase of the viscous boundary layer thickness by local increase of the fmask value at the coast 30 31 31 32 \colorbox{yellow}{Add a short description of scale factor changes staff and fmask increase} … … 36 37 \includegraphics[width=0.80\textwidth]{./Figures/Fig_Gibraltar.pdf} 37 38 \includegraphics[width=0.80\textwidth]{./Figures/Fig_Gibraltar2.pdf} 38 \caption {Example of the Gibraltar strait defined in a 1\r{} x 1\r{} mesh. \textit{Top}: using partially open cells. The meridional scale factor at $V$-point is reduced on both sides of the strait to account for the real width of the strait (about 20 km). Note that the scale factors of the strait $T$-point remains unchanged. \textit{Bottom}: using viscous boundary layers. The four fmask along the strait coastlines are set to a value larger than 4, $i.e.$ "strong" no-slip case (see Fig.\ref{Fig_LBC_shlat}) creating a large viscous boundary layer that allows a reduced transport through the strait.} 39 \caption {Example of the Gibraltar strait defined in a 1\r{} x 1\r{} mesh. 40 \textit{Top}: using partially open cells. The meridional scale factor at $v$-point 41 is reduced on both sides of the strait to account for the real width of the strait 42 (about 20 km). Note that the scale factors of the strait $T$-point remains unchanged. \textit{Bottom}: using viscous boundary layers. The four fmask parameters 43 along the strait coastlines are set to a value larger than 4, $i.e.$ "strong" no-slip 44 case (see Fig.\ref{Fig_LBC_shlat}) creating a large viscous boundary layer 45 that allows a reduced transport through the strait.} 39 46 \end{center} \end{figure} 40 47 %>>>>>>>>>>>>>>>>>>>>>>>>>>>> … … 65 72 \label{MISC_zoom} 66 73 67 The sub-domain functionality, also improperly called zoom option (improperly because it is not associated with a change in model resolution) is a quite simple function that allows to 68 perform a simulation over a sub-domain of an already defined configuration 69 (i.e. without defining a new set of mesh, initial state and forcings). This 70 option can be useful for testing the user setting of surface boundary 71 conditions, or the initial ocean state of a huge ocean model configuration 72 while having a small central memory requirement. It can also be used to 73 easily test specific physics in a sub-domain (for example, test of the 74 coupling between sea-ice and ocean model over the Arctic or Antarctic ocean 75 as they are set in the global ocean version of OPA \citep{Madec1996}. 76 In standard, this option does not include any specific treatment for ocean 77 boundaries of the sub-domain: they are considered as artificial vertical 78 walls. Nevertheless, it is quite easy to add a restoring term toward a 79 climatology in the vicinity of those boundaries (see \S\ref{TRA_dmp}). 74 The sub-domain functionality, also improperly called the zoom option 75 (improperly because it is not associated with a change in model resolution) 76 is a quite simple function that allows a simulation over a sub-domain of an 77 already defined configuration ($i.e.$ without defining a new mesh, initial 78 state and forcings). This option can be useful for testing the user settings 79 of surface boundary conditions, or the initial ocean state of a huge ocean 80 model configuration while having a small computer memory requirement. 81 It can also be used to easily test specific physics in a sub-domain (for example, 82 see \citep{Madec1996} for a test of the coupling used in the global ocean 83 version of OPA between sea-ice and ocean model over the Arctic or Antarctic 84 ocean, using a sub-domain). In the standard model, this option does not 85 include any specific treatment for the ocean boundaries of the sub-domain: 86 they are considered as artificial vertical walls. Nevertheless, it is quite easy 87 to add a restoring term toward a climatology in the vicinity of such boundaries 88 (see \S\ref{TRA_dmp}). 80 89 81 90 In order to easily define a sub-domain over which the computation can be 82 91 performed, the dimension of all input arrays (ocean mesh, bathymetry, 83 forcing, initial state, ...) are defined as \jp{jpidta}, \jp{jpjdta} and \jp{jpkdta} (\mdl{par\_oce} module), while the 84 computational domain is defined through \jp{jpiglo}, \jp{jpjglo} and \jp{jpk} (\mdl{par\_oce} module). When running the model 85 over the whole configuration, the user set \jp{jpiglo}=\jp{jpidta} \jp{jpjglo}=\jp{jpjdta }and \jp{jpk}=\jp{jpkdta}. When running the model 86 over a sub-domain, the user have to provide the size of the sub-domain, 87 (\jp{jpiglo}, \jp{jpjglo}, \jp{jpkglo}), and the indices of the south western corner as \jp{jpizoom} and \jp{jpjzoom} in the \mdl{par\_oce} module (Fig.~\ref{Fig_LBC_zoom}). 88 89 Note that a third set of dimension exist, \jp{jpi}, \jp{jpj} and \jp{jpk} which is actually used to 90 perform the computation. It is set by default to \jp{jpi}=\jp{jpjglo} and \jp{jpj}=\jp{jpjglo}, except for massively 91 parallel computing where the computational domain is laid out on local 92 processor memories following a 2D horizontal splitting (see {\S}IV.2-c) 92 forcing, initial state, ...) are defined as \jp{jpidta}, \jp{jpjdta} and \jp{jpkdta} 93 (\mdl{par\_oce} module), while the computational domain is defined through 94 \jp{jpiglo}, \jp{jpjglo} and \jp{jpk} (\mdl{par\_oce} module). When running the 95 model over the whole domain, the user sets \jp{jpiglo}=\jp{jpidta} \jp{jpjglo}=\jp{jpjdta} 96 and \jp{jpk}=\jp{jpkdta}. When running the model over a sub-domain, the user 97 has to provide the size of the sub-domain, (\jp{jpiglo}, \jp{jpjglo}, \jp{jpkglo}), 98 and the indices of the south western corner as \jp{jpizoom} and \jp{jpjzoom} in 99 the \mdl{par\_oce} module (Fig.~\ref{Fig_LBC_zoom}). 100 101 Note that a third set of dimensions exist, \jp{jpi}, \jp{jpj} and \jp{jpk} which is 102 actually used to perform the computation. It is set by default to \jp{jpi}=\jp{jpjglo} 103 and \jp{jpj}=\jp{jpjglo}, except for massively parallel computing where the 104 computational domain is laid out on local processor memories following a 2D 105 horizontal splitting. % (see {\S}IV.2-c) ref to the section to be updated 93 106 94 107 %>>>>>>>>>>>>>>>>>>>>>>>>>>>> 95 108 \begin{figure}[!ht] \label{Fig_LBC_zoom} \begin{center} 96 109 \includegraphics[width=0.90\textwidth]{./Figures/Fig_LBC_zoom.pdf} 97 \caption { position of a model domain compared to the data input domain when the zoom functionality is used.}110 \caption {Position of a model domain compared to the data input domain when the zoom functionality is used.} 98 111 \end{center} \end{figure} 99 112 %>>>>>>>>>>>>>>>>>>>>>>>>>>>> … … 106 119 \label{MISC_1D} 107 120 108 The 1D model is a stand alone water column based on the 3D NEMO system. It can be applied to the ocean alone or to the ocean-ice system and can include passive tracers or a biogeochemical model. It is set up by defining the \key{cfg\_1d} CPP key. This 1D model is a very useful tool \textit{(a)} to learn about the physics and numerical treatment of vertical mixing processes ; \textit{(b)} to investigate suitable parameterizations of unresolved turbulence (wind steering, langmuir circulation, skin layers) ; \textit{(c)} to compare the behaviour of different mixing vertical scheme ; \textit{(d)} to perform sensitivity study to the vertical diffusion on a particular point of the ocean global domain ; \textit{(d)} to access to specific diagnostics, aside from the standard model variables, because of having small in core memory requirement. 109 110 The methodology is based on the use of the zoom functionality (see \S\ref{MISC_zoom}), and the addition of some specific routines. There is no need of defining a new set of mesh, bathymetry, initial state and forcing, as the 1D model will use those of the configuration it is a zoom of. 111 112 121 The 1D model option simulates a stand alone water column within the 3D \NEMO 122 system. It can be applied to the ocean alone or to the ocean-ice system and can 123 include passive tracers or a biogeochemical model. It is set up by defining the 124 \key{cfg\_1d} CPP key. This 1D model is a very useful tool \textit{(a)} to learn 125 about the physics and numerical treatment of vertical mixing processes ; \textit{(b)} 126 to investigate suitable parameterizations of unresolved turbulence (wind steering, 127 langmuir circulation, skin layers) ; \textit{(c)} to compare the behaviour of different 128 vertical mixing schemes ; \textit{(d)} to perform sensitivity studies on the vertical 129 diffusion at a particular point of the ocean global domain ; \textit{(d)} to produce 130 extra diagnostics, without the large memory requirement of the full 3D model. 131 132 The methodology is based on the use of the zoom functionality (see 133 \S\ref{MISC_zoom}), with some extra routines. There is no need to define a new 134 mesh, bathymetry, initial state or forcing, since the 1D model will use those of the 135 configuration it is a zoom of. 113 136 114 137 % ================================================================ … … 121 144 %-------------------------------------------------------------------------------------------------------------- 122 145 146 % Steven update just bellow... not sure it is what I vant to say 147 %Searching an equilibrium state with an ocean model requires repeated very long time 148 %integrations (each of a few thousand years for a global model). Due to the size of 123 149 Searching an equilibrium state with an ocean model requires very long time 124 150 integration (a few thousand years for a global model). Due to the size of 125 the time step required for numerical stability consideration(less than a126 few hours), this is usually requires a large elapse time. In orderovercome127 this problem, \citet{Bryan1984} introduces a technique that allowsto accelerate128 the spin up to the equilibrium. It consists in usinga larger time step in151 the time step required for numerical stability (less than a 152 few hours), this usually requires a large elapsed time. In order to overcome 153 this problem, \citet{Bryan1984} introduces a technique that is intended to accelerate 154 the spin up to equilibrium. It uses a larger time step in 129 155 the thermodynamic evolution equations than in the dynamic evolution 130 156 equations. It does not affect the equilibrium solution but modifies the 131 157 trajectory to reach it. 132 158 133 The acceleration of convergence is used when \np{nn\_acc}=1. In that case, $\Delta t=rdt$ is 134 the time step of dynamics while $\widetilde{\Delta t}=rdttra$ is the tracer time-step. Both are settled from \np{rdt} and \np{rdttra} namelist parameters. The set of prognostic equations to solve becomes: 159 The acceleration of convergence option is used when \np{nn\_acc}=1. In that case, 160 $\Delta t=rdt$ is the time step of dynamics while $\widetilde{\Delta t}=rdttra$ is the 161 tracer time-step. Their values are set from the \np{rdt} and \np{rdttra} namelist 162 parameters. The set of prognostic equations to solve becomes: 135 163 \begin{equation} \label{Eq_acc} 136 164 \begin{split} … … 142 170 \end{equation} 143 171 144 \citet{Bryan1984} has analysed the consequences of this distorted physics. Free145 waves have a slower phase speed, their meridional structure is slightly172 \citet{Bryan1984} has examined the consequences of this distorted physics. 173 Free waves have a slower phase speed, their meridional structure is slightly 146 174 modified, and the growth rate of baroclinically unstable waves is reduced 147 but there isa wider range of instability. This technique is efficient for148 searching an equilibrium state in coarse resolution models. However its175 but with a wider range of instability. This technique is efficient for 176 searching for an equilibrium state in coarse resolution models. However its 149 177 application is not suitable for many oceanic problems: it cannot be used for 150 178 transient or time evolving problems (in particular, it is very questionable 151 to keep this technique when usinga seasonal cycle in the forcing fields),179 to use this technique when there is a seasonal cycle in the forcing fields), 152 180 and it cannot be used in high-resolution models where baroclinically 153 181 unstable processes are important. Moreover, the vertical variation of 154 $\ Delta \tilde {t}$ implies that the heat and salt contents are no more182 $\widetilde{ \Delta t}$ implies that the heat and salt contents are no longer 155 183 conserved due to the vertical coupling of the ocean level through both 156 184 advection and diffusion. 185 % first mention of depth varying tilda-delta-t! and namelist parameter associated to that are to be described 157 186 158 187 … … 160 189 % Model optimisation, Control Print and Benchmark 161 190 % ================================================================ 162 \section{Model optimisation, Control Print and Benchmark}191 \section{Model Optimisation, Control Print and Benchmark} 163 192 \label{MISC_opt} 164 193 %--------------------------------------------namdom------------------------------------------------------- … … 166 195 %-------------------------------------------------------------------------------------------------------------- 167 196 168 Three points to be described here: 197 % \gmcomment{why not make these bullets into subsections?} 198 199 Three issues to be described here: 169 200 170 201 $\bullet$ Vector and memory optimisation: 171 202 172 \key{vectopt\_loop} enable the internal loop collapse, a very efficient way to increase the length of vector and thus speed up the model on vector computers. 203 \key{vectopt\_loop} enables internal loop collapse, a very efficient way to increase 204 the length of vector calculations and thus speed up the model on vector computers. 173 205 174 Add here also one word on NPROMA technique that has been found useless, since compiler have made significant progress during the last decade.206 % Add here also one word on NPROMA technique that has been found useless, since compiler have made significant progress during the last decade. 175 207 176 Add also one word on NEC specific optimisation (Novercheck option for example) 177 178 \key{vectopt\_memory} has been introduced in order to reduce the memory requirement of the model. This is obviously done by incresing the CPU time requirement, as it suppress intermediate computation saved in in-core memory. This possibility have not been intensively used. In fact up to now, it only concern the TKE physics, in which, when \key{vectopt\_memory} is defined, the coefficient used for horizontal smoothing of $A_v^T$ and $A_v^m$ are no more computed once for all. This reduces the memory requirement by three 2D arrays. 208 % Add also one word on NEC specific optimisation (Novercheck option for example) 209 210 \key{vectopt\_memory} has been introduced in order to reduce the memory 211 requirement of the model. This is obviously done at the cost of increasing the 212 CPU time requirement, since it suppress intermediate computations which would 213 have been saved in in-core memory. This feature has not been intensively used. 214 In fact, currently it is only implemented for the TKE physics, in which, when 215 \key{vectopt\_memory} is defined, the coefficients used for horizontal smoothing 216 of $A_v^T$ and $A_v^m$ are no longer computed once and for all. This reduces 217 the memory requirement by three 2D arrays. 218 179 219 180 $\bullet$ Control print: describe here 4 things: 181 182 1- \np{ln\_ctl} : compute and print the inner domain averaged trends in all TRA, DYN LDF and ZDF modules. Very useful to detect the origin of an undesired change in model results. 183 184 2- also \np{ln\_ctl} but using the nictl. njctl. namelist parameters to check the origin of differences between mono and multi processor 185 186 3- \key{esopa} (to be rename key\_nemo) : also a option of model management. When defined, this key force the activation of all options and CPP keys. For example, all the tracer and momentum advection scheme are called ! There is therefore no physical meaning associated with model results. In fact, this option allows both the compilator and the model run to go through all the Fortran lines of the model. This allows to check is there is obvious compilation or running bugs for CPP options, and running bugs for namelist options. 187 188 5- last digit comparison (\np{nbit\_cmp}). In MPP simulation, the computation of sum of the whole domain is performed as the sum over all processors of the sum of the inner domain of each processor. This double sum nevr give exactly the same result as a single sum over the whole domain, due to truncature differences. The "bit comparison" option has been introduced in order to be able to check that mono-processor and multi-processor give exactly the same results. 189 190 191 $\bullet$ Benchmark (\np{nbench}). This option, defines a benchmark run based on GYRE configuration in which the resolution remains the same whatever the domain size is. This allows to run very large model domain by just changing the domain size (\jp{jpiglo}, \jp{jpjglo}) without adjusting neither the time-step nor the physical parametrisations. 220 $\bullet$ Control print %: describe here 4 things: 221 222 1- \np{ln\_ctl} : compute and print the trends averaged over the interior domain 223 in all TRA, DYN, LDF and ZDF modules. This option is very helpful when 224 diagnosing the origin of an undesired change in model results. 225 226 2- also \np{ln\_ctl} but using the nictl and njctl namelist parameters to check 227 the source of differences between mono and multi processor runs. 228 229 3- \key{esopa} (to be rename key\_nemo) : which is another option for model 230 management. When defined, this key forces the activation of all options and 231 CPP keys. For example, all tracer and momentum advection schemes are called! 232 There is therefore no physical meaning associated with the model results. 233 However, this option forces both the compiler and the model to run through 234 all the \textsc{Fortran} lines of the model. This allows the user to check for obvious 235 compilation or execution errors with all CPP options, and errors in namelist options. 236 237 3- \key{esopa} (to be rename key\_nemo) : which is another option for model 238 management. When defined, this key forces the activation of all options and CPP keys. 239 For example, all tracer and momentum advection schemes are called! There is 240 therefore no physical meaning associated with the model results. However, this option 241 forces both the compiler and the model to run through all the Fortran lines of the model. 242 This allows the user to check for obvious compilation or execution errors with all CPP 243 options, and errors in namelist options. 244 245 4- last digit comparison (\np{nbit\_cmp}). In an MPP simulation, the computation of 246 a sum over the whole domain is performed as the summation over all processors of 247 each of their sums over their interior domains. This double sum never gives exactly 248 the same result as a single sum over the whole domain, due to truncation differences. 249 The "bit comparison" option has been introduced in order to be able to check that 250 mono-processor and multi-processor runs give exactly the same results. 251 252 $\bullet$ Benchmark (\np{nbench}). This option defines a benchmark run based on 253 a GYRE configuration in which the resolution remains the same whatever the domain 254 size. This allows a very large model domain to be used, just by changing the domain 255 size (\jp{jpiglo}, \jp{jpjglo}) and without adjusting either the time-step or the physical 256 parameterizations. 192 257 193 258 … … 202 267 203 268 The computation of the surface pressure gradient with a rigid lid assumption 204 requires to compute $\partial_t \psi$, the time evolution of the 205 barotropic streamfunction. $\partial_t \psi$ is solution of an elliptic 206 equation (I.2.4) for which two solvers are available, a 207 Successive-Over-Relaxation (SOR) or a preconditioned conjugate gradient 208 (PCG) \citep{Madec1988, Madec1990}. The PCG is a very efficient 209 method for solving elliptic equations on vector computers. It is a fast and 210 rather easy to use method, which is an attractive feature for a large number 211 of ocean situations (variable bottom topography, complex coastal geometry, 212 variable grid spacing, islands, open or cyclic boundaries, etc ...). It does 213 not require the search of an optimal parameter as in the SOR method. 214 Nevertheless, the SOR has been kept because it is a linear solver, a very 215 useful property when using the adjoint model of OPA. 216 217 The surface pressure gradient is computed in \mdl{dynspg}. The 218 default option is the use of PCG or SOR 219 depending on \np{nsolv} (namelist parameter). 269 requires the computation of $\partial_t \psi$, the time evolution of the 270 barotropic streamfunction. $\partial_t \psi$ is the solution of an elliptic 271 equation \eqref{Eq_PE_psi} for which three solvers are available: a 272 Successive-Over-Relaxation scheme (SOR), a preconditioned conjugate gradient 273 scheme(PCG) \citep{Madec1988, Madec1990} and a Finite Elements Tearing and 274 Interconnecting scheme (FETI) \citep{Guyon_al_EP99, Guyon_al_CalPar99}. 275 The PCG is a very efficient method for solving elliptic equations on vector computers. 276 It is a fast and rather easy method to use; which are attractive features for a large 277 number of ocean situations (variable bottom topography, complex coastal geometry, 278 variable grid spacing, islands, open or cyclic boundaries, etc ...). It does not require 279 a search for an optimal parameter as in the SOR method. However, the SOR has 280 been retained because it is a linear solver, which is a very useful property when 281 using the adjoint model of \NEMO. The FETI solver is a very efficient method on 282 massively parallel computers. However, it has never been used since OPA 8.0. 283 The current version in \NEMO should not even successfully go through the 284 compilation phase. 285 286 The surface pressure gradient is computed in \mdl{dynspg}. The solver used 287 (PCG or SOR) depending on the value of \np{nsolv} (namelist parameter). 220 288 At each time step the time derivative of the barotropic streamfunction is 221 the solution of (II.2.3). Introducing the following coefficients:289 the solution of \eqref{Eq_psi_total}. Introducing the following coefficients: 222 290 223 291 \begin{equation} … … 229 297 \end{equation} 230 298 231 the five-point finite difference equ ivalent equation (II.2.3)can be rewritten as:299 the five-point finite difference equation \eqref{Eq_psi_total} can be rewritten as: 232 300 233 301 \begin{multline} \label{Eq_solmat} … … 257 325 \textbf{A}). (VII.5.1) can be rewritten as: 258 326 259 \begin{multline} 327 \begin{multline} \label{Eq_solmat_p} 260 328 A_{i,j}^{N} { \left( \frac{\partial \psi}{\partial t } \right) }_{i+1,j } 261 329 +\,A_{i,j}^{E} { \left( \frac{\partial \psi}{\partial t } \right) }_{i ,j+1} \\ … … 266 334 267 335 The SOR method used is an iterative method. Its algorithm can be summarised 268 as follows [see \citet{Haltiner1980} for further discussion]:269 270 initialisation (evaluate a first guess from formertime step computations)336 as follows [see \citet{Haltiner1980} for a further discussion]: 337 338 initialisation (evaluate a first guess from previous time step computations) 271 339 \begin{equation} 272 340 \left( \frac{\partial \psi}{\partial t } \right)_{i,j}^0 = 2{\left( \frac{\partial \psi}{\partial t } \right)_{i,j}^t} - {\left( \frac{\partial \psi}{\partial t } \right)_{i,j}^{t-1}} 273 341 \end{equation} 274 iteration $n ,$ from $n=0 $until convergence, do :342 iteration $n$, from $n=0$ until convergence, do : 275 343 \begin{multline} 276 344 R_{i,j}^n … … 286 354 + \omega \;R_{i,j}^n 287 355 \end{equation} 288 where \textit{$\omega $ }satisfies $1\leq \omega \leq 2$. An optimal value exists for \textit{$\omega $ }which 289 accelerates significantly the convergence, but it has to be adjusted 290 empirically for each model domain, except for an uniform grid where an 291 analytical expression for \textit{$\omega $ }can be found \citep{Richtmyer1967}. This 292 parameter is defined as \textit{sor}, a \textbf{namelist} parameter. The convergence test is of the form: 356 where \textit{$\omega $ }satisfies $1\leq \omega \leq 2$. An optimal value exists for 357 \textit{$\omega$} which significantly accelerates the convergence, but it has to be 358 adjusted empirically for each model domain (except for a uniform grid where an 359 analytical expression for \textit{$\omega$} can be found \citep{Richtmyer1967}). 360 The value of $\omega$ is set using \textit{sor}, a \textbf{namelist} parameter. 361 The convergence test is of the form: 293 362 \begin{equation} 294 363 \delta = \frac{\sum\limits_{i,j}{R_{i,j}^n}{R_{i,j}^n}}{\sum\limits_{i,j}{ \tilde B_{i,j}^n}{\tilde B_{i,j}^n}} \leq \epsilon 295 364 \end{equation} 296 where $\epsilon $ is the absolute precision that is required. It is highly recommended 297 to use a $\epsilon $ smaller or equal to $10^{-2}$ as larger values may leads 298 to numerically induced basin scale barotropic oscillations, and to use a two 299 or three order of magnitude smaller value in eddy resolving configuration. 300 The precision of the solver is not only a numerical parameter, but 301 influences the physics. Indeed, the zero change of kinetic energy due to the 302 work of surface pressure forces in rigid-lid approximation is only achieved 303 at the precision demanded on the solver ({\S}~C.1-e). The precision is 304 specified by setting \textit{eps} (\textbf{namelist parameter}). In addition, two other tests are used to stop the iterative 305 algorithm. They concern the number of iterations and the module of the right 306 hand side. If the former exceeds a specified value, \textit{nmax} (\textbf{namelist parameter}), or the latter is greater than 307 $10^{15}$, the whole model computation is stopped while the last 308 computed time step fields are saved in the standard output file. In both 309 cases, this usually indicates that there is something wrong in the model 310 configuration (error in the mesh, the initial state, the input forcing, or 311 the magnitude of the time step or of the mixing coefficients). A typical 312 value of $nmax$ is a few hundred for $\epsilon = 10^{-2}$, increasing to a few 313 thousand for $\epsilon = 10^{-12}$. 314 315 The vectorization of the SOR algorithm is not straightforward. (VII.5.4) 316 contains two linear recurrences on $i$ and $j$. This inhibits the vectorisation 317 ({\S}~IV.2-a). Therefore (VII.5.4a) has been rewritten as: 365 where $\epsilon$ is the absolute precision that is required. It is recommended 366 that a value smaller or equal to $10^{-3}$ is used for $\epsilon$ since larger 367 values may lead to numerically induced basin scale barotropic oscillations. In fact, 368 for an eddy resolving configuration or in a filtered free surface case, a value three 369 orders of magnitude smaller than this should be used. The precision is specified by 370 setting \textit{eps} (\textbf{namelist parameter}). In addition, two other tests are 371 used to halt the iterative algorithm. They involve the number of iterations and the 372 modulus of the right hand side. If the former exceeds a specified value, \textit{nmax} 373 (\textbf{namelist parameter}), or the latter is greater than $10^{15}$, the whole 374 model computation is stopped and the last computed time step fields are saved 375 in the standard output file. In both cases, this usually indicates that there is something 376 wrong in the model configuration (an error in the mesh, the initial state, the input forcing, 377 or the magnitude of the time step or of the mixing coefficients). A typical value of 378 $nmax$ is a few hundred when $\epsilon = 10^{-6}$, increasing to a few 379 thousand when $\epsilon = 10^{-12}$. 380 The vectorization of the SOR algorithm is not straightforward. The scheme 381 contains two linear recurrences on $i$ and $j$. This inhibits the vectorisation. 382 Therefore it has been rewritten as: 318 383 319 384 \begin{multline} … … 334 399 \end{equation} 335 400 336 If the three equations in (VII.5.6) are solved inside the same do-loop, 337 (VII.5.4a) and (VII.5.6) are strictly equivalent. In the model they are 338 solved successively over the whole domain. The convergence is slower but 339 (VII.5.6a) and (VII.5.6b) are vector loops on $i$-index (the inner loop) and 340 (VII.5.6c) is adapted to Cray vector computers by using a routine from the 341 Cray library (namely the FOLR routine) to solve the first-order linear 342 recurrence. The SOR method is very flexible and can be used under a wide 401 The SOR method is very flexible and can be used under a wide 343 402 range of conditions, including irregular boundaries, interior boundary 344 403 points, etc. Proofs of convergence, etc. may be found in the standard 345 numerical methods texts for partial equations.404 numerical methods texts for partial differential equations. 346 405 347 406 % ------------------------------------------------------------------------------------------------------------- … … 356 415 357 416 \textbf{A} is a definite positive symmetric matrix, thus solving the linear 358 system (VII.5.1)is equivalent to the minimisation of a quadratic417 system \eqref{Eq_solmat} is equivalent to the minimisation of a quadratic 359 418 functional: 360 419 \begin{equation*} … … 364 423 \end{equation*} 365 424 where $\langle , \rangle$ is the canonical dot product. The idea of the 366 conjugate gradient method is to search the solution in the following 367 iterative way: assuming that $\textbf{x}^n$ is obtained, $\textbf{x}^{n+1}$ is searched under the form $\textbf {x}^{n+1}={\textbf {x}}^n+\alpha^n{\textbf {d}}^n$ which satisfies: 425 conjugate gradient method is to search for the solution in the following 426 iterative way: assuming that $\textbf{x}^n$ has been obtained, $\textbf{x}^{n+1}$ 427 is found from $\textbf {x}^{n+1}={\textbf {x}}^n+\alpha^n{\textbf {d}}^n$ which satisfies: 368 428 \begin{equation*} 369 {\textbf{ x}}^{n+1}=\text{inf} _{{\textbf{ y}}={\textbf{ x}}^n+\alpha \,{\textbf{ d}}^n} \,\phi ({\textbf{ y}})\;\;\Leftrightarrow \;\;\frac{d\phi }{d\alpha}=0429 {\textbf{ x}}^{n+1}=\text{inf} _{{\textbf{ y}}={\textbf{ x}}^n+\alpha^n \,{\textbf{ d}}^n} \,\phi ({\textbf{ y}})\;\;\Leftrightarrow \;\;\frac{d\phi }{d\alpha}=0 370 430 \end{equation*} 371 431 and expressing $\phi (\textbf{y})$ as a function of \textit{$\alpha $}, we obtain the value that minimises the functional: … … 373 433 \alpha ^n = \langle{ \textbf{r}^n , \textbf{r}^n} \rangle / \langle {\textbf{ A d}^n, \textbf{d}^n} \rangle 374 434 \end{equation*} 375 where $\textbf{r}^n = \textbf{b}-\textbf{A x}^n = \textbf{A} (\textbf{x}-\textbf{x}^n)$ is the error at rank $n$. The choice of the descent vector $\textbf{d}^n$ depends on the error: $\textbf{d}^n = \textbf{r}^n + \beta^n \,\textbf{d}^{n-1}$. $\beta ^n$ is searched such that the descent vectors form an orthogonal base for the dot product linked to \textbf{A}. Expressing the condition 435 where $\textbf{r}^n = \textbf{b}-\textbf{A x}^n = \textbf{A} (\textbf{x}-\textbf{x}^n)$ 436 is the error at rank $n$. The descent vector $\textbf{d}^n$ s chosen to be dependent 437 on the error: $\textbf{d}^n = \textbf{r}^n + \beta^n \,\textbf{d}^{n-1}$. $\beta ^n$ 438 is searched such that the descent vectors form an orthogonal basis for the dot 439 product linked to \textbf{A}. Expressing the condition 376 440 $\langle \textbf{A d}^n, \textbf{d}^{n-1} \rangle = 0$ the value of $\beta ^n$ is found: 377 $\beta ^n = \langle{ \textbf{r}^n , \textbf{r}^n} \rangle / \langle {\textbf{r}^{n-1}, \textbf{r}^{n-1}} \rangle$. As a result, the errors $ \textbf{r}^n$ form an orthogonal base378 for the canonic dot product while the descent vectors $\textbf{d}^n$ form441 $\beta ^n = \langle{ \textbf{r}^n , \textbf{r}^n} \rangle / \langle {\textbf{r}^{n-1}, \textbf{r}^{n-1}} \rangle$. As a result, the errors $ \textbf{r}^n$ form an orthogonal 442 base for the canonic dot product while the descent vectors $\textbf{d}^n$ form 379 443 an orthogonal base for the dot product linked to \textbf{A}. The resulting 380 444 algorithm is thus the following one: 381 445 382 446 initialisation : 383 384 447 385 448 \begin{equation*} … … 415 478 \end{equation} 416 479 where $\epsilon $ is the absolute precision that is required. As for the SOR algorithm, 417 both the PCG algorithm and the whole model computation are stopped when the 418 number of iteration, $nmax$, or the module of the right hand side exceeds a 419 specified value (see {\S}~VII.5-a for further discussion). The precision and 420 the maximum number of iteration are specified by setting $eps$ and $nmax$ (\textbf{namelist parameters}). 421 422 It can be demonstrated that the algorithm is optimal, provides the exact 480 the whole model computation is stopped when the number of iterations, $nmax$, or 481 the modulus of the right hand side of the convergence equation exceeds a 482 specified value (see \S\ref{MISC_solsor} for a further discussion). The required 483 precision and the maximum number of iterations allowed are specified by setting 484 $eps$ and $nmax$ (\textbf{namelist parameters}). 485 486 It can be demonstrated that the above algorithm is optimal, provides the exact 423 487 solution in a number of iterations equal to the size of the matrix, and that 424 the convergence rate is all the more fast as the matrix is closed to 425 identity (i.e. the eigen values are closed to 1). Therefore, it is more 426 efficient to solve a better conditioned system which has the same solution. 427 For that purpose, we introduce a preconditioning matrix \textbf{Q 428 }which is an approximation of \textbf{A} but much easier to invert 429 than \textbf{A} and solve the system: 430 \begin{equation} 488 the convergence rate is faster as the matrix is closer to the identity matrix, 489 $i.e.$ its eigenvalues are closer to 1. Therefore, it is more efficient to solve a 490 better conditioned system which has the same solution. For that purpose, we 491 introduce a preconditioning matrix \textbf{Q} which is an approximation of 492 \textbf{A} but much easier to invert than \textbf{A}, and solve the system: 493 \begin{equation} \label{Eq_pmat} 431 494 \textbf{Q}^{-1} \textbf{A x} = \textbf{Q}^{-1} \textbf{b} 432 495 \end{equation} 433 496 434 The same algorithm can be used to solve (VII.5.7) if instead of the 435 canonical dot product the following one is used: ${\langle{ \textbf{a} , \textbf{b}} \rangle}_Q = \langle{ \textbf{a} , \textbf{Q b}} \rangle$, and if $\textbf{\~{b}} = \textbf{Q}^{-1}\;\textbf{b}$ and $\textbf{\~{A}} = \textbf{Q}^{-1}\;\textbf{A}$ are substituted to \textbf{b} and \textbf{A} \citep{Madec1988}. In OPA, \textbf{Q} is chosen as the diagonal 436 of\textbf{ A}, i.e. the simplest form for \textbf{Q} so that it can be 437 easily inverted. In this case, the discrete formulation of (VII.5.8) is in 438 fact given by (VII.5.2) and thus the matrix and right hand side are computed 439 independently from the solver used. 497 The same algorithm can be used to solve \eqref{Eq_pmat} if instead of the 498 canonical dot product the following one is used: 499 ${\langle{ \textbf{a} , \textbf{b}} \rangle}_Q = \langle{ \textbf{a} , \textbf{Q b}} \rangle$, 500 and if $\textbf{\~{b}} = \textbf{Q}^{-1}\;\textbf{b}$ and $\textbf{\~{A}} = \textbf{Q}^{-1}\;\textbf{A}$ are substituted to \textbf{b} and \textbf{A} \citep{Madec1988}. 501 In \NEMO, \textbf{Q} is chosen as the diagonal of \textbf{ A}, i.e. the simplest form for \textbf{Q} so that it can be easily inverted. In this case, the discrete formulation of 502 \eqref{Eq_pmat} is in fact given by \eqref{Eq_solmat_p} and thus the matrix and 503 right hand side are computed independently from the solver used. 440 504 441 505 % ------------------------------------------------------------------------------------------------------------- … … 445 509 \label{MISC_solfeti} 446 510 447 FETI is a powerfull solver that was developed by Marc Guyon (ref !!!). It as been just converted from Fortan 77 to 90, but never successfully tested after that. Since then, nobody has found enough time to further investigate the implmenntationof FETI and debug it. 448 449 Its main advantaged is to save a lot of CPU time compared to SOR or PCG algorithm. Nevertheless, its main drawbacks is that the solution is depends on the domain decomposition chosen. Using a different number of processors, the solution is the same at the precision required, but not the same at the computer precision. This make it hard to debug. 511 FETI is a powerful solver that was developed by Marc Guyon \citep{Guyon_al_EP99, 512 Guyon_al_CalPar99}. It has been converted from Fortan 77 to 90, but never 513 successfully tested after that. 514 515 Its main advantage is to save a lot of CPU time when compared to the SOR or PCG 516 algorithms. However, its main drawback is that the solution is dependent on the 517 domain decomposition chosen. Using a different number of processors, the solution 518 is the same at the precision required, but not the same at the computer precision. 519 This make it hard to debug. 450 520 451 521 % ------------------------------------------------------------------------------------------------------------- … … 455 525 \label{MISC_solisl} 456 526 457 The boundary condition used for both solvers is that the time derivative of458 the barotropic streamfunction is zero along all the coastlines. When islands459 are present in the model domain, additional computations must be done to460 determine the barotropic streamfunction with the correct boundary527 The boundary condition used for both recommended solvers is that the time 528 derivative of the barotropic streamfunction is zero along all the coastlines. 529 When islands are present in the model domain, additional computations must 530 be performed to determine the barotropic streamfunction with the correct boundary 461 531 conditions. This is detailed below. 462 532 463 The model does not recognise itself the islands which must be defined using 464 bathymetry information, i.e. $mbathy$ array, equals $-1$ over the first island, $-2$ over 465 the second, ... , $-N$ over the $N^{th}$ island (see {\S}~VII.2-b). The model 466 determines the position of island grid-points and defines a closed contour 467 around each island which will be used to compute the circulation around each 468 island. The closed contour is formed of the ocean grid-points which are the 469 closest to the island. 470 471 First, the island barotropic streamfunctions $\psi_n$ are computed 472 using the PCG (they are solutions of (VII.5.1) with the right-hand side 473 equals to zero and with $\psi_n = 1$ along the island $n$ and $\psi_n = 0$ along the other coastlines) (Note that specifying $1$ as boundary 474 condition on an island for $\psi$ is equivalent to define a 475 specific right hand side for (VII.5.1)~). The precision of this computation 476 can be very high since it is done once. The absolute precision, $epsisl$, and the 477 maximum number of iteration, $nmisl$, are the \textbf{namelist parameters} used for that 478 computation. Their typical values are $epsisl = 10^{-10}$ and $nmisl = 4000$. Then the island matrix A is computed from (I.2.8) and reversed. At 479 each time step, $\psi_0$, the solution of (I.2.4) with $\psi_0 = 0$ along all coastlines, is computed using either SOR or PCG. It has to 480 be noted that the first guess of this computation is defined as in (VII.5.3) 481 except that $\partial_t \psi_0$ is used, not $\partial_t \psi$. Indeed we are 482 computing $\partial_t \psi_0$ which is usually far different from $\partial_t \psi$. Then, it is easy to find the time evolution of the barotropic 483 streamfunction on each island and to deduce $\partial_t \psi$ using (I.2.9) 484 in order to compute the surface pressure gradient. Note that the value of 485 the barotropic streamfunction itself is also computed as the time 486 integration of $\partial_t \psi$ for further diagnostics. 533 The model does not have specialised code for islands. They must instead be 534 identified to the solvers by the user via bathymetry information, i.e. the $mbathy$ 535 array should equal $-1$ over the first island, $-2$ over the second, ... , $-N$ over 536 the $N^{th}$ island. The model determines the position of island grid-points and 537 defines a closed contour around each island which is used to compute the circulation 538 around each one. The closed contour is formed from the ocean grid-points which 539 are the closest to the island. 540 541 First, the island barotropic streamfunctions $\psi_n$ are computed using the SOR 542 or PCG scheme (they are solutions of \eqref{Eq_solmat} with the right-hand side 543 equal to zero and with $\psi_n = 1$ along the island $n$ and $\psi_n = 0$ along 544 the other coastlines) (Note that specifying $1$ as boundary condition on an island 545 for $\psi$ is equivalent to defining a specific right hand side for \eqref{Eq_solmat}). 546 The requested precision of this computation can be very high since it is only 547 performed once. The absolute precision, $epsisl$, and the maximum number of 548 iterations allowed, $nmisl$, are the \textbf{namelist parameters} used for this 549 computation. Their typical values are $epsisl = 10^{-10}$ and $nmisl = 4000$. 550 Then the island matrix A is computed from \eqref{Eq_PE_psi_isl_matrix} and 551 inverted. At each time step, $\psi_0$, the solution of \eqref{Eq_solmat} with $\psi = 0$ 552 along all coastlines, is computed using either SOR or PCG. (It should 553 be noted that the first guess of this computation remains as usual except that 554 $\partial_t \psi_0$ is used, instead of $\partial_t \psi$. Indeed we are 555 computing $\partial_t \psi_0$ which is usually very different from $\partial_t \psi$.) 556 Then, it is easy to find the time evolution of the barotropic streamfunction on each 557 island and to deduce $\partial_t \psi$, and to use it to compute the surface pressure 558 gradient. Note that the value of the barotropic streamfunction itself is also computed 559 as the time integration of $\partial_t \psi$ for further diagnostics. 487 560 488 561 % ================================================================ … … 503 576 and the output file(s). The restart file is used internally by the code when 504 577 the user wants to start the model with initial conditions defined by a 505 previous simulation. It contains all the information that is necessary not506 to introduce changes in the model results (even at the computer precision)507 between a run performed with several restarts and the same run performed in508 one time. It has to be noticed that this requires that the restart file509 contain stwo consecutive time steps for all the prognostic variables, and510 that it is save in the same binary format as the one used by the computer to511 calculate (in particular, 32 bits binary IEEE format for this file must not512 be used). The output listing and file(s) aredefined but should be checked578 previous simulation. It contains all the information that is necessary in 579 order for there to be no changes in the model results (even at the computer 580 precision) between a run performed with several restarts and the same run 581 performed in one step. It should be noted that this requires that the restart file 582 contain two consecutive time steps for all the prognostic variables, and 583 that it is saved in the same binary format as the one used by the computer 584 that is to read it (in particular, 32 bits binary IEEE format must not be used for 585 this file). The output listing and file(s) are predefined but should be checked 513 586 and eventually adapted to the user's needs. The output listing is stored in 514 the $ocean.output$ file. The information are printed all the way through the code on the 515 logical unit $numout$. To locate these prints, use the UNIX command "$grep -i numout^*$" in the source code directory. 587 the $ocean.output$ file. The information is printed from within the code on the 588 logical unit $numout$. To locate these prints, use the UNIX command 589 "$grep -i numout^*$" in the source code directory. 516 590 517 591 In the standard configuration, the user will find the model results in two 518 output files for every time-step where output is demanded: a \textbf{VO} 519 file containing all the three dimensional fields in logical unit $numwvo$, and a 520 \textbf{SO} file containing all the two dimensional (horizontal) fields in 521 logical unit $numwso$. These outputs are defined in the $diawri.F$ routine. The standard and available on-line diagnostics are described in {\S}III-12-c. 522 523 The default output is 32 bits binary IEEE format, compatible with the 524 Vairmer software (see the Climate Modelling and global Change team WEB 525 server at CERFACS: http://www.cerfacs.fr). The model's reference directory 526 also contains a visualisation tool based on \textbf{NCAR Graphics 527 }(http://ngwww.ucar.edu). If a user has access to the NCAR software, he or 528 she can copy the \textbf{LODMODEL/UTILS/OPADRA} directory from the reference 529 and, following the \textbf{README}, create the graphic outputs from the 530 model's results. 592 output files containing values for every time-step where output is demanded: 593 a \textbf{VO} file containing all the three dimensional fields in logical unit 594 $numwvo$, and a \textbf{SO} file containing all the two dimensional (horizontal) 595 fields in logical unit $numwso$. These outputs are defined in the $diawri.F$ 596 routine. The default and user-selectable diagnostics are described in {\S}III-12-c. 597 598 The default output (for all output files apart from the restart file) is 32 bits binary 599 IEEE format, compatible with the Vairmer software (see the Climate Modelling 600 and global Change team WEB server at CERFACS: http://www.cerfacs.fr). 601 The model's reference directory also contains a visualisation tool based on 602 \textbf{NCAR Graphics} (http://ngwww.ucar.edu). If a user has access to the 603 NCAR software, he or she can copy the \textbf{LODMODEL/UTILS/OPADRA} 604 directory from the reference and, following the \textbf{README}, create 605 graphical outputs from the model's results. 531 606 } 532 607 … … 534 609 % Tracer/Dynamics Trends 535 610 % ------------------------------------------------------------------------------------------------------------- 536 \subsection{Tracer/Dynamics Trends (\key{trdlmd}, \key{diatrdtra} )}611 \subsection{Tracer/Dynamics Trends (\key{trdlmd}, \key{diatrdtra}, \key{diatrddyn})} 537 612 \label{MISC_tratrd} 538 613 539 614 %to be udated this corresponds to OPA8 540 When \key{diatrddyn} and/or \key{diatrddyn} cpp 541 variables are defined, each trends of the dynamics and/or temperature and 542 salinity time evolution equations are stored in three-dimensional arrays 543 just after their computation (i.e. at the end of each $dyn\cdots .F$ 544 and/or $tra\cdots .F$ routines). These trends are then used in diagnostic 545 routines $diadyn.F$ and $diatra.F$ respectively. In standard, these routines check the basin averaged properties of the momentum and tracer equations every \textit{ntrd } time-steps (\textbf{namelist parameter}). These routines 546 are given as an example; they must be adapted by the user to his/her 547 desiderata. 548 549 These two options imply the definition of several arrays in the in core 550 memory, increasing quite sensitively the code memory requirements (search 551 for \key{diatrddyn} or 552 \key{diatrdtra}in \textit{common.h} file) 615 When \key{diatrddyn} and/or \key{diatrddyn} cpp variables are defined, each 616 trend of the dynamics and/or temperature and salinity time evolution equations 617 is stored in three-dimensional arrays just after their computation ($i.e.$ at the end 618 of each $dyn\cdots .F90$ and/or $tra\cdots .F90$ routine). These trends are then 619 used in diagnostic routines $diadyn.F90$ and $diatra.F90$ respectively. 620 In the standard model, these routines check the basin averaged properties of 621 the momentum and tracer equations every \textit{ntrd } time-steps (\textbf{namelist 622 parameter}). These routines are supplied as an example; they must be adapted by 623 the user to his/her requirements. 624 625 These two options imply the creation of several extra arrays in the in-core 626 memory, increasing quite seriously the code memory requirements. 553 627 554 628 % ------------------------------------------------------------------------------------------------------------- … … 561 635 %-------------------------------------------------------------------------------------------------------------- 562 636 563 \colorbox{yellow}{a description is to be added here} 637 The on-line computation of floats adevected either by the three dimensional velocity 638 field or constraint to remain at a given depth ($w = 0$ in the computation) have been introduced in the system during the CLIPPER project. The algorithm used is based on 639 the work of \cite{Blanke_Raynaud_JPO97}. (see also the web site describing the off-line 640 use of this marvellous diagnostic tool (http://stockage.univ-brest.fr/~grima/Ariane/). 564 641 565 642 % ------------------------------------------------------------------------------------------------------------- … … 569 646 \label{MISC_diag_others} 570 647 571 %To be updated this corresponds to OPA 8 572 573 Aside from the standard model variables, some other diagnostics are computed 574 on-line or can be added in the model. The available ready-to-add diagnostics 575 can be found in DIA. Among the available diagnostics one can 576 quote: 648 %To be updated this mainly corresponds to OPA 8 649 650 Aside from the standard model variables, other diagnostics can be computed 651 on-line or can be added to the model. The available ready-to-add diagnostics 652 routines can be found in directory DIA. Among the available diagnostics are: 577 653 578 654 - the mixed layer depth (based on a density criterion) (\mdl{diamxl})
Note: See TracChangeset
for help on using the changeset viewer.