ticket/0679_mpp_sca (diff) – NEMO

Changes between Initial Version and Version 1 of ticket/0679_mpp_sca

Timestamp:: 2010-06-10T12:45:14+02:00 (14 years ago)
Author:: acc
Comment:: --

+[[PageOutline]]
+Last edited [[Timestamp]]
+[[BR]]
+'''Author''' : acc
+'''ticket''' : #679
+'''Branch''' : [https://forge.ipsl.jussieu.fr/nemo/browser/branches/DEV_1879_mpp_sca DEV_1879_mpp_sca]
+----
+=== Description ===
+This branch introduces code to minimise the use of the mpi_allgather operation during the north-fold exchanges. PRACE investigators found significant
+performance gains with similar changes when using large numbers of processors. [[BR]]
+'''Method'''[[BR]]
+A new routine, opa_northcomms, is introduced into opa.F90. It uses the existing method to work out which
+other processors are directly involved in the north-fold exchanges. It does this for T, U, V and F points and uses the
+masks so that a neighbour is not included if the boundary is wholly land.
+Once those lists have been established, the mpp_lbc_north routines (in lib_mpp.F90) use them to exchange only with
+"active" neighbours. These exchanges populate the same ztab array that the mpi_allgather method uses, and the routines then call the
+lbc_nfd routine to carry out the fold operation. The difference is that instead of filling the whole ztab array (which requires
+every northern-row processor to communicate with every other northern-row processor), only those grid cells that will be folded onto an individual
+processor's domain are exchanged. The reduction in communication should lead to performance gains when using large numbers of
+processors.
+The current implementation has been successfully tested in standard ORCA2 and ORCA1 configurations. Test results are identical with and without the modifications.
+For these configurations, there is no degradation in performance.
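The neighbour-selection idea described above can be illustrated with a small, self-contained sketch. It is not the opa_northcomms code: it assumes a one-dimensional northern row of processors, a simplified fold in which global column i pairs with column n_cols - 1 - i (the real fold indices depend on the grid-point type and pivot), and one reading of the mask rule, namely that a neighbour is dropped when every column it would supply is land. The function names and arguments are hypothetical.

```python
def column_owner(starts, n_cols):
    """Map each global column to the index of the processor that owns it.

    starts[p] is the first global column (0-based) owned by processor p.
    """
    owner = []
    for p, start in enumerate(starts):
        end = starts[p + 1] if p + 1 < len(starts) else n_cols
        owner.extend([p] * (end - start))
    return owner


def active_neighbours(starts, n_cols, is_ocean):
    """List, for each northern-row processor, the processors it must receive from.

    Assumption: column i folds onto column n_cols - 1 - i. A source processor
    is listed only if at least one of the columns it would supply is ocean.
    """
    owner = column_owner(starts, n_cols)
    neighbours = [set() for _ in starts]
    for i in range(n_cols):
        partner = n_cols - 1 - i
        if not is_ocean[partner]:
            continue  # land column: nothing useful to exchange
        neighbours[owner[i]].add(owner[partner])
    return [sorted(s) for s in neighbours]


def message_counts(neighbour_lists):
    """Return (targeted, all_to_all) counts of point-to-point transfers."""
    n = len(neighbour_lists)
    targeted = sum(len(lst) for lst in neighbour_lists)
    return targeted, n * (n - 1)


# Four processors, each owning four of sixteen columns, no land.
lists = active_neighbours([0, 4, 8, 12], 16, [True] * 16)
print(lists, message_counts(lists))
```

In this toy layout each processor receives from a single partner, so four transfers replace the twelve that an all-to-all gather among four processors would imply. The gap widens as the number of northern-row processors grows, which is the regime in which the ticket expects a benefit.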
+Still to do:
+ * Work out how to deal with 'I' points
+ * Check that the method works successfully when land-only regions have been discarded (i.e. jpnij /= jpni*jpnj)
+ * Demonstrate and quantify the benefit with ORCA025 and ORCA12.
+----
+=== Testing ===
+Testing could consider (where appropriate) other configurations in addition to NVTK.
+||NVTK Tested||!'''NO!'''||
+||Other model configurations||YES||
+||Processor configurations tested||ORCA2: 2x2 and 8x4; ORCA1: 8x4||
+||If adding new functionality please confirm that the [[BR]]New code doesn't change results when it is switched off [[BR]]and !''works!'' when switched on||YES||
+=== Bit Comparability ===
+||Does this change preserve answers in your tested standard configurations (to the last bit) ?||!'''YES/NO !'''||
+||Does this change bit compare across various processor configurations. (1xM, Nx1 and MxN are recommended)||!'''YES/NO!'''||
+||Is this change expected to preserve answers in all possible model configurations?||!'''YES/NO!'''||
+||Is this change expected to preserve all diagnostics? [[BR]]!,,!''Preserving answers in model runs does not necessarily imply preserved diagnostics. !''||!'''YES/NO!'''||
+If you answered !'''NO!''' to any of the above, please provide further details:
+ * Which routine(s) are causing the difference?
+ * Why the changes are not protected by a logical switch or new section-version
+ * What is needed to achieve regression with the previous model release (e.g. a regression branch, hand-edits etc). If this is not possible, explain why not.
+ * What do you expect to see occur in the test harness jobs?
+ * Which diagnostics have you altered and why have they changed? Please add details here........
+----
+=== System Changes ===
+||Does your change alter namelists?||NO||
+||Does your change require a change in compiler options?||NO||
+----
+=== Resources ===
+!''Please !''summarize!'' any changes in runtime or memory use caused by this change......!''
+----
+=== IPR issues ===
+||Has the code been wholly (100%) produced by NEMO developers staff working exclusively on NEMO?||YES||