Opened 10 years ago

Closed 9 years ago

Last modified 3 years ago

#679 closed Task (fixed)

mpp scalability

Reported by: acc
Owned by: acc
Priority: low
Milestone:
Component: OCE
Version: release-3.3
Severity:
Keywords: MPI MPP scalability
Cc:
Review:
MP ready?:
Progress:

Description

Create branch DEV_1879_mpp_sca to introduce mpp scalability improvements. These changes introduce code to minimise the use of the mpi_allgather operation during the north-fold exchanges. PRACE investigators found significant performance gains with similar changes when using large numbers of processors.

Commit History (4)

Changeset  Author  Time  ChangeLog
2899  acc  2011-10-07T18:26:05+02:00

Branch 2011/dev_r2855_NOCS_mppsca. Applied full coding conventions and added manual entry (Chap_MISC.tex). See #679

2882  acc  2011-09-30T17:57:57+02:00

Branch 2011/dev_r2855_NOCS_mppsca. Code to avoid the use of MPI_ALLGATHER at the north fold. Prace investigations suggest this can improve scalability for large domain decompositions. This is a completion and replacement of work started on branch DEV_1879_mpp_sca. See #679

2881  acc  2011-09-30T17:32:27+02:00

Create new branch for the 2011, NOCS.9: MPP scalability development. This is an update of work initially undertaken by PRACE. See #679 and subsequent updates

1925  acc  2010-06-10T11:37:21+02:00

Create branch DEV_1879_mpp_sca, see ticket #679

Change History (9)

comment:1 Changed 10 years ago by acc

See also wiki:ticket/679

comment:2 Changed 9 years ago by acc

  • Milestone changed from 2010 Stream 2: Developer Interfaces to 2011 Stream 3: New features
  • Version changed from nemo_v3_2 to nemo_v3_3_1

The algorithm has now been updated and implemented in v3.3.1. A replacement development branch has been created:

2011/dev_r2855_NOCS_mppsca

Description

This branch introduces code to minimise the use of the mpi_allgather operation during the north-fold exchanges. PRACE investigators found significant performance gains with similar changes when using large numbers of processors.

Method

A new routine, nemo_northcomms, is introduced into nemogcm.F90. It uses the existing method to work out which other processors are directly involved in the north-fold exchanges, and does so for T, U, V, F and I points. For some choices of ice model, the I-point exchanges involve some averaging, so the I-points require two exchanges to ensure the complete stencil is covered.

Once the lists of neighbours have been established, the mpp_lbc_north routines (in lib_mpp.F90) use them to exchange only with "active" neighbours. These exchanges populate the same ztab array that the mpi_allgather method uses, and the routines then call lbc_nfd to carry out the fold operation. The difference is that instead of filling the whole ztab array (which requires every northern-row processor to communicate with every other northern-row processor), only those grid cells that will be folded onto an individual processor's domain are exchanged. The reduction in communication should lead to performance gains when using large numbers of processors.
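To illustrate why restricting exchanges to "active" neighbours reduces communication, here is a minimal Python sketch. It is not the NEMO implementation: it assumes a simplified fold in which global column i (0-based) is mirrored onto column jpiglo-1-i, ignores halos and the pivot-dependent offsets of the real T, U, V, F and I point folds, and assumes the northern row is split into contiguous blocks. The function names (column_ranges, owner, active_neighbours) are hypothetical.

```python
def column_ranges(jpiglo, jpni):
    """Split jpiglo global columns into jpni contiguous blocks.

    Returns a list of (start, end) pairs, end exclusive, one per
    northern-row processor. Assumed decomposition, for illustration only.
    """
    base, extra = divmod(jpiglo, jpni)
    ranges = []
    start = 0
    for p in range(jpni):
        width = base + (1 if p < extra else 0)
        ranges.append((start, start + width))
        start += width
    return ranges


def owner(col, ranges):
    """Return the index of the processor whose block contains column col."""
    for p, (lo, hi) in enumerate(ranges):
        if lo <= col < hi:
            return p
    raise ValueError(f"column {col} is outside the global domain")


def active_neighbours(jpiglo, jpni):
    """For each northern-row processor, list the other processors that own
    the columns folded onto it (simplified mirror i -> jpiglo - 1 - i)."""
    ranges = column_ranges(jpiglo, jpni)
    neighbours = []
    for p, (lo, hi) in enumerate(ranges):
        # Columns lo..hi-1 mirror onto columns jpiglo-hi..jpiglo-lo-1.
        mirrored = range(jpiglo - hi, jpiglo - lo)
        partners = {owner(c, ranges) for c in mirrored}
        partners.discard(p)  # no message needed for locally held data
        neighbours.append(sorted(partners))
    return neighbours
```

Under these assumptions, active_neighbours(16, 4) returns [[3], [2], [1], [0]]: each processor needs data from one partner, whereas an all-gather involves all three other northern-row processors. The gap widens as jpni grows, which is consistent with the expectation that the benefit appears for large decompositions.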

The current implementation has been successfully tested in standard ORCA2 configurations. Test results are identical with and without the modifications. For these configurations, there is no degradation in performance.

A new namelist logical "ln_nnogather" has been introduced (in nammpp). Setting this to .false. (the default) results in no change in behaviour: north-fold exchanges continue to use the established mpi_allgather method. Setting ln_nnogather to .true. activates the new option.

Users should not see any change in results between these two options but should expect performance improvements for domain decompositions with large jpni values.

Still to do:

Demonstrate and quantify the benefit with ORCA025 and ORCA12.

comment:3 Changed 9 years ago by acc

  • Resolution set to fixed
  • Status changed from new to closed

comment:4 Changed 4 years ago by nicolasmartin

  • Keywords MPI added; mpi removed

comment:5 Changed 4 years ago by nicolasmartin

  • Keywords nemo_v3_3* added

comment:6 Changed 4 years ago by nicolasmartin

  • Milestone 2011 Stream 3: New features deleted

comment:7 Changed 4 years ago by nicolasmartin

  • Keywords MPP added; mpp removed

comment:8 Changed 3 years ago by nemo

  • Type changed from Development to Task

Remove 'Development' type

comment:9 Changed 3 years ago by nemo

  • Keywords nemo_v3_3* removed