Opened 2 years ago

Closed 2 weeks ago

#2011 closed Task (fixed)

HPC-04(2018WP)_Mocavero_mpi3

Reported by: mocavero
Owned by: mocavero
Priority: high
Milestone: 2019 WP
Component: OCE
Version: trunk
Severity: minor
Keywords:
Cc:
Review: failed
MP ready?: no
Progress: The neighbourhood collective communications have been integrated in the NEMO code (a dedicated branch has been developed). The use of new collective communications has been tested on a representative kernel implementing the FCT advection scheme. The action will be continued in 2020 to extend new collective communications to 9-point stencil routines, land domain exclusion and north fold exchanges.

Description (last modified by epico)

Context

MPI-3 provides new neighbourhood collective operations (i.e. MPI_Neighbor_allgather and MPI_Neighbor_alltoall) that allow a halo exchange to be performed with a single MPI communication call when a 5-point stencil is used.
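To illustrate the idea, the following is a minimal pure-Python simulation of the exchange pattern of MPI_Neighbor_alltoall on a non-periodic 2D Cartesian process grid. It is not MPI code and not the NEMO implementation: the helper names (cart_neighbours, neighbor_alltoall) and the 2 x 3 grid are assumptions made for this sketch. It relies only on two properties of the MPI standard: Cartesian ranks are numbered in row-major order, and the neighbours of a process are ordered dimension by dimension, negative direction first.

```python
# Pure-Python simulation of a 5-point halo exchange done with one
# neighbourhood all-to-all call. Hypothetical helper names; no MPI needed.

def cart_neighbours(rank, dims):
    """Neighbours of `rank` in MPI Cartesian order: for each dimension,
    the negative-direction neighbour first, then the positive one.
    None plays the role of MPI_PROC_NULL on non-periodic boundaries."""
    ci, cj = divmod(rank, dims[1])  # row-major rank -> coordinates
    out = []
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = ci + di, cj + dj
        inside = 0 <= ni < dims[0] and 0 <= nj < dims[1]
        out.append(ni * dims[1] + nj if inside else None)
    return out

def neighbor_alltoall(send, dims):
    """send[r][k] is the block rank r sends to its k-th neighbour.
    Returns recv, where recv[r][k] is the block received from the
    k-th neighbour of r (None where there is no neighbour)."""
    recv = []
    for rank in range(dims[0] * dims[1]):
        blocks = []
        for k, nb in enumerate(cart_neighbours(rank, dims)):
            # The k-th neighbour sees `rank` in the opposite direction of
            # the same dimension, i.e. at index k ^ 1 of its own list.
            blocks.append(None if nb is None else send[nb][k ^ 1])
        recv.append(blocks)
    return recv

dims = (2, 3)  # assumed 2 x 3 process grid
send = [[f"{r}->{nb}" if nb is not None else None
         for nb in cart_neighbours(r, dims)]
        for r in range(dims[0] * dims[1])]
recv = neighbor_alltoall(send, dims)
print(recv[4])
```

In this sketch every process fills one send block per neighbour and obtains all four halo blocks from a single collective step, where a point-to-point exchange would need a separate send and receive per direction.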

Collective communications will be tested in the NEMO code in order to compare their performance with that of the traditional point-to-point halo exchange currently implemented in NEMO.

The replacement of point-to-point communications with the new collective ones will be designed and implemented taking care to preserve the accuracy of the results.

Implementation plan

The work is organised in the following steps:

Step 1: extraction of a mini-app to be used as a test case. The advection kernel has been chosen as the test case and a mini-app has been implemented. The parallel application performs the MUSCL advection scheme; the size of the subdomain and the number of parallel processes can be set by the user.

Step 2: integration of the new MPI-3 neighbourhood collective communications in the mini-app and performance comparison with the standard MPI-2 point-to-point communications. The proof of concept has been evaluated by varying the subdomain size. The performance analysis has been carried out on a system available at CMCC.

Step 3: the neighbourhood collective communications have been integrated in the NEMO code. The first version of the implementation uses a cartesian topology, so it supports neither the 9-point stencil nor land domain exclusion, and the north fold is handled as usual. The use of the new collective communications has been tested on a representative kernel implementing the FCT advection scheme.

Modified files are:

  • OCE/LBC/lib_mpp.F90 where the new communicator is created, taking into account the different order of MPI processes between the cartesian communicator and NEMO
  • OCE/LBC/mppini.F90 where the ranks of the processes are reordered and the call to the communicator creation routine has been added
  • OCE/LBC/lbclnk.F90 where the routines introducing the MPI3 neighbourhood collectives are created from the generic .h90 files
  • OCE/TRA/traadv_fct.F90 as an example of a routine where MPI3 neighbourhood collectives can be used
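The rank reordering mentioned for lib_mpp.F90 and mppini.F90 can be pictured with a small sketch. The MPI standard numbers the processes of a Cartesian communicator in row-major order (the last dimension varies fastest). The sketch assumes, without confirmation from this ticket, that NEMO numbers its subdomains with the i index varying fastest and that the Cartesian grid is created with dimensions (jpni, jpnj); the function names are hypothetical and the convention used in the branch may differ.

```python
# Sketch of the rank mapping between two process orderings.
# ASSUMPTIONS (not taken from the ticket): NEMO rank = ij * jpni + ii
# (i fastest), Cartesian rank = ii * jpnj + ij (row-major over (ii, ij)).

def nemo_to_cart(nemo_rank, jpni, jpnj):
    """Map a 0-based NEMO-style rank to the row-major Cartesian rank."""
    ij, ii = divmod(nemo_rank, jpni)  # i varies fastest in the assumed NEMO order
    return ii * jpnj + ij

def cart_to_nemo(cart_rank, jpni, jpnj):
    """Inverse mapping: row-major Cartesian rank to NEMO-style rank."""
    ii, ij = divmod(cart_rank, jpnj)  # last dimension varies fastest
    return ij * jpni + ii

jpni, jpnj = 3, 2  # assumed 3 x 2 decomposition
mapping = [nemo_to_cart(r, jpni, jpnj) for r in range(jpni * jpnj)]
print(mapping)
```

Under these assumptions the two numberings agree only for the first and last process, which is why a permutation of this kind has to be applied before the neighbours reported by the Cartesian communicator can be matched to NEMO subdomains.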

Two files have been added:

  • OCE/LBC/lbc_lnk_nc_generic.h90 to handle multi-field exchange in the MPI3 case
  • OCE/LBC/mpp_nc_generic.h90 where the halo exchange is implemented.

The branch is ready to be merged during the 2019 Merge Party. The proposed changes do not affect NEMO usability. The reference manual will not be changed, since the code modifications are transparent to users.

Step 4 (action will be continued in 2020): integration of a graph topology to support the routines that use a 9-point stencil, land domain exclusion and the north fold exchanges through MPI3 neighbourhood collective communications.
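As a rough illustration of what a graph topology adds, the sketch below builds, for each ocean subdomain, the list of its up to eight neighbours in a 9-point stencil while skipping land-only subdomains. Lists of this kind are what a call such as MPI_Dist_graph_create_adjacent takes as its source and destination arrays. The mask, the function name graph_neighbours and the rank numbering are assumptions made for the example, and the north fold is deliberately left out.

```python
# Sketch: neighbour lists for a 9-point stencil with land subdomains
# excluded. Hypothetical names and data; the north fold is not handled.

def graph_neighbours(is_ocean):
    """is_ocean[j][i] is True when subdomain (i, j) holds ocean points.
    Ranks are given to ocean subdomains only, in row-by-row order.
    Returns {rank: [neighbour ranks]} including diagonal neighbours."""
    nj, ni = len(is_ocean), len(is_ocean[0])
    rank_of = {}
    for j in range(nj):
        for i in range(ni):
            if is_ocean[j][i]:
                rank_of[(i, j)] = len(rank_of)
    # Eight offsets around the centre: sides plus corners.
    offsets = [(di, dj) for dj in (-1, 0, 1) for di in (-1, 0, 1)
               if (di, dj) != (0, 0)]
    neighbours = {}
    for (i, j), rank in rank_of.items():
        neighbours[rank] = [rank_of[(i + di, j + dj)]
                            for di, dj in offsets
                            if (i + di, j + dj) in rank_of]
    return neighbours

# Assumed 3 x 3 decomposition whose central subdomain is land only.
mask = [[True, True, True],
        [True, False, True],
        [True, True, True]]
nb = graph_neighbours(mask)
print(nb[0], nb[3])
```

Because the relation built here is symmetric, the same list could serve as both the sources and the destinations of each process; a Cartesian topology cannot express the diagonal links or the missing land subdomain.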

Commit History (3)

Changeset | Author | Time | ChangeLog
11955 | mocavero | 2019-11-22T18:44:17+01:00 | Bug fix for MPI3 neighbourhood collectives halo exchange. See ticket #2011
11940 | mocavero | 2019-11-20T22:48:28+01:00 | Add MPI3 neighbourhood collectives halo exchange in LBC and call it in tracer advection FCT scheme #2011
11496 | mocavero | 2019-09-04T10:36:21+02:00 | Create HPC-12 branch - ticket #2011

Change History (12)

comment:1 Changed 2 years ago by mocavero

  • Owner set to mocavero
  • Status changed from new to assigned

comment:2 Changed 16 months ago by francesca

  • Description modified (diff)
  • Milestone changed from 2018 WP to 2019 WP
  • Owner changed from mocavero to francesca
  • Progress modified (diff)

comment:3 Changed 15 months ago by nicolasmartin

  • Summary changed from HPC-04_Mocavero_mpi3 to HPC-04(2018WP)_Mocavero_mpi3

comment:4 Changed 12 months ago by nemo

  • Priority changed from low to high

comment:5 Changed 8 months ago by francesca

  • Description modified (diff)

comment:6 Changed 8 months ago by francesca

  • Owner changed from francesca to mocavero

comment:7 Changed 5 months ago by mocavero

In 11496:

Create HPC-12 branch - ticket #2011

comment:8 Changed 5 months ago by francesca

  • Progress modified (diff)

comment:9 Changed 2 months ago by mocavero

In 11940:

Add MPI3 neighbourhood collectives halo exchange in LBC and call it in tracer advection FCT scheme #2011

comment:10 Changed 2 months ago by mocavero

In 11955:

Bug fix for MPI3 neighbourhood collectives halo exchange. See ticket #2011

comment:11 Changed 8 weeks ago by epico

  • Description modified (diff)

comment:12 Changed 2 weeks ago by francesca

  • Progress modified (diff)
  • Resolution set to fixed
  • Status changed from assigned to closed