Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#1821 closed Task (fixed)

Develop optimisations for the GO6 and NOC MEDUSA branches

Reported by: frrh
Owned by: timgraham
Priority: low
Milestone: Unscheduled
Component: OCE
Version:
Severity:
Keywords: MEDUSA
Cc:
Review:
MP ready?:
Progress:

Description

Context

Develop optimisations for the NOC MEDUSA branch



Optimisations for the NOC MEDUSA branch need to be developed and placed under version control.

In terms of best practice, these are probably best managed in a branch of a branch,
which is what we aim to do here.

As a minimum we need to get changes under version control so that they can be merged into the
master branch after suitable testing and review.

This ticket is created initially to record work done at the Met Office by
Maff Glover with additional input from Marc Stringer and Richard Hill.

Implementation


TBA

Commit History (8)

Changeset  Author  Time  ChangeLog
7771  frrh  2017-03-09T11:38:31+01:00

Apply optimisations to various areas of code replacing the use of allocated pointers with straightforward direct ALLOCATE and DEALLOCATE operations.

These optimisations largely have an impact in models featuring MEDUSA, i.e. those with significant numbers of tracers, although they are expected to have a small impact in all configurations.

Code developed and tested in NEMO branch branches/UKMO/dev_r5518_optim_GO6_alloc. Tested in stand-alone GO6-GSI8, GO6-GSI8-MEDUSA and UKESM coupled models. NEMO ticket #1821 documents this change further.

7601  frrh  2017-01-24T13:39:27+01:00

#1821 more changes to replace pointers with ALLOCATE in traldf_iso

7581  frrh  2017-01-19T13:20:22+01:00

#1821. Commit optimisations to replace pointers with allocatable work arrays. This is based on MG's initial work, but I've added traadv_tvd.F90 since this will be applicable more generally to NEMO whilst traadv_muscl.F90 only applies if MEDUSA is active. I've not bothered with fldread.F90 since this is only called a handful of times in any given run so I don't want to increase the potential risk of code clashes with other branches for no measurable gain.

7574  frrh  2017-01-18T16:35:06+01:00

#1821 Branch of branch to see if we can get generically useful optimisations into GO6 based on Maff Glover's work in MEDUSA configurations to replace pointers and wrk_alloc/wrk_dealloc with direct ALLOCATE and DEALLOCATE statements.

7568  frrh  2017-01-17T10:17:34+01:00

#1821. Alternative branch for MEDUSA optimisations since previous attempt failed due to huge and insurmountable clashes with GO6 "package" branch.

7543  frrh  2017-01-10T17:18:18+01:00

#1821. Reverse previous operation (erroneous copy of branch INTO this branch!)

7542  frrh  2017-01-10T17:10:26+01:00

#1821. Optimisations for MEDUSA developed at the Met Office.

7541  frrh  2017-01-10T17:01:03+01:00

#1821 Branch of the main NOC NERC MEDUSA_Stable branch for committal of Maff Glover's optimisations and possible development of further optimisations by Marc Stringer and Richard Hill.

Change History (14)

comment:1 Changed 3 years ago by frrh

  • Owner changed from nemo to frrh

comment:2 Changed 3 years ago by frrh

Branch branches/UKMO/MEDUSA_optim_MG_MS_RH created by
making a branch of a branch from the NOC master MEDUSA branch branches/NERC/dev_r5518_NOC_MEDUSA_Stable
thus:

 svn copy svn+ssh://forge.ipsl.jussieu.fr/ipsl/forge/projets/nemo/svn/branches/NERC/dev_r5518_NOC_MEDUSA_Stable \
          svn+ssh://forge.ipsl.jussieu.fr/ipsl/forge/projets/nemo/svn/branches/UKMO/MEDUSA_optim_MG_MS_RH

comment:3 Changed 3 years ago by frrh

That didn't really work because the suite we're testing in uses a very recent version of the Met Office GO6 package branch which clashes with around half the changes we're applying to the MEDUSA branch. So the code does not even extract unless we exclude half the changes we want to include.
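A rough way to anticipate clashes of this kind before attempting an extract is to intersect the lists of files changed on each branch. The sketch below is illustrative only and is not the procedure used in this ticket. It assumes two plain-text lists of changed paths (one path per line) already exist, for example derived from `svn diff --summarize` output. The function name `common_paths` and the list file names are hypothetical.

```shell
# common_paths LIST_A LIST_B
# Print the paths that appear in both files (one path per line in each).
# A path in the output is only a candidate for a clash: changes to the same
# file can still merge cleanly if they touch different lines.
common_paths() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    sort -u "$1" > "$tmp_a"      # comm needs sorted input
    sort -u "$2" > "$tmp_b"
    comm -12 "$tmp_a" "$tmp_b"   # -12 keeps only lines common to both
    rm -f "$tmp_a" "$tmp_b"
}

# Hypothetical usage; both file names are illustrative:
# common_paths optim_files.txt package_files.txt
```

Any file reported here would then need checking by hand, since overlap at file level does not by itself mean the merge will conflict.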

So I've created a different version of the MEDUSA opt branch: /UKMO/dev_r5518_MEDUSA_optim_MG_MS_RH (note the leading dev_r5518 cf my original branch).

This is based on the NOC_MEDUSA_Stable branch at branches/NERC/dev_r5518_NOC_MEDUSA_Stable@7498. The reason for trying this is that this particular branch/revision is in use in a coupled UKESM set up which uses an older version of the GO6 Package branch: branches/UKMO/dev_r5518_GO6_package@7206.

It turns out that we are able to include all Maff's changes and run OK, with the exception of traadv.F90 and traldf_iso.F90, which still clash with the GO6 branch.

In fact most of the optimisations are not MEDUSA-specific at all and if anything they should go in the GO6 configuration, which indeed is the only way we can possibly include them anyway!

So we have:

OPA_SRC/SBC/fldread.F90 (2 diffs)
OPA_SRC/SOL/solpcg.F90 (1 diff)
OPA_SRC/TRA/traadv.F90 (1 diff)       - Excluded from test due to GO6 clash
OPA_SRC/TRA/traadv_muscl.F90 (2 diffs)
OPA_SRC/TRA/trabbl.F90 (1 diff)
OPA_SRC/TRA/traldf_iso.F90 (2 diffs)  - Excluded from test due to GO6 clash
TOP_SRC/MEDUSA/trcbio_medusa.F90 (3 diffs)
TOP_SRC/TRP/trcbbl.F90 (1 diff)
TOP_SRC/TRP/trcldf.F90 (1 diff) 

Only the bottom three of these should be applied in the MEDUSA branch (and in fact there's a case to be made that only changes to TOP_SRC/MEDUSA/trcbio_medusa.F90 should be applied in the MEDUSA branch).

Getting these into the GO6 package branch is not necessarily straightforward.
1) We need a stand-alone branch with the changes in (even if we know they'll clash with the package branch). We definitely do not want to apply things directly to the package branch because then it's not a package branch. We could even do this by creating a branch of a branch…. maybe… not sure about that.
2) We then merge the separate branch into the package branch, applying any manual conflict resolution as necessary.
3) We then test EVERYTHING - GO6 stand-alone and GC3 and UKESM configurations to check for performance and bit comp.

comment:4 Changed 3 years ago by frrh

Running with these changes in a working copy of u-ai927 (u-ai927MEDopt) for 3x10-day cycles, we see no evidence of any improvement in speed. However, the chances are we wouldn't see one anyway, because:
a) we don't know how well load balanced the initial job is or, more particularly, whether the slack time is in the atmos or ocean component.
b) Cray XC40 timings are so variable anyway that any improvement may be swamped by noise.

To see any effect we would need to ensure the ocean is the slowest component.

So we need better tests.
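One possible way to judge whether a timing difference stands out from the run-to-run noise described above is to repeat each configuration several times and compare the mean run times against their spread. The following is only a minimal sketch of that idea, not a test actually used in this ticket. The function name `mean_sd`, the file names and the numbers in the usage comment are all made up for illustration.

```shell
# mean_sd: read one wall-clock time per line on stdin and print
# "mean standard-deviation" (sample standard deviation, n-1 denominator).
mean_sd() {
    awk '{ n++; s += $1; ss += $1 * $1 }
         END {
             if (n < 2) exit 1                  # spread is undefined for < 2 runs
             mean = s / n
             var = (ss - n * mean * mean) / (n - 1)
             if (var < 0) var = 0               # guard against rounding error
             printf "%.2f %.2f\n", mean, sqrt(var)
         }'
}

# Illustrative usage with made-up numbers (not measurements):
# printf '10\n12\n14\n' | mean_sd     # prints "12.00 2.00"
#
# With real data, each file would hold the timings of repeated runs:
# mean_sd < control_times.txt
# mean_sd < optim_times.txt
```

As a rough rule of thumb, a difference between the two means that is smaller than either standard deviation is unlikely to be distinguishable from noise. This is a coarse screen rather than a formal statistical test.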

comment:5 Changed 3 years ago by frrh

Discussing with Tim, we agree that the best way to proceed is to create a branch from the GO6 package branch, develop changes and test them in that and then merge that back to the GO6 package branch when it's been tested and reviewed properly.

Ideally one would create the branch from r5518 and simply merge into the package branch but some of the things we want to optimise have been changed on the package branch so we can't do that without clashing.
So treating the package branch as a trunk is the next best thing.
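The approach agreed here (branch from the package branch, develop and test, then merge back) might look roughly like the sketch below. It is an illustration under stated assumptions, not a record of the commands actually run. The branch name is the one quoted in comment 6. The working copy names (`optim_wc`, `package_wc`), the log messages and the `run` and `workflow` helpers are hypothetical. The plain `svn merge` form assumes Subversion 1.8 or later, where merging a branch back no longer needs `--reintegrate`. Commands are only printed unless DRY_RUN=0 is set.

```shell
ROOT=svn+ssh://forge.ipsl.jussieu.fr/ipsl/forge/projets/nemo/svn
PACKAGE=$ROOT/branches/UKMO/dev_r5518_GO6_package
OPTIM=$ROOT/branches/UKMO/dev_r5518_optim_GO6_alloc
DRY_RUN=${DRY_RUN:-1}   # default: print commands instead of running them

# run: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

workflow() {
    # 1. Treat the package branch as a trunk and branch from it.
    run svn copy "$PACKAGE" "$OPTIM" -m "#1821 Branch of GO6 package branch for optimisations"
    # 2. Develop and test the changes in a working copy of the new branch.
    run svn checkout "$OPTIM" optim_wc
    # 3. After testing and review, merge into a working copy of the package
    #    branch, resolve any conflicts by hand, then commit.
    run svn checkout "$PACKAGE" package_wc
    run svn merge "$OPTIM" package_wc
    run svn commit package_wc -m "#1821 Merge optimisations into GO6 package branch"
}

workflow
```

Keeping the default as a dry run means the plan can be reviewed before anything touches the repository.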

wiki:ticket/1821/BranchTesting

Last edited 3 years ago by frrh

comment:6 Changed 3 years ago by frrh

I've completed about as much testing as is possible at the moment, using GO6 standard suites and UKESM-based ocean-cice-MEDUSA suites (no atmos coupling).

Branch branches/UKMO/dev_r5518_optim_GO6_alloc@7602 refers.

Passing to Tim for review and consideration for inclusion in GO6 package branch.

comment:7 Changed 3 years ago by frrh

  • Owner changed from frrh to timgraham
  • Summary changed from Develop optimisations for the NOC MEDUSA branch to Develop optimisations for the GO6 and NOC MEDUSA branches

comment:8 Changed 3 years ago by timgraham

Most of these changes look fine to me and as long as they don't change results I'm happy for them to go into the GO6 package branch.

Specific comments:
As discussed it would probably be better if all of the ALLOCATE statements had the same format, i.e. ALLOCATE( zslpx(jpi, jpj, jpk) ) instead of ALLOCATE( zslpx(1:jpi, 1:jpj, 1:jpk) )

comment:9 Changed 3 years ago by frrh

Following review and discussion with Tim as above, updated changes to use explicit range in modified ALLOCATE statements, universally starting from 1, for consistency and the avoidance of doubt.

Repeated test in u-aj369optalloc working copy. All results bit compare with control run and timings are, unusually, very similar to previous test run.

Revision r7680 refers.
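A bit-comparison check of the kind mentioned above can be sketched as a byte-for-byte comparison of the output files from the control and test runs. This is a minimal illustration only. The ticket does not say which tool the suites use, and the function name `bit_compare` and the directory names are hypothetical.

```shell
# bit_compare CONTROL_DIR TEST_DIR
# Succeeds only if every regular file directly under CONTROL_DIR has a
# byte-identical counterpart of the same name under TEST_DIR.
bit_compare() {
    control=$1
    test_dir=$2
    status=0
    for f in "$control"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        # cmp -s is silent and returns non-zero on any difference
        # (a missing counterpart is also reported as DIFF).
        if cmp -s "$f" "$test_dir/$name" 2>/dev/null; then
            echo "OK   $name"
        else
            echo "DIFF $name"
            status=1
        fi
    done
    return $status
}

# Hypothetical usage; directory names are illustrative:
# bit_compare control_run/output optalloc_run/output && echo "bit comparable"
```

Byte identity is a stricter condition than identical field values: file formats that embed metadata such as a creation time may differ byte for byte even when the model fields agree, so a format-aware comparison may be needed for such files.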

comment:10 Changed 3 years ago by frrh

Following review and discussion with Tim, I have now merged this code with the Met Office GO6 package branch at revision r7771.

comment:11 Changed 3 years ago by frrh

  • Resolution set to fixed
  • Status changed from new to closed

Closing this ticket since this particular work is now completed within the original limited scope.

Deleting svn+ssh://forge.ipsl.jussieu.fr/ipsl/forge/projets/nemo/svn/branches/UKMO/MEDUSA_optim_MG_MS_RH

comment:12 Changed 2 years ago by nemo

  • Type changed from Development to Task

Remove 'Development' type

comment:13 Changed 2 years ago by nemo

  • Keywords Misc. added

comment:14 Changed 2 years ago by nemo

  • Keywords Misc. removed