New URL for NEMO forge!   http://forge.nemo-ocean.eu

Since March 2022 along with NEMO 4.2 release, the code development moved to a self-hosted GitLab.
This present forge is now archived and remained online for history.
2020WP/HPC-09_epico_Loop_fusion – NEMO
wiki:2020WP/HPC-09_epico_Loop_fusion

Version 3 (modified by epico, 4 years ago) (diff)

--

Name and subject of the action

Last edition: Wikinfo(changed_ts)? by Wikinfo(changed_by)?

The PI is responsible to closely follow the progress of the action, and especially to contact NEMO project manager if the delay on preview (or review) are longer than the 2 weeks expected.

  1. Summary
  2. Preview
  3. Tests
  4. Review

Summary

Action Loop Fusion
PI(S) I.Epicoco, F.Mele
Digest Adoption of the 'loop fusion' optimization tecnique to reduce the cache misses and to improve the computational performance
Dependencies Extra-halo
Branch source:/NEMO/branches/{YEAR}/dev_r{REV}_{ACTION_NAME}
Previewer(s) TBD
Reviewer(s) TBD
Ticket #2367

Description

Error: Failed to load processor box
No macro or processor named 'box' found

The halo management support has been introduced to provide the developer with a tool for specifying different halo sizes for different NEMO Kernels. As an example, we cite the case of dynspg_ts routine where it could be convenient to enlarge the halo region (up to 5 lines or more) to reduce the communication steps in the time sub-stepping loop. In the advection kernels, like in the MUSCL schema, the use of a halo region 2 lines wide allows to remove the communication which currently happens between the slop calculation and the flux calculation; by means of a 2-lines halo, only an exchange is needed at the beginning of the routine and this, in turn, makes the code ready for further optimizations such as tiling ad loop fusion. Moreover, the size of the halo region could be also tuned according to the size of the sub-domain; it is worth remembering that with a wider halo region the overall communication costs reduce but the computation costs grow.

Implementation

Error: Failed to load processor box
No macro or processor named 'box' found

...

Documentation updates

Error: Failed to load processor box
No macro or processor named 'box' found

...

Preview

Error: Failed to load processor box
No macro or processor named 'box' found

...

Tests

Error: Failed to load processor box
No macro or processor named 'box' found

...

Review

Error: Failed to load processor box
No macro or processor named 'box' found

...

Attachments (1)

Download all attachments as: .zip