Name and subject of the action
The PI is responsible for closely following the progress of the action, and in particular for contacting the NEMO project manager if the preview (or review) takes longer than the expected 2 weeks.
Summary
Action  Implement 2D tiling (with the LFRA version of NEMO) 

PI(S)  Daley Calvert, Andrew Coward 
Digest  Implement 2D tiling to reduce traffic between main memory and L3 cache 
Dependencies  DO loop macros (2020WP/KERNEL02_Coward_DoLoopMacros_part1), extended haloes (Italo Epicoco, Seb Masson and Francesca Mele), extension of XIOS to accept 2D tiles of data (Yann Meurdesoif & Seb Masson) 
Branch  source:/NEMO/branches/{YEAR}/dev_r{REV}_{ACTION_NAME} 
Previewer(s)  Gurvan Madec 
Reviewer(s)  Gurvan Madec 
Ticket  #2365 
Description
Implement loop tiling over horizontal dimensions (i and j).
Implementation
A trial implementation of tiling for tra_ldf_iso is described in this document. It will be revised as described here.
The main code changes in the preferred approach (using public variables) are described below.
Summary of method
The full processor domain (1:jpi, 1:jpj) is split into one or more subdomains (tiles).
To work on a tile, the DO loop macros in do_loop_substitute are modified to use a new set of domain indices. A new subroutine DOM/domain/dom_tile sets the values of these indices and is also used to initialise the tile to the full domain in DOM/domain/dom_init.
A loop over tiles is implemented at the timestepping level in OCE/step/stp. The domain indices for the tile subdomain are set within this loop by dom_tile, then 'unset' (set back to the full domain) after exiting the loop. All DO loops within the tiling loop therefore work on the current tile, instead of the full processor domain.
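The set/unset pattern described above can be illustrated with a small sketch. It is written in Python rather than Fortran and is not taken from the NEMO source: the dictionary `dom`, the hand-written `tiles` table, the generator `do_2d` (standing in for a DO loop macro), the routine `kernel`, and the convention that `dom_tile(0)` restores the full domain are all assumptions made for illustration.

```python
# Illustrative Python sketch, not NEMO code. Indices are 1-based and inclusive.
jpi, jpj = 6, 4  # assumed size of the full processor domain

# Stand-in for the public domain indices, initialised to the full domain
dom = {"ntsi": 1, "ntei": jpi, "ntsj": 1, "ntej": jpj}

# Assumed decomposition: two tiles that split the i direction in half,
# stored as (ntsi, ntei, ntsj, ntej)
tiles = {1: (1, 3, 1, 4), 2: (4, 6, 1, 4)}


def dom_tile(ntile):
    """Set the domain indices for tile `ntile`.

    Passing 0 restores the full domain (a convention of this sketch only).
    """
    bounds = (1, jpi, 1, jpj) if ntile == 0 else tiles[ntile]
    dom["ntsi"], dom["ntei"], dom["ntsj"], dom["ntej"] = bounds


def do_2d():
    """Stand-in for a 2D DO loop macro: bounds come from the domain indices."""
    for jj in range(dom["ntsj"], dom["ntej"] + 1):
        for ji in range(dom["ntsi"], dom["ntei"] + 1):
            yield ji, jj


def kernel(field):
    """Example routine: it never mentions tiles, only the loop macro."""
    for ji, jj in do_2d():
        field[jj - 1][ji - 1] += 1


def stp(field):
    """Stand-in for the timestepping routine containing the tiling loop."""
    for jtile in sorted(tiles):
        dom_tile(jtile)  # set the indices to the current tile
        kernel(field)    # all loops inside now work on this tile
    dom_tile(0)          # 'unset': back to the full domain


field = [[0] * jpi for _ in range(jpj)]
stp(field)
```

After `stp` returns, every point of `field` has been visited exactly once and `dom` again describes the full domain, so code outside the tiling loop is unaffected.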
The number of tiles is determined by the tile lengths nn_tile_i and nn_tile_j, defined in a new namelist namtile, relative to the size of the full domain.
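As a rough illustration of how the tile counts could follow from the tile lengths, the Python sketch below uses ceiling division, so that a domain size that is not a multiple of the tile length gets one shorter tile at the end. The rounding rule and the function name `tile_counts` are assumptions; the source does not state how the trial branch handles a remainder.

```python
def tile_counts(jpi, jpj, nn_tile_i, nn_tile_j):
    """Return (jpnitile, jpnjtile, jpnijtile) for a jpi x jpj domain.

    Assumes ceiling division: a partial tile at the end still counts.
    """
    jpnitile = -(-jpi // nn_tile_i)  # ceiling of jpi / nn_tile_i
    jpnjtile = -(-jpj // nn_tile_j)  # ceiling of jpj / nn_tile_j
    return jpnitile, jpnjtile, jpnitile * jpnjtile


# Example: a 25 x 32 domain with 10 x 10 tiles
counts = tile_counts(25, 32, 10, 10)  # (3, 4, 12)
```

Under this assumption, a tile length larger than the domain gives a single tile in that direction.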
Branch
dev_r12745_HPC02_Daley_Tiling_trial_public
New subroutines
 dom_tile  Set domain indices
Modified modules
NOTE: the number of affected modules is expected to be much larger in the final implementation
 cfgs/SHARED/namelist_ref  Add namelist namtile
 OCE/DOM/dom_oce  Declare namelist variables
 OCE/DOM/domain  Read namtile namelist and calculate tiling decomposition, add dom_tile, initialise domain indices
 OCE/TRA/traldf  Changes to account for domain indices
 OCE/TRA/traldf_iso  Changes to account for domain indices
 OCE/do_loop_substitute  Implement domain indices
 OCE/par_oce  Declare domain indices and tiling decomposition parameters
 OCE/step  Add tiling loop and set domain indices using dom_tile
 OCE/step_oce  Import dom_tile
Variables
 Global variables
 ntsi, ntsj  Start index of tile
 ntei, ntej  End index of tile
 ntsim1, ntsjm1  Start index of tile, minus 1
 nteip1, ntejp1  End index of tile, plus 1
 ntile  Tile number
 Parameters
 jpnitile, jpnjtile, jpnijtile  Number of tiles (in i, in j, total)
 Loop indices
 jtile  Loop over tiles
 Namelist
 ln_tile  Logical control on use of tiling
 nn_tile_i, nn_tile_j  Tile length (in i, in j)
 Preprocessor macros
 IND_2D  Substitution for ALLOCATE or DIMENSION arguments
 Working variables
 iitile, ijtile  Tile number (in i, in j)
 Dummy arguments
 kntile  Tile number (ntile)
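To show how the variables listed above might relate to one another, the Python sketch below derives the start and end indices of a tile from its number. It is a guess at the arithmetic, not the branch's code: it assumes tiles are numbered row by row starting from 1, that iitile and ijtile are the tile's position in i and j, that indices are 1-based and inclusive, and that the last tile in each direction is truncated at the domain edge. The function name `tile_indices` is hypothetical.

```python
def tile_indices(kntile, jpi, jpj, nn_tile_i, nn_tile_j):
    """Return the domain indices of tile `kntile` as a dict (1-based, inclusive)."""
    jpnitile = -(-jpi // nn_tile_i)         # tiles in i (ceiling division)

    # Assumed numbering: tile 1 is bottom-left, numbers increase along i first
    iitile = (kntile - 1) % jpnitile + 1    # position of the tile in i
    ijtile = (kntile - 1) // jpnitile + 1   # position of the tile in j

    ntsi = (iitile - 1) * nn_tile_i + 1
    ntsj = (ijtile - 1) * nn_tile_j + 1
    ntei = min(ntsi + nn_tile_i - 1, jpi)   # truncate at the domain edge
    ntej = min(ntsj + nn_tile_j - 1, jpj)

    return {
        "ntsi": ntsi, "ntsj": ntsj, "ntei": ntei, "ntej": ntej,
        "ntsim1": ntsi - 1, "ntsjm1": ntsj - 1,
        "nteip1": ntei + 1, "ntejp1": ntej + 1,
    }


# Example: tile 5 of a 25 x 32 domain with 10 x 10 tiles
idx = tile_indices(5, 25, 32, 10, 10)
# ntsi=11, ntei=20, ntsj=11, ntej=20, ntsim1=10, nteip1=21
```

The "minus 1" and "plus 1" variants are simple offsets of the start and end indices; precomputing them presumably keeps the DO loop macros free of arithmetic, though the source does not say so.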
Namelist
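!-----------------------------------------------------------------------
&namtile        !   parameters of the tiling
!-----------------------------------------------------------------------
   ln_tile   = .false.   !  Use tiling (T) or not (F)
   nn_tile_i = 10        !  Length of tiles in i
   nn_tile_j = 10        !  Length of tiles in j
/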
Documentation updates
Preview
Tests
Review
