Opened 3 months ago

Last modified 6 weeks ago

#2559 new Defect

dia_ptr optimisation...

Reported by: smasson
Owned by: systeam
Priority: low
Milestone:
Component: DIA
Version: trunk
Severity: minor
Keywords:
Cc:
Branch review: failed
MP ready?: no
Task progress: Unspecified

Description

Context

Several bugs and issues should be fixed, and several optimisations made, in dia_ptr (in addition to #2529).

Analysis

Here is a set of points that Amy, Daley, and I noted (in more or less random order…)

1) The same array must not be used as both the input and the output of the ptr_* functions.
This bug occurs in 2 lines:

source:browser/NEMO/trunk/src/OCE/DIA/diaptr.F90:208

zmask(1,:,:) = ptr_sjk( zmask(:,:,:), btmsk(:,:,jn) )

source:browser/NEMO/trunk/src/OCE/DIA/diaptr.F90:319

z2d(:,:) = ptr_ci_2d( z2d(:,:) ) 

I guess for zmask, we should simply do as for the other variables:

DO jn = 1, nptr
   z4d1(1,:,:,jn) = ptr_sjk( zmask(:,:,:), btmsk(:,:,jn) )
   DO ji = 2, jpi
      z4d1(ji,:,:,jn) = z4d1(1,:,:,jn)
   ENDDO
ENDDO

2) The following loops could/should start at 2 and not 1, since the ji = 1 slice already holds the values being copied:

DO ji = 2, jpi
   xxx(ji,:,:,jn) = xxx(1,:,:,jn)
ENDDO

3) We should not write the above loops with the “:” notation, as they will be expanded with the ji loop outside of the jj and jk loops…
We should write them explicitly, with ji as the innermost loop:

DO jk = 1, jpk 
   DO jj = 1, jpj
      DO ji = 2, jpi
         xxx(ji,jj,jk,jn) = xxx(1,jj,jk,jn)
      ENDDO
   ENDDO
ENDDO
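
As a minimal, self-contained sketch of points 2 and 3 (not the actual diaptr.F90 code): the array sizes, the fill values and the program name below are hypothetical, chosen only so that the example compiles and runs on its own. Fortran stores arrays in column-major order, so keeping ji as the innermost loop gives stride-1 memory access.

```fortran
program broadcast_first_index
   implicit none
   integer, parameter :: wp = kind(1.0d0)
   ! Hypothetical small sizes, standing in for the NEMO dimensions
   integer, parameter :: jpi = 4, jpj = 3, jpk = 2, nptr = 2
   real(wp) :: xxx(jpi,jpj,jpk,nptr)
   integer  :: ji, jj, jk, jn

   ! Fill only the ji = 1 slice with arbitrary (made-up) values
   xxx(:,:,:,:) = 0._wp
   do jn = 1, nptr
      do jk = 1, jpk
         do jj = 1, jpj
            xxx(1,jj,jk,jn) = real(jj + 10*jk + 100*jn, wp)
         end do
      end do
   end do

   ! Duplicate the ji = 1 slice along the first dimension.
   ! The ji loop starts at 2 (point 2) and is innermost (point 3).
   do jn = 1, nptr
      do jk = 1, jpk
         do jj = 1, jpj
            do ji = 2, jpi
               xxx(ji,jj,jk,jn) = xxx(1,jj,jk,jn)
            end do
         end do
      end do
   end do

   ! Check that every ji slice now equals the ji = 1 slice
   do ji = 2, jpi
      if ( any( xxx(ji,:,:,:) /= xxx(1,:,:,:) ) ) error stop 'broadcast failed'
   end do
   print *, 'broadcast ok'
end program broadcast_first_index
```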

4) Note that we have 3D arrays (jj,jk,jn) that we duplicate jpi times into 4D arrays (ji,jj,jk,jn). The only motivation for this waste of memory is that we will extract only one unique slice (jj,jk,jn) of this 4D array when writing the data with iom_put (see set_grid_znl in iom.F90)…

5) dia_ptr uses unnecessary TARGET arrays and pointers:

REAL(wp), TARGET, ALLOCATABLE, SAVE, DIMENSION(:)   :: p_fval1d
REAL(wp), TARGET, ALLOCATABLE, SAVE, DIMENSION(:,:) :: p_fval2d

in ptr_sj_2d, one can simply define

REAL(wp), DIMENSION(jpj) :: p_fval

and in ptr_sjk, one can simply define

REAL(wp), DIMENSION(jpj,jpk) :: p_fval
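
A minimal sketch of what point 5 could look like, assuming a simplified stand-in for ptr_sj_2d: the module name, the function name sum_i_2d and the test values are hypothetical, and the MPI reduction and halo handling of the real routine are deliberately left out. The function result is a plain array sized from the input, so no TARGET or POINTER is needed.

```fortran
module zonal_sum_sketch
   implicit none
   integer, parameter :: wp = kind(1.0d0)
contains
   ! Hypothetical stand-in for ptr_sj_2d: masked sum along the i direction.
   ! The result is an ordinary array sized from the input, not a pointer
   ! to a saved module-level TARGET array.
   function sum_i_2d( pva, pmsk ) result( p_fval )
      real(wp), intent(in) :: pva(:,:)     ! field to sum
      real(wp), intent(in) :: pmsk(:,:)    ! mask (0 or 1)
      real(wp) :: p_fval(size(pva,2))
      integer  :: ji, jj

      p_fval(:) = 0._wp
      do jj = 1, size(pva,2)
         do ji = 1, size(pva,1)
            p_fval(jj) = p_fval(jj) + pva(ji,jj) * pmsk(ji,jj)
         end do
      end do
   end function sum_i_2d
end module zonal_sum_sketch

program demo_zonal_sum
   use zonal_sum_sketch
   implicit none
   real(wp) :: va(3,2), msk(3,2), res(2)

   va(:,:)  = 1._wp
   msk(:,:) = 1._wp
   msk(1,1) = 0._wp          ! mask out one point in the first row

   ! Input and output are distinct arrays (see point 1)
   res(:) = sum_i_2d( va, msk )

   if ( abs(res(1) - 2._wp) > 1.e-12_wp .or. &
      & abs(res(2) - 3._wp) > 1.e-12_wp ) error stop 'unexpected sum'
   print *, 'masked sums along i: ', res
end program demo_zonal_sum
```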

6) ptr_ci_2d does not work when land-only MPI subdomains are suppressed. This can be fixed by creating a North-South communication between subdomains that are separated by suppressed subdomains along the j direction (modification of mppini as it was done for cmpi6).

7) I am not completely sure that the trick for uocetr_vsum_cumul actually works… maybe…

8) The ijpj variable should be removed and replaced everywhere by jpj.

Fix

Recommendation

This version of dia_ptr originates from the first implementation of XIOS; at that time we did not take care of the memory footprint…

Instead of porting dia_ptr to GPU, we could maybe think a little more about how to avoid the array duplication and the waste of memory in this routine…

Commit History (0)

(No commits)

Change History (2)

comment:1 Changed 2 months ago by hadcv

I address points 1 (the first bug), 5 and 8 in my reorganisation of diaptr.F90 for the tiling:

http://forge.ipsl.jussieu.fr/nemo/log/NEMO/branches/2020/dev_r13383_HPC-02_Daley_Tiling/src/OCE/DIA/diaptr.F90

Point 2 will be addressed in my next commit.

comment:2 Changed 6 weeks ago by hadcv

Points 1 (first bug only), 2, 5 and 8 are addressed in [13982]

Last edited 6 weeks ago by hadcv