id summary reporter owner description type status priority milestone component version severity resolution keywords cc
2559 dia_ptr optimisation... smasson systeam "
==== Context

Several bugs/issues/optimisations should be addressed in dia_ptr (in addition to #2529).

==== Analysis

Here is a set of points that Amy, Daley, and I noted (in more or less random order...)

1) the same array must not be used as both the input and the output of the ptr_* functions. This bug occurs in 2 lines:

source:browser/NEMO/trunk/src/OCE/DIA/diaptr.F90:208
{{{#!fortran
zmask(1,:,:) = ptr_sjk( zmask(:,:,:), btmsk(:,:,jn) )
}}}
source:browser/NEMO/trunk/src/OCE/DIA/diaptr.F90:319
{{{#!fortran
z2d(:,:) = ptr_ci_2d( z2d(:,:) )
}}}
I guess for zmask, we should simply do as for the other variables:
{{{#!fortran
DO jn = 1, nptr
   z4d1(1,:,:,jn) = ptr_sjk( zmask(:,:,:), btmsk(:,:,jn) )
   DO ji = 2, jpi
      z4d1(ji,:,:,jn) = z4d1(1,:,:,jn)
   ENDDO
ENDDO
}}}

2) the following loops could/should start at 2 and not 1
{{{#!fortran
DO ji = 2, jpi
   xxx(ji,:,:,jn) = xxx(1,:,:,jn)
ENDDO
}}}

3) we should not write the above loops with the “:” notation, as they will be expanded with the ji loop outside of the jj and jk loops… we should write them explicitly:
{{{#!fortran
DO jk = 1, jpk
   DO jj = 1, jpj
      DO ji = 2, jpi
         xxx(ji,jj,jk,jn) = xxx(1,jj,jk,jn)
      ENDDO
   ENDDO
ENDDO
}}}

'''4) Note that we have 3D arrays (jj,jk,jn) that we duplicate jpi times into 4D arrays (ji,jj,jk,jn). The only motivation for this waste of memory is that we will extract only one unique slice (jj,jk,jn) of this 4D array when writing the data with iom_put (see set_grid_znl in iom.F90)…'''

5) dia_ptr uses unnecessary TARGET attributes and pointers:
{{{#!fortran
REAL(wp), TARGET, ALLOCATABLE, SAVE, DIMENSION(:) :: p_fval1d
REAL(wp), TARGET, ALLOCATABLE, SAVE, DIMENSION(:,:) :: p_fval2d
}}}
in ptr_sj_2d, one can simply define
{{{#!fortran
REAL(wp), DIMENSION(jpj) :: p_fval
}}}
and in ptr_sjk, one can simply define
{{{#!fortran
REAL(wp), DIMENSION(jpj,jpk) :: p_fval
}}}

6) ptr_ci_2d does not work when land-only MPI subdomains are suppressed. This can be fixed by creating a North-South communication between subdomains that are separated by suppressed subdomains along the j direction (modification of mppini as it was done for cmpi6).

7) I am not completely sure that the trick for uocetr_vsum_cumul is actually working... maybe...

8) the ijpj variable should be removed and replaced everywhere by jpj

==== Fix

==== Recommendation

This version of dia_ptr originates from the first implementation of XIOS; at that time we did not take care of the memory footprint... Instead of porting dia_ptr to GPU, we could maybe think a little more about how to avoid the array duplication and the waste of memory in this routine…
" Defect new low DIA trunk minor