Changes between Version 1 and Version 2 of 2020WP/VALID-05_mathiot_debug


Timestamp:
2019-12-18T10:49:41+01:00 (10 months ago)
Author:
mathiot

  • 2020WP/VALID-05_mathiot_debug

    v1 v2  
    32 32 - a checksum to catch reproducibility issues as early as possible. In a test (a change in the order of operations), glob_sum only caught the difference 200 time steps later.
    33 33
    34    The suggested checksum is based on the idea that differences start to appear in the bit at rank 0 and then propagate. So first you need to 'convert' a double to an integer using the TRANSFER function (which gives the integer corresponding to the bit pattern of the corresponding float); then each processor can do its 'local sum' of MOD(idat, ibig_prime_number), with the same modulo being applied after each element is added. This only keeps the bits from rank 0 to X (depending on the prime number chosen). Then a global sum is done.
       34 The suggested checksum is based on the idea that differences start to appear in the bit at rank 0 and then propagate. So first you need to 'convert' a double to an integer using the TRANSFER function (which gives the integer corresponding to the bit pattern of the corresponding float); then each processor can do its 'local sum' of MOD(idat, ibig_prime_number), with the same modulo being applied after each element is added. This only keeps the bits from rank 0 to X (depending on the prime number chosen). Then a global modulo sum is done. Tests have shown that this method catches a change in results as soon as it appears.
    35 35
    36 36 The level at which it should be implemented still needs to be discussed (module level, subroutine level, a namelist parameter such as a debug level, or by code section (tra, dyn, zdf, sbc ...))
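
The checksum steps described above can be sketched as follows. This is only a minimal Python illustration, not the NEMO Fortran implementation: `struct` stands in for the TRANSFER intrinsic, the function names and the prime 2**31 - 1 are assumptions made for the example, and the global reduction across processors is emulated by a plain loop over per-processor lists.

```python
import struct

# Assumed modulus for illustration only (2**31 - 1 is prime);
# the page does not say which prime is actually used.
BIG_PRIME = 2_147_483_647


def double_to_int(x: float) -> int:
    """Reinterpret the 64-bit pattern of a double as an unsigned integer.

    Plays the role of Fortran's TRANSFER; unsigned is used here so that
    the modulo never sees a negative value.
    """
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def local_checksum(values) -> int:
    """Modular sum over one processor's data, reduced after each element."""
    acc = 0
    for v in values:
        acc = (acc + double_to_int(v) % BIG_PRIME) % BIG_PRIME
    return acc


def global_checksum(subdomains) -> int:
    """Emulated global modulo sum over the per-processor local checksums."""
    total = 0
    for chunk in subdomains:
        total = (total + local_checksum(chunk)) % BIG_PRIME
    return total


if __name__ == "__main__":
    # Two mathematically equal results that differ in the last bit
    # because the order of operations was changed.
    a = (0.1 + 0.2) + 0.3
    b = 0.1 + (0.2 + 0.3)
    print(global_checksum([[a]]), global_checksum([[b]]))
```

Because integer addition modulo a fixed number is associative and commutative, the result in this sketch does not depend on how the data are split between processors, whereas a one-bit change in any single value changes the checksum.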