Changes between Version 1 and Version 2 of 2020WP/VALID05_mathiot_debug
 Timestamp:
 20191218T10:49:41+01:00 (10 months ago)
Legend:
 Unmodified
 Added
 Removed
 Modified

2020WP/VALID05_mathiot_debug
v1 v2 32 32  a checksum to catch reproducibility issue as soon as possible. In a test (change of operation order), the glob_sum catch difference 200 time step later. 33 33 34 The suggested check_sum is based on the idea start to appear in the bit at rank 0 and then propagate. So first you need to 'convert' a double to integer using the TRANSFERT function (print the integer corresponding to the bit pattern of the corresonding float), then each processor can do it 'local sum' of MOD(idat,ibig_prime_number) with the same modulo being used after the sum of each element. This only keep the bit from rank 0 to X (depending of the prime number choosen). Then a global sum is done.34 The suggested check_sum is based on the idea start to appear in the bit at rank 0 and then propagate. So first you need to 'convert' a double to integer using the TRANSFERT function (print the integer corresponding to the bit pattern of the corresonding float), then each processor can do it 'local sum' of MOD(idat,ibig_prime_number) with the same modulo being used after the sum of each element. This only keep the bit from rank 0 to X (depending of the prime number choosen). Then a global modulo sum is done. Test shown that this method catch change of results as soon as it appears. 35 35 36 36 Discussion at which level it should be implemented need to be discussed (module level, subroutine level, namelist parameter as dbg lvl or by code section (tra, dyn, zdf, sbc ...))