Version 2 (modified by mathiot, 4 years ago)
Name and subject of the action
The PI is responsible for closely following the progress of the action, and especially for contacting the NEMO project manager if the preview (or review) takes longer than the expected 2 weeks.
Summary
Action | VALID-05_mathiot_debug |
---|---|
PI(S) | mathiot |
Digest | Add a debug interface that prints the global min/max/sum and a simple checksum |
Dependencies | If any |
Branch | source:/NEMO/branches/{YEAR}/dev_r{REV}_{ACTION_NAME} |
Previewer(s) | Names |
Reviewer(s) | Names |
Ticket | #XXXX |
Description
The purpose of this action is to add a simple debug interface, debug(ctxt,rvar), that prints a few key numbers when needed:
- the global min/max, to assess whether the values are physical
- the global sum, to spot NaNs as soon as they appear
- a checksum, to catch reproducibility issues as early as possible. In one test (a change in the order of operations), glob_sum caught the difference only 200 time steps later.
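The min/max/sum part of such an interface could look roughly like the sketch below. It is written in Python for brevity rather than in NEMO's Fortran, and it is only an illustration under stated assumptions: the field is passed as one list of values per processor, the MPI reductions are simulated with plain Python reductions, and the `has_nan` flag and the output format are hypothetical. Only the names `debug`, `ctxt` and `rvar` come from the description above.

```python
import math


def debug(ctxt, rvar):
    """Print and return the global min, max and sum of a decomposed field.

    ctxt: free-text label identifying where the call was made.
    rvar: one list of floats per (simulated) processor; this layout is an
          assumption made for the sketch, not the NEMO data structure.
    """
    # Local reductions, as each processor would do on its own subdomain.
    local_min = [min(values) for values in rvar]
    local_max = [max(values) for values in rvar]
    local_sum = [sum(values) for values in rvar]

    # Global reductions (stand-in for an MPI all-reduce).
    gmin = min(local_min)
    gmax = max(local_max)
    gsum = sum(local_sum)

    # A single NaN anywhere in the field propagates into the global sum.
    has_nan = math.isnan(gsum)

    print(f"{ctxt}: min={gmin!r} max={gmax!r} sum={gsum!r} nan={has_nan}")
    return gmin, gmax, gsum, has_nan


debug("tra", [[1.0, 2.0], [3.0, -4.0]])
debug("dyn", [[1.0, float("nan")], [2.0]])
```

The sum is the quantity that flags NaNs here, because min and max comparisons involving NaN do not reliably propagate it.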
The suggested checksum is based on the idea that differences start to appear in the bit at rank 0 and then propagate. So you first need to 'convert' each double to an integer using the TRANSFER function (which gives the integer corresponding to the bit pattern of the float). Each processor can then compute its 'local sum' of MOD(idat,ibig_prime_number), applying the same modulo again after each element is added. This keeps only the bits from rank 0 to X (X depends on the prime number chosen). A global modulo sum is then done. Tests showed that this method catches a change in results as soon as it appears.
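The checksum steps can be sketched as follows, again in Python rather than Fortran and only as an illustration. Assumptions: `struct.pack`/`struct.unpack` stand in for Fortran's TRANSFER, the bit pattern is read as an unsigned 64-bit integer (a signed Fortran integer would need care with the sign of MOD), the prime 2**31 - 1 is an arbitrary choice, and the function names are hypothetical.

```python
import struct

# Hypothetical choice of prime; the action does not specify one.
BIG_PRIME = 2**31 - 1


def bits_as_int(x):
    """Reinterpret the bit pattern of a 64-bit float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def local_checksum(values, prime=BIG_PRIME):
    """Modular sum of the bit patterns on one processor."""
    acc = 0
    for v in values:
        # Reduce each element, then reduce again after the addition.
        acc = (acc + bits_as_int(v) % prime) % prime
    return acc


def global_checksum(rvar, prime=BIG_PRIME):
    """Modular sum of the local checksums (stand-in for an MPI reduction)."""
    total = 0
    for values in rvar:
        total = (total + local_checksum(values, prime)) % prime
    return total


# Same data, two different domain decompositions: identical checksum.
print(global_checksum([[1.0, 2.0], [3.0]]) == global_checksum([[1.0], [2.0, 3.0]]))

# A last-bit difference (0.1 + 0.2 versus 0.3) changes the checksum.
print(global_checksum([[0.1 + 0.2]]) != global_checksum([[0.3]]))
```

Because modular integer addition is exact and order-independent, the result does not depend on the domain decomposition or the summation order, unlike a floating-point global sum.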
The level at which this should be implemented still needs to be discussed (module level, subroutine level, a namelist parameter such as a debug level, or by code section (tra, dyn, zdf, sbc ...)).
Implementation
...
Documentation updates
...
Preview
...
Tests
...
Review
...