Changes between Version 2 and Version 3 of Documentation/UserGuide/valgrind


Timestamp:
2020-03-19T16:54:30+01:00
Author:
luyssaert
Comment:

--

  • Documentation/UserGuide/valgrind

    v2 v3  
    33 
    44== Objective == 
    5 Background of this item: This is a programmer's worst nightmare: your code is crashing, but the crash is not reproducible.  Even with full debug flags, checking array bounds, uninitialized values. You are seeing errors on lines which cannot possibly contain errors, and when you add a WRITE statement, the error moves to a different line. Sound familiar?  You probably have a memory bug.  One common way these occur is when your program attempts to write to memory outside of the block of memory allocated to a particular variable.  For example, my_vector(1:10), and your code attempts to set the value of my_vector(11). Compiler flags catch a lot of these problems (e.g., check-bounds=all).  But compilers are just tools, written by people, and they may be buggy themselves. They may catch 99.9% of bugs, but leave you hanging on the last 0.1%.  Or, the way the code is structured, the compiler may not be able to tell that memory outside of allocated memory is being used. Valgrind is a powerful tool for cases like this. And, thankfully for us, it is currently installed on Obelix.  
     5Background of this item: This is a programmer's worst nightmare: your code is crashing, but the crash is not reproducible, even with an executable compiled with full debug flags, array-bounds checking, and checks for uninitialized values. You are seeing errors on lines which cannot possibly contain errors, and when you add a WRITE statement, the error moves to a different line. Sound familiar?  You probably have a memory bug.  One common way these occur is when your program attempts to write to memory outside of the block allocated to a particular variable.  For example, the array is declared as my_vector(1:10), and your code attempts to set the value of my_vector(11). Compiler flags catch a lot of these problems (e.g., check-bounds=all).  But compilers are just tools, written by people, and they may be buggy themselves. They may catch 99.9% of bugs, but leave you hanging on the last 0.1%.  Or, because of the way the code is structured, the compiler may not be able to tell that memory outside of the allocated block is being used. Valgrind is a powerful tool for cases like this. And, thankfully for us, it is currently installed on Obelix.  
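As a minimal sketch of that last case, the hypothetical Fortran program below passes a 10-element array to a subroutine that declares it as assumed-size. It is not taken from the model source, and the names oob_demo and fill are purely illustrative. Inside the subroutine the compiler does not know the extent of the array, so run-time bounds checking generally has nothing to check against, and the write to element 11 goes through.

```fortran
! Hypothetical demo (not from the model source): an out-of-bounds write
! that bounds-checking flags typically cannot detect.
PROGRAM oob_demo
  IMPLICIT NONE
  REAL, ALLOCATABLE :: my_vector(:)

  ALLOCATE(my_vector(10))
  my_vector(:) = 0.0

  ! Deliberate bug: the array holds 10 elements, but we claim 11.
  CALL fill(my_vector, 11)

  WRITE(*,*) 'sum = ', SUM(my_vector)
  DEALLOCATE(my_vector)

CONTAINS

  SUBROUTINE fill(vec, n)
    INTEGER, INTENT(IN) :: n
    REAL, INTENT(INOUT) :: vec(*)   ! assumed-size: extent is unknown here
    INTEGER :: i

    DO i = 1, n
       vec(i) = REAL(i)             ! i = 11 writes past the allocation
    END DO
  END SUBROUTINE fill

END PROGRAM oob_demo
```

Built with debug flags, a program like this may run to completion without any message, or crash somewhere unrelated. Run under Valgrind's default Memcheck tool, a write like this would normally be reported as an invalid write just past the end of the allocated block.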
    66 
    77== Valgrind on Obelix ==  
    88Authors: M. !McGrath [[BR]]  
    9 Last revision: M. !McGrath (2014/04/18) [[BR]] 
     9Last revision: M. !McGrath (2019/03/08) [[BR]] 
    1010 
    1111 
    12  
    13  
    14  
    15  
    16 Valgrind works by tracking every single piece of memory reference in the code.  As such, it knows if you try to write to (or read from) a bad piece of memory, e.g. something which has already been deallocated or which was never allocated in the first place. 
    17  
    18 The downside is that this is really expensive.  Tracking every piece of memory requires a lot of computational power and memory.  Expect a code run with Valgrind to be at least 10 times slower than a normal debug code (which is already 10 times slower than optimized code).  Just getting to the first timestep of the model takes about 15 minutes, due to loading in maps (from what I can tell).  I ran a 5x6 grid (30 pixels over Europe) for one month, and the run took 3 hours.  I also do not believe it can be run in parallel. 
     12Valgrind works by tracking every single memory reference in the code.  As such, it knows if you try to write to (or read from) a bad piece of memory, e.g. something which has already been deallocated or which was never allocated in the first place. The downside is that this is really expensive: tracking every memory reference requires a lot of computational power and memory.  Expect a run under Valgrind to be at least 10 times slower than a normal debug build (which is already 10 times slower than optimized code), i.e. at least 100 times slower than an optimized run. Just getting to the first timestep of the model takes about 15 minutes, due to loading in maps (from what I can tell).  I ran a 5x6 grid (30 pixels over Europe) for one month, and the run took 3 hours.  I also do not believe it can be run in parallel. 
    1913 
    2014On the other hand, the information you can get is worth its weight in gold.  Here is an example.