valgrind

Timestamp:: 2020-03-19T16:54:30+01:00 (4 years ago)
Author:: luyssaert
Comment:: --

Documentation/UserGuide/valgrind

-                      v2
+                      v3
 == Objective ==
 Background of this item: This is a programmer's worst nightmare: your code is crashing, but the crash is not reproducible.  Even with full debug flags, checking array bounds, uninitialized values. You are seeing errors on lines which cannot possibly contain errors, and when you add a WRITE statement, the error moves to a different line. Sound familiar?  You probably have a memory bug.  One common way these occur is when your program attempts to write to memory outside of the block of memory allocated to a particular variable.  For example, my_vector(1:10), and your code attempts to set the value of my_vector(11). Compiler flags catch a lot of these problems (e.g., check-bounds=all).  But compilers are just tools, written by people, and they may be buggy themselves. They may catch 99.9% of bugs, but leave you hanging on the last 0.1%.  Or, the way the code is structured, the compiler may not be able to tell that memory outside of allocated memory is being used. Valgrind is a powerful tool for cases like this. And, thankfully for us, it is currently installed on Obelix.
+Background of this item: This is a programmer's worst nightmare: your code is crashing, but the crash is not reproducible, even with an executable that was compiled with full debug flags, array-bounds checking, and checks for uninitialized values. You are seeing errors on lines which cannot possibly contain errors, and when you add a WRITE statement, the error moves to a different line. Sound familiar?  You probably have a memory bug.  One common way these occur is when your program attempts to write to memory outside of the block allocated to a particular variable: for example, you declare my_vector(1:10) and your code attempts to set the value of my_vector(11). Compiler flags catch a lot of these problems (e.g., bounds checking with -fcheck=bounds in gfortran or -check bounds in ifort).  But compilers are just tools, written by people, and they may be buggy themselves. They may catch 99.9% of bugs, but leave you hanging on the last 0.1%.  Or the code may be structured in such a way that the compiler cannot tell that memory outside of the allocated block is being used. Valgrind is a powerful tool for cases like this. And, thankfully for us, it is currently installed on Obelix.
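As an illustration of the last case, here is a minimal sketch of an out-of-bounds write that compiler bounds checking may not flag. It is not taken from the model code: the names oob_demo, fill and nvec are hypothetical, and the bug (passing the wrong size) is deliberate. Because the dummy argument is an explicit-shape array, any bounds check inside the subroutine is made against the size the caller claims, not against the size that was actually allocated.

```fortran
program oob_demo
  implicit none
  real, allocatable :: my_vector(:)
  integer :: n

  ! Only 10 elements are really allocated.
  allocate(my_vector(10))
  my_vector = 0.0

  ! Deliberate bug: the caller claims the array holds 11 elements.
  n = 11
  call fill(my_vector, n)

  write(*,*) 'sum of first 10 elements = ', sum(my_vector)
  deallocate(my_vector)

contains

  subroutine fill(vec, nvec)
    integer, intent(in) :: nvec
    ! Explicit-shape dummy: its bounds come from nvec, not from the caller's allocation.
    real, intent(inout) :: vec(nvec)
    integer :: i

    do i = 1, nvec
      ! For i = 11 this writes one element past the allocated block.
      vec(i) = real(i)
    end do
  end subroutine fill

end program oob_demo
```

A program like this can run to completion without any visible error, or crash somewhere unrelated, depending on what happens to sit next to my_vector in memory. A memory checker such as Valgrind would be expected to report the write to element 11 as an invalid write, since it tracks the allocated block itself rather than the declared bounds.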
 == Valgrind on Obelix ==
 Authors: M. !McGrath [[BR]]
 Last revision: M. !McGrath (2014/04/18) [[BR]]
+Last revision: M. !McGrath (2019/03/08) [[BR]]
+Valgrind works by tracking every single memory reference in the code.  As such, it knows if you try to write to (or read from) a bad piece of memory, e.g. something which has already been deallocated or which was never allocated in the first place.
+The downside is that this is really expensive.  Tracking every memory reference requires a lot of computational power and memory.  Expect code run under Valgrind to be at least 10 times slower than a normal debug build (which is already 10 times slower than optimized code), i.e. at least 100 times slower than an optimized run.  Just getting to the first timestep of the model takes about 15 minutes, apparently due to loading in maps.  I ran a 5x6 grid (30 pixels over Europe) for one month, and the run took 3 hours.  I also do not believe it can be run in parallel.
 On the other hand, the information you can get is worth its weight in gold.  Here is an example.