valgrind HOW-TO

valgrind is a system for debugging and profiling x86-Linux programs, which we use mainly as a memory debugger.

valgrind reports both memory access errors (e.g. dereferencing an uninitialized pointer) and memory leaks (e.g. a new with no matching delete, or a malloc with no matching free). It does so by running the program on a kind of virtual x86 machine in which all memory-related activity is tracked and accounted for. As a result any Linux binary can be run under valgrind, with no special compilation or linking options required. On the other hand, a program run under valgrind can be very slow, even slower than under a debugger, which in practice limits its use to small test cases. Performance improves somewhat when memory leak checking is turned off.

Usage tips

As mentioned before, any Linux executable can be run using valgrind:
valgrind /bin/ls
==14585== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==14585== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==14585== Using valgrind-20031012, a program supervision framework for x86-linux.
==14585== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==14585== Estimated CPU clock rate is 1000 MHz
==14585== For more details, rerun with: -v
==14585==
==14585== Warning: attempt to set SIGKILL handler in __NR_sigaction.
==14585== Warning: attempt to set SIGSTOP handler in __NR_sigaction.
DavidGaudi.txt   bsubtest.sh  elisp             mail       scratch0
Mail             buildGaudi   gccdcadtbug.html  maxidisk   scripts
architectureWWW  c++          gdb-5.3           mirror     tags
atlas            calaf.names  graph.jpg         private    testLeaksResult1
bgl-book.jpg     cielo.fits   higz_windows.dat  public     valgrind
bin              dolink.sh    html              ring.fits  www
==14585==
==14585== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==14585== malloc/free: in use at exit: 13668 bytes in 60 blocks.
==14585== malloc/free: 156 allocs, 96 frees, 29668 bytes allocated.
==14585== For counts of detected errors, rerun with: -v
==14585== searching for pointers to 60 not-freed blocks.
==14585== checked 4141132 bytes.
==14585==
==14585== 13668 bytes in 60 blocks are still reachable in loss record 1 of 1
==14585==    at 0x40029839: malloc (vg_replace_malloc.c:153)
==14585==    by 0x804F28E: (within /bin/ls)
==14585==    by 0x804E811: (within /bin/ls)
==14585==    by 0x804A47E: (within /bin/ls)
==14585==    by 0x80497E2: (within /bin/ls)
==14585==    by 0x42017588: __libc_start_main (in /lib/i686/libc-2.2.5.so)
==14585==
==14585== LEAK SUMMARY:
==14585==    definitely lost: 0 bytes in 0 blocks.
==14585==    possibly lost:   0 bytes in 0 blocks.
==14585==    still reachable: 13668 bytes in 60 blocks.
==14585==         suppressed: 0 bytes in 0 blocks.
==14585==
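
valgrind writes its report to standard error by default, so it can be kept apart from the program's own output with an ordinary redirection. The sketch below shows one possible way to pull the leak totals out of a saved report. The file name valgrind.log and the helper name leak_totals are illustrative choices, not anything valgrind defines, and the patterns only match the report format shown in this document. The sample lines are copied from the /bin/ls run above so that the snippet also runs where valgrind is not installed.

```shell
# In real use the report would come from a run such as
#   valgrind --leak-check=yes ./myprog 2> valgrind.log
# (./myprog is a placeholder). Here the LEAK SUMMARY of the /bin/ls
# run shown above is saved instead.
cat > valgrind.log <<'EOF'
==14585== LEAK SUMMARY:
==14585==    definitely lost: 0 bytes in 0 blocks.
==14585==    possibly lost:   0 bytes in 0 blocks.
==14585==    still reachable: 13668 bytes in 60 blocks.
==14585==         suppressed: 0 bytes in 0 blocks.
EOF

# Hypothetical helper: print "<category>: <bytes>" for each leak category.
# The grep keeps only the summary lines (not the per-block loss records);
# the sed strips the ==pid== prefix and everything after the byte count.
leak_totals() {
  grep -E '== +(definitely lost|possibly lost|still reachable):' "$1" |
    sed -E 's/^==[0-9]+== +//; s/: +([0-9]+) bytes.*/: \1/'
}

leak_totals valgrind.log
```

For the sample above this prints one line per category, the last being "still reachable: 13668".
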
If an executable is compiled with debugging information (for example with the -g option of gcc), valgrind uses that information to pinpoint the sources of possible errors or leaks:
==2741== 6368 bytes in 6 blocks are possibly lost in loss record 12 of 13
==2741==    at 0x40029CF9: __builtin_new (vg_replace_malloc.c:172)
==2741==    by 0x40029D64: operator new(unsigned) (vg_replace_malloc.c:185)
==2741==    by 0x406B118B: std::string::_Rep::_S_create(unsigned, std::allocator const&) (/scratch/happi/GNU.LANG/gcc-alt-3.2/i686-pc-linux-gnu/libstdc++-v3/include/bits/stl_alloc.h:103)
==2741==    by 0x406B1EF4: char* std::string::_S_construct(char const*, char const*, std::allocator const&, std::forward_iterator_tag) (/scratch/happi/GNU.LANG/gcc-alt-3.2/i686-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:150)
==2741==    by 0x406AE1E3: std::string::string(char const*, std::allocator const&) (/scratch/happi/GNU.LANG/gcc-alt-3.2/i686-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:732)
==2741==    by 0x424129FE: JobOptionsSvc::JobOptionsSvc(std::string const&, ISvcLocator*) (../src/JobOptionsSvc/JobOptionsSvc.cpp:25)
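
File names and line numbers like those in the last frame above are only available when the program carries debug information, which for gcc means building with -g. The sketch below builds a deliberately leaky toy program that way. The file leaky.cpp and its contents are invented for illustration, and the compile step is skipped when g++ is not installed.

```shell
# A toy program that allocates an array and never frees it (hypothetical).
cat > leaky.cpp <<'EOF'
int main() {
  int* p = new int[10];   // never deleted: valgrind reports it as a leak
  p[0] = 1;
  return p[0] - 1;        // exit status 0
}
EOF

# -g adds debug information; -O0 keeps the generated code close to the source.
if command -v g++ >/dev/null 2>&1; then
  g++ -g -O0 -o leaky leaky.cpp
fi

# Then run it under valgrind, for example:
#   valgrind --leak-check=yes --num-callers=8 ./leaky
```

With debug information present, the leak report should point at the line containing new int[10] in leaky.cpp rather than at a bare address.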

Useful options

Issuing "valgrind --help" shows an extensive list of options. You need to select
 --leak-check=yes
to enable valgrind's most useful functionality, namely leak checking.

Another very useful option is

 --num-callers=N
which specifies how many callers to show in the stack trace of a leak source or of an access error. The default N=4 is often insufficient to locate the source of a leak. A more reasonable value could be
 
 --num-callers=8

The option

--gdb-attach=yes
will start the gdb debugger when an access error is detected. This is usually overkill, but it makes it possible to debug some of the tougher problems relatively quickly.

Last but not least, programmers who really care about the quality of their code should add

 --show-reachable=yes
This flag instructs valgrind to report memory blocks that were not deallocated during the job but were still reachable at the end of it (meaning there was still an active pointer to them). While this option produces a number of false positives that will need to be suppressed, it helps to locate "slow" memory leaks coming e.g. from static variables, singletons and the like.

The environment variable VALGRIND_OPTS lets a user or group set their preferred options once and for all. For example, in sh/bash/zsh

export VALGRIND_OPTS="--leak-check=yes --num-callers=8 --show-reachable=yes"
turns on the options mentioned so far by default for every run.

Suppression files

As the examples above show, valgrind often reports possible errors and leaks that originate outside the program being debugged, typically in an external library it links against (cernlib, ROOT, but also the C++ standard library). Many of these reports are false positives, and in any event there is usually little that can be done to fix them. To reduce the clutter, valgrind supports suppression files such as atlas.supp. For example, compare the original output of an Atlas unit test, which reports many leak candidates in the dlfcn.h implementation, with the "suppressed" one, which reports only three "interesting" candidates.

Suppressions can be "compiled in" when building valgrind (and many are by default). Problem-specific suppressions are best put in a separate file and can be turned on using the option

--suppressions=[supp-file]
Suppression files can either be hand-written or, far better, auto-generated by valgrind itself using the flag
--gen-suppressions=yes
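
For orientation, a hand-written suppression entry has roughly the shape sketched below. This is an illustration only: the file name my.supp, the entry name and the function and object names are all made up, and the exact kind names accepted after "Memcheck:" depend on the valgrind version, which is one more reason to start from the entries printed by --gen-suppressions=yes.

```shell
# Write a one-entry suppression file (every name inside is hypothetical).
# Layout of an entry:
#   line 1      an arbitrary name for the suppression
#   line 2      tool and error kind, here a leak reported by Memcheck
#   lines 3..   the calling context, innermost frame first, given either
#               as a function name (fun:) or as an object file (obj:)
cat > my.supp <<'EOF'
{
   dlopen_leak_example
   Memcheck:Leak
   fun:malloc
   fun:_dl_map_object
   obj:/lib/ld-2.2.5.so
}
EOF

# Then use it for a run (./myprog is a placeholder):
#   valgrind --leak-check=yes --suppressions=my.supp ./myprog
```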

valgrind at CERN

valgrind is supported by the LCG SPI group as an external library.

valgrind in Atlas

We recommend setting the environment variable VALGRIND_OPTS as described above. We also recommend defining an alias
alias valgrindAtlasSupp='/afs/cern.ch/sw/lcg/external/valgrind/2.0.0/rh73_gcc32/bin/valgrind --suppressions=/afs/cern.ch/sw/lcg/external/valgrind/2.0.0/rh73_gcc32/_SPI/atlas.supp'
and use it when running any non-trivial example or test which uses, for example, ZEBRA libraries.
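
In bash, aliases are normally expanded only in interactive shells, so an alias like the one above is not visible to test scripts. A shell function is one possible alternative. The sketch below wraps the same two AFS paths quoted in the alias; the variable name VG_HOME is an illustrative choice.

```shell
# Installation prefix quoted in the alias above.
VG_HOME=/afs/cern.ch/sw/lcg/external/valgrind/2.0.0/rh73_gcc32

# Same command line as the alias, but also usable from non-interactive scripts.
# "$@" passes the program to run and its arguments through unchanged.
valgrindAtlasSupp() {
  "$VG_HOME/bin/valgrind" --suppressions="$VG_HOME/_SPI/atlas.supp" "$@"
}

# Example (placeholder test name):
#   valgrindAtlasSupp ./MyUnitTest
```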

Links


Draft 1.0, Jan 21 2004 Paolo.Calafiura@cern.ch