A Cross-Platform Memory Leak Detector

Memory leakage has been a permanent annoyance for C/C++ programmers. Under MSVC, one useful feature of MFC is report memory leaks at the exit of an application (to the debugger output window, which can be displayed by the integration environment or a debugger). Under GCC, current available tools like mpatrol are relatively difficult to use, or have a big impact on memory/performance. This article details the implementation of an easy-to-use, cross-platform C++ memory leak detector (which I call debug_new), and discusses the related technical issues.

Basic usage

Let’s look at the following simple program test.cpp:

int main()
{
    int* p1 = new int;
    char* p2 = new char[10];
    return 0;
}

Our basic objectives are, of course, report two memory leaks. It is very simple: just compile and link debug_new.cpp. For example:

cl -GX test.cpp debug_new.cpp (MSVC)
g++ test.cpp debug_new.cpp -o test (GCC)

The running output is like follows:

Leaked object at 00341008 (size 4, <Unknown>)
Leaked object at 00341CA0 (size 10, <Unknown>)

If we need clearer reports, it is also trivial: just put this at the front of test.cpp:

#include "debug_new.h"

The output after adding this line is:

Leaked object at 00340FB8 (size 10, test5.cpp:5)
Leaked object at 00340F80 (size 4, test5.cpp:4)

Very simple, isn’t it?

Background knowledge

In a new/delete operation, C++ compilers generates calls to operator new and operator delete (allocation and deallocation functions) for the user. The prototypes of operator new and operator delete are as follows:

void* operator new(size_t) throw(std::bad_alloc);
void* operator new[](size_t) throw(std::bad_alloc);
void operator delete(void*) throw();
void operator delete[](void*) throw();

For new int, the compiler will generate a call to “operator new(sizeof(int))”, and for new char[10], “operator new(sizeof(char) * 10)”. Similarly, for delete ptr and delete[] ptr, the compiler will generate calls to “operator delete(ptr)” and “operator delete[](ptr)”. When the user does not define these operators, the compiler will provide their definitions automatically; when the user do define them, they will override the ones the compiler provides. And we thus get the ability to trace and control dynamic memory allocation.

In the meanwhile, we can adjust the behaviour of new operators with new-placements, which are to supply additional arguments to the allocation functions. E.g., when we have a prototype

void* operator new(size_t size, const char* file, int line);

we may use new ("hello", 123) int to generate a call to “operator new(sizeof(int), "hello", 123)”. This can be very flexible. One placement allocation function that the C++ standard ([C++1998]) requires is

void* operator new(size_t size, const std::nothrow_t&) throw();

in which nothrow_t is usually an empty structure (defined as “struct nothrow_t {};”), whose sole purpose is to provide a type that the compiler can identify for overload resolution. Users can call it via new (std::nothrow) type (nothrow is a constant of type nothrow_t). The difference from the standard new is that when memory allocation fails, new will throw an exception, but new(std::nothrow) will return a null pointer.

One thing to notice is that there is not a corresponding syntax like delete(std::nothrow) ptr. However, a related issue will be mentioned later in this article.

For more information about the above-mentioned C++ language features, please refer to [Stroustrup1997], esp. sections 6.2.6, 10.4.11, 15.6, 19.4.5, and B.3.4. These features are key to understanding the implementation described below.

Principle and basic implementation

Similar to some other memory leakage detectors, debug_new overrides operator new, and provides macros to do substitues in user’s programs. The relevant part in debug_new.h is as follows:

void* operator new(size_t size, const char* file, int line);
void* operator new[](size_t size, const char* file, int line);
#define DEBUG_NEW new(__FILE__, __LINE__)
#define new DEBUG_NEW

Let’s look at the test.cpp after including debug_new.h: new char[10] will become “new("test.cpp", 4) char[10]” after preprocessing, and the compiler will generate a call to “operator new[](sizeof(char) * 10, "test.cpp", 4)” accordingly. If I define “operator new(size_t, const char*, int)” and “operator delete(void*)” (as well as “operator new[]...” and “operator delete[]...”; for clarity, my discussions about operator new and operator delete also cover operator new[] and operator delete[] without mentioning specifically, unless noted otherwise) in debug_new.cpp, I can trace all dynamic memory allocation/deallocation calls, and check for unmatched news and deletes. The implementation may be as simple as using just a map: add a pointer to map in new, and delete the pointer and related information in delete; report wrong deleting if the pointer to delete does not exist in the map; report memory leaks if there are still pointers to delete in the map at program exit.

However, it will not work if debug_new.h is not included. And the case that some translation units include debug_new.h and some do not are unacceptable, for although two operator news are used — “operator new(size_t, const char*, int)” and “operator new(size_t)” — there is only one operator delete! The operator delete we define will consider it an invalid pointer, when given a pointer returned by “operator delete(void*)” (no information about it exists in the map). We are facing a dilemma: either to misreport in this case, or not to report when deleting a pointer twice: none is satisfactory behaviour.

So defining the global “operator new(size_t)” is inevitable. In debug_new.h, I have

void* operator new(size_t size)
{
    return operator new(size, "<Unknown>", 0);
}

Implement the memory leak detector as I have described, you will find it works under some environments (say, GCC 2.95.3 w/ SGI STL), but crashes under others (MSVC 6 is among them). The reason is not complicated: memory pools are used in SGI STL, and only large chunks of memory will be allocated by operator new; in STL implementations which do not utilize such mechanisms, adding data to map will cause a call to operator new, which will add data to map, and this dead loop will immediately cause a stack overflow that aborts the application. Therefore I have to stop using the convenient STL container and resort to my own data structure:

struct new_ptr_list_t
{
    new_ptr_list_t*     next;
    const char*         file;
    int                 line;
    size_t              size;
};

Every time one allocates memory via new, sizeof(new_ptr_list_t) more bytes will be allocated when calling malloc. The memory blocks will be chained together as a linked list (via the next field), the file name, line number, and object size will be stored in the file, line, and size fields, and return (pointer-returned-by-malloc + sizeof(new_ptr_list_t)). When one deletes a pointer, it will be matched with those in the linked list. If it does match — pointer-to-delete == (char*)pointer-in-linked-list + sizeof(new_ptr_list_t) — the linked list will be adjusted and the memory deallocated. If no match is found, a message of deleting an invalid pointer will be printed and the application will be aborted.

In order to automatically report memory leaks at program exit, I construct a static object (C++ ensures that its constructor will be called at program initialization, and the destructor be called at program exit), whose destructor will call a function to check for memory leaks. Users are also allowed to call this function manually.

Thus is the basic implementation.

Improvements on usability

The above method worked quite well, until I began to create a large number of objects. Since each delete needed to search in the linked list, and the average number of searches was a half of the length of the linked list, the application soon crawled. The speed was too slow even for the purpose of debugging. So I made a modification: the head of the linked list is changed from a single pointer to an array of pointers, and which element a pointer belongs to depends on its hash value. — Users are allowed to change the definitions of _DEBUG_NEW_HASH and _DEBUG_NEW_HASHTABLESIZE (at compile-time) to adjust the behaviour of debug_new. Their current values are what I feel satisfactory after some tests.

I found in real use that under some special circumstances the pointers to file names can become invalid (check the comment in debug_new.cpp if you are interested). Therefore, currently the default behaviour of debug_new is copying the first 20 characters of the file name, instead of storing the pointer to the file name. Also notice that the length of the original new_ptr_list_t is 16 bytes, and the current length is 32 bytes: both can ensure correct memory alignments.

In order to ensure debug_new can work with new(std::nothrow), I overloaded “void* operator new(size_t size, const std::nothrow_t&) throw()” too; otherwise the pointer returned by a new(std::nothrow) will be considered an invalid pointer to delete. Since debug_new does not throw exceptions (the program will report an alert and abort when memory is insufficient), this overload just calls operator new(size_t). Very simple.

It has been mentioned previously that a C++ file should include debug_new.h to get an accurate memory leak report. I usually do this:

#ifdef _DEBUG
#include "debug_new.h"
#endif

The include position should be later than the system headers, but earlier than user’s own header files if possible. Typically debug_new.h will conflict with STL header files if included earlier. Under some circumstances one may not want debug_new to redefine new; it could be done by defining _DEBUG_NEW_REDEFINE_NEW to 0 before including debug_new.h. Then the user should also use DEBUG_NEW instead of new. Maybe one should write this in the source:

#ifdef _DEBUG
#define _DEBUG_NEW_REDEFINE_NEW 0
#include "debug_new.h"
#else
#define DEBUG_NEW new
#endif

and use DEBUG_NEW where memory tracing is needed (consider global substitution).

Users might choose to define _DEBUG_NEW_EMULATE_MALLOC, and debug_new.h will emulate malloc and free with debug_new and delete, causing malloc and free in a translation unit including debug_new.h to be traced. Three global variables are used to adjust the behaviour of debug_new: new_output_fp, default to stderr, is the stream pointer to output information about memory leaks (traditional C streams are preferred to C++ iostreams since the former is simpler, smaller, and has a longer and more predictable lifetime); new_verbose_flag, default to false, will cause every new/delete to output trace messages when set to true; new_autocheck_flag, default to true (which will cause the program to call check_leaks automatically on exit), will make users have to call check_leaks manually when set to false.

One thing to notice is that it might be impossible to ensure that the destruction of static objects occur before the automatic check_leaks call, since the call itself is issued from the destructor of a static object in debug_new.cpp. I have used several techniques to better the case. For MSVC, it is quite straightforword: “#pragma init_seg(lib)” is used to adjust the order of object construction/destruction. For other compilers without such a compiler directive, I use a counter class as proposed by Bjarne ([Stroustrup1997], section 21.5.2) and can ensure check_leaks will be automatically called after the destruction of all objects defined in translation units that include debug_new.h. For static objects defined in C++ libraries instead of the user code, there is a last resort: new_verbose_flag will be set to true after the automatic check_leaks call, so that all later delete operations along with number of bytes still allocated will be printed. Even if there is a misreport on memory leakage, we can manually confirm that no memory leakage happens if the later deletes finally report that “0 bytes still allocated”.

Debug_new will report on deleteing an invalid pointer (or a pointer twice), as well as on mismatches of new/delete[] or new[]/delete. A diagnostic message will be printed and the program will abort.

Exception safety and thread safety are worth their separate sections. Please read on.

Exception in the constructor

Let’s look at the following simple program:

#include <stdexcept>
#include <stdio.h>

void* operator new(size_t size, int line)
{
    printf("Allocate %u bytes on line %d\n", size, line);
    return operator new(size);
}

class Obj {
public:
    Obj(int n);
private:
    int _n;
};

Obj::Obj(int n) : _n(n)
{
    if (n == 0) {
        throw std::runtime_error("0 not allowed");
    }
}

int main()
{
    try {
        Obj* p = new(__LINE__) Obj(0);
        delete p;
    } catch (const std::runtime_error& e) {
        printf("Exception: %s\n", e.what());
    }
}

Any problems seen? In fact, if we compile it with MSVC, the warning message already tells us what has happened:

test.cpp(27) : warning C4291: 'void *__cdecl operator new(unsigned int,int)' : no matching operator delete found; memory will not be freed if initialization throws an exception

Try compiling and linking debug_new.cpp also. The result is as follows:

Allocate 4 bytes on line 27
Exception: 0 not allowed
Leaked object at 00341008 (size 4, <Unknown>)

There is a memory leak!

Of course, this might not be a frequently encountered case. However, who can ensure that the constructors one uses never throw an exception? And the solution is not complicated; it just asks for a compiler that conforms well to the C++ standard and allows the definition of a placement deallocation function ([C++1998], section 5.3.4; drafts of the standard might be found on the Web, such as here). Of compilers I have tested, GCC (2.95.3 or higher) and MSVC (6.0 or higher) support this feature quite well, while Borland C++ Compiler 5.5.1 and Digital Mars C++ compiler (all versions up to 8.38) do not. In the example above, if the compiler supports, we should declare and implement an “operator delete(void*, int)” to recycle the memory allocated by new(__LINE__); if the compiler does not, macros need to be used to make the compiler ignore the relevant declarations and implementations. To make debug_new compile under such a non-conformant compiler, users need to define the macro HAS_PLACEMENT_DELETE (Update: The macro name is HAVE_PLACEMENT_DELETE from Nvwa version 0.8) to 0, and take care of the exception-in-constructor problem themselves. I wish you did not have to do this, since in that case your compiler is really out of date!

Thread safety

My original version of debug_new was not thread-safe. There were no synchronization primitives in the standard C++ language, and I was unwilling to rely on a bulky third-party library. At last I decided to write my own thread-transparency layer, and the current debug_new relies on it. This layer is thin and simple, and its interface is as follows:

class fast_mutex
{
public:
    void lock();
    void unlock();
};

It supports POSIX threads and Win32 threads currently, as well as a no-threads mode. Unlike Loki ([Alexandrescu2001]) and some other libraries, threading mode is not to be specified in the code, but detected from the environment. It will automatically switch on multi-threading when the -MT/-MD option of MSVC, the -mthreads option of MinGW GCC, or the -pthread option of GCC under POSIX environments, is used. One advantage of the current implementation is that the construction and destruction of a static object using a static fast_mutex not yet constructed or already destroyed are allowed to work (with lock/unlock operations ignored), and there are re-entry checks for lock/unlock operations when the preprocessing symbol _DEBUG is defined.

Directly calling lock/unlock is error-prone, and I generally use an RAII (resource acquisition is initialization; [Stroustrup1997], section 14.4.1) helper class. The code is short and I list it here in full:

class fast_mutex_autolock
{
    fast_mutex& _M_mtx;
public:
    explicit fast_mutex_autolock(fast_mutex& __mtx) : _M_mtx(__mtx)
    {
        _M_mtx.lock();
    }
    ~fast_mutex_autolock()
    {
        _M_mtx.unlock();
    }
private:
    fast_mutex_autolock(const fast_mutex_autolock&);
    fast_mutex_autolock& operator=(const fast_mutex_autolock&);
};

I am quite satisfied with this implementation and its application in the current debug_new.

Special improvement with gcc/binutils

Using macros has intrinsic problems: it cannot work directly with placement new, for it is not possible to expand an expression like “new(special) MyObj” to record file/line information without prior knowledge of the “special” stuff. What is more, the definition of per-class operator new will not work since the preprocessed code will be like “void* operator new("some_file.cpp", 123)(size_t ...)” — the compiler will not love this.

The alternative is to store the instruction address of the caller of operator new, and look up for the source line if a leak is found. Obviously, there are two things to do:

There is no portable way to achieve these, but the necessary support has already been there for ready use if the GNU toolchain is used. Let’s just look at some GNU documentation:

`__builtin_return_address (LEVEL)'
     This function returns the return address of the current function,
     or of one of its callers.  The LEVEL argument is number of frames
     to scan up the call stack.  A value of `0' yields the return
     address of the current function, a value of `1' yields the return
     address of the caller of the current function, and so forth.

     The LEVEL argument must be a constant integer.

     On some machines it may be impossible to determine the return
     address of any function other than the current one; in such cases,
     or when the top of the stack has been reached, this function will
     return `0'.
(gcc info page)
addr2line
*********

     addr2line [ -b BFDNAME | --target=BFDNAME ]
               [ -C | --demangle[=STYLE ]
               [ -e FILENAME | --exe=FILENAME ]
               [ -f | --functions ] [ -s | --basename ]
               [ -H | --help ] [ -V | --version ]
               [ addr addr ... ]

   `addr2line' translates program addresses into file names and line
numbers.  Given an address and an executable, it uses the debugging
information in the executable to figure out which file name and line
number are associated with a given address.

   The executable to use is specified with the `-e' option.  The
default is the file `a.out'.
(binutils info page)

So the implementation is quite straightforward and like this:

void* operator new(size_t size) throw(std::bad_alloc)
{
    return operator new(size, __builtin_return_address(0), 0);
}

When a leak is found, debug_new will try to convert the stored caller address to the source position by popening an addr2line process, and display it if something useful is returned (it should be the case if debugging symbols are present); otherwise the stored address is displayed. One thing to notice is that one must tell debug_new the path/name of the process to make addr2line work. I have outlined the ways in the doxygen documentation.

If you have your own routines to get and display the caller address, it is also easy to make debug_new work with it. You may check the source code for details. Look for _DEBUG_NEW_CALLER_ADDRESS and print_position_from_addr.

Important update in 2007

With an idea coming from Greg Herlihy’s post in comp.lang.c++.moderated, a better solution is implemented. Instead of defining new to “new(__FILE__, __LINE__)”, it is now defined to “__debug_new_recorder(__FILE__, __LINE__) ->* new”. The most significant result is that placement new can be used with debug_new now! Full support for new(std::nothrow) is provided, with its null-returning error semantics (by default). Other forms (like “new(buffer) Obj”) will probably result in a run-time warning, but not compile-time or run-time errors — in order to achieve that, magic number signatures are added to detect memory corruption in the free store. Memory corruption will be checked on freeing the pointers and checking the leaks, and a new function check_mem_corruption is added for your on-demand use in debugging. You may also want to define _DEBUG_NEW_TAILCHECK to something like 4 for past-end memory corruption check, which is off by default to ensure performance is not affected.

The code was heavily refactored during the modifications. I was quite satisfied with the new code, and I released Nvwa 0.8 as a result.

Summary

So I have presented my small memory leakage detector. I’ll make a summary here, and you can also consult the online doxygen documentation for the respective descriptions of the functions, variables, and macros.

This implementation is relatively simple. It is lacking in features when compared with commercial applications, like Rational Purify, or even some open-source libraries. However, it is

With the recent improvements, some of the old restrictions are gone. The macro new or DEBUG_NEW in debug_new.h can mostly work if the newed object has operator news as class member functions, or if new(std::nothrow) is used in the code, though the macro new must be turned off when defining any operator news. Even in the worst case, linking only debug_new.cpp should always work, as long as the allocation operation finally goes to the global operator new(size_t) or operator new(size_t, std::nothrow_t).

Source is available, for your programming pleasure, in the CVS (most up to date) or download of Stones of Nvwa.

May the Source be with you!

Bibliography

[Alexandrescu2001]  Andrei Alexandrescu. Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley, 2001.
[C++1998]  ISO/IEC. Programming Languages — C++. Reference Number ISO/IEC 14882:1998(E), 1998.
[Stroustrup1997]  Bjarne Stroustrup. The C++ Programming Language (Third Edition). Addison-Wesley, 1997.

HTML for code syntax highlighting is generated by Vim

Note: Although I would love to, I did not succeed in making this page conform to the HTML 4.01 specification. I guess W3C is to blame. Why should <font color=...> be forbidden inside a <pre> block (Vim currently generate HTML code this way), while the clumsier <span style="color: ..."> is allowed? However, this is fixable after all (I have already written a converter indeed). There is worse to come: <nobr> is not a valid tag. When a major browser could break a line after a - and no mechanisms are provided by the standard to achieve the no-breaking effect, using something like <nobr> is inevitable (and I can name other cases where <nobr> is needed). I cannot fix the browser, so I have to choose to break the standard. Detailed online information about this problem can be found here.

2004-3, Chinese version first published here at IBM developerWorks China
2004-11-28, rewritten in English (at last) by Wu Yongwei
2007-12-31, last updated by Wu Yongwei


This work is licensed under a Creative Commons Attribution-Share Alike 2.5 Licence.

Return to Main