Using Dmalloc With the Solaris OS and Sun Studio Compilers

 
By Greg Nakhimovsky, April 2007  

Abstract

This article describes my experience with Dmalloc, a useful open source debugging package, on the Solaris Operating System with Sun Studio compilers.

Introduction

Locating application bugs related to memory access (for example, memory overwrites) is one of the most difficult and labor-intensive parts of C/C++ programming. C and C++ give us a lot of power and performance, but at the price of having to deal with every detail of memory allocation (among other things).

In addition to causing crashes and other obvious problems, memory-related bugs may lie dormant and then cause big problems in the field at unpredictable times. They can also create security vulnerabilities resulting from buffer overflows in the heap and so on.

Dmalloc is an open source Debug Malloc package. All of its source code and documentation is available at the Dmalloc web site (see reference [1]).

Many similar, relatively low-tech debugging tools are available, but I like Dmalloc for its effectiveness, relative simplicity, and the availability of all its implementation details, including the source code, which is easy enough to adjust when necessary.

Over the years, I've used Dmalloc a number of times. It has helped me find some hard-to-catch memory bugs, both in applications and in system code, such as X/Motif libraries and OpenGL graphics pipelines.

Dmalloc was created and is still maintained by Gray Watson. It has been around for quite a long time, apparently since 1992. I first used it in 1997.

Not surprisingly, Dmalloc has grown over the years. However, it's still relatively simple, easy to use, and most importantly (at least in my opinion), it's an effective debugging tool, especially for large applications where the other tools may not be available or may not work properly due to scalability and other problems.

The main reason for this Solaris OS-specific article on Dmalloc is that historically Dmalloc has been geared mostly toward platforms and tools other than the Solaris OS, so at the moment, it requires a few special tweaks and configurations to work well with the Solaris OS. I would also like to share some experience I've had using Dmalloc on Solaris systems.

I've used Dmalloc with the Solaris 10 OS for both SPARC based and x64 systems (it works very similarly on both) with Sun Studio 11 compilers.

Why Do We Need Dmalloc? Don't We Have Professional Tools for This?

Indeed, many tools are available to help us find memory-related errors at runtime. However, none of them is perfect, so adding Dmalloc to your debugging toolbox may well be worth the effort.

One at a time, I'll consider some of the alternative tools available for UNIX systems, and briefly discuss how useful they have been in the Solaris/Sun Studio environment (in my personal experience). I'll start with the Sun tools, and then consider a few others.

All these tools are well known and can be easily found on the Web and elsewhere, so I won't provide any references for them here.

dbx Run-Time Checking (RTC)

dbx RTC is a part of the dbx debugger, which is a part of the Sun Studio compiler/tool suite. dbx RTC has many advantages compared to other tools and many useful features. When and where it works, it works quite well. I've used it many times to debug small- and medium-size programs, with considerable success.

At the moment, dbx RTC memory access checking is available only on systems running the Solaris OS for SPARC platforms (not x86/x64 platforms), although this should change in the next version of Sun Studio software.

dbx RTC does have one disadvantage: It may not scale for very large applications. This is because it instruments every application and system binary involved (executables and shared libraries) in memory on the fly in dbx, before starting to run the application. For large applications, it may run out of memory, or it may take too long to initialize.

The libumem Package

libumem(3LIB) is an optional malloc replacement package, which is a part of the Solaris OS (starting with the Solaris 9 OS). It also has a memory debugging mode. See umem_debug(3MALLOC). The debugging capabilities of libumem include "redzone" sections to check for memory overwrites and filling the allocated and freed memory segments with special patterns to help detect the use of uninitialized data.

However, a few of my very unscientific experiments with umem_debug(3MALLOC) have not shown much success in finding memory access bugs with the applications I've been working with, while Dmalloc has found such bugs.

Also, using umem_debug(3MALLOC) requires the use of mdb(1), a "modular debugger" that is mostly intended to be used by kernel programmers. This makes using umem_debug(3MALLOC) less convenient for application programmers who are more used to symbolic debuggers, such as dbx.

Purify

Purify is a commercial tool from IBM (formerly from Rational Software, before that, from Atria Software, and before that, from Pure Software). It is one of the oldest and most powerful runtime memory debugging tools. However, I found that it has many disadvantages when it comes to debugging large Solaris applications:

  • This tool requires a special Purify version for each Solaris platform, for each Solaris kernel update, and for each new compiler version.
  • Purify requires relinking the application binary to instrument the code for Purify use.
  • The resulting specially instrumented binaries can be run only at the location where they were built.
  • Running the instrumented binaries can be extremely slow. I've seen some application runs that normally take a few minutes take all day to run when instrumented for Purify.
  • This tool produces many false positives, making it hard to separate the real memory bugs from the rest.
  • The tool does not work with VIS instructions (on SPARC platforms), making it impossible to use with applications that call mediaLib functions, for example.
  • The tool is commercial (and quite expensive). Not only does it cost money (which is a barrier in itself in this age of open source and other "free" software), but also having to deal with Purify licensing adds significant complexities to the debugging process.

Valgrind

Valgrind is a very powerful open source tool for runtime memory checking. The tool is not perfect but it's very useful (and fast) where it is available (primarily for Linux/x86 systems).

Unfortunately, Valgrind can't be used with the Solaris OS, even for x86/x64 platforms (not to mention SPARC platforms). Valgrind has many details of the x86 architecture (it emulates each x86 instruction), the Linux kernel, GLIBC, and the GNU C compiler hardwired, so porting it to the Solaris OS is very difficult.

Others

There are many more tools available for runtime memory debugging, both commercial and open source: Electric Fence, GNU Checker, Insure++, Mpatrol, Etnus MemoryScape, and more. I have not used any of these other tools myself, so I can't comment on them.

One list of such tools is available in the Related Software section of the Mpatrol web site.

Dmalloc Limitations

Dmalloc is a malloc-replacement package. It has limitations, as all such packages do. In particular:

  • Dmalloc can detect only memory-related problems in the heap, not in the stack, and not in static memory.
  • It can detect bugs only when the memory is allocated with malloc(), not by other means, such as sbrk() or mmap().
  • It cannot detect the following problems, which more complicated tools such as dbx RTC, Purify, and Valgrind can detect because they do much more than replace malloc, realloc, and free:
    • Errors in stack memory
    • Reads from or writes to unallocated memory
    • Reads from allocated but uninitialized memory
    • Writes to read-only memory

Building Dmalloc for the Solaris OS

Depending on your requirements, Dmalloc can be used in many ways. It can be used as a static library or as a shared library. You can optionally add #include "dmalloc.h" to your sources and then use Dmalloc's additional features. You may or may not need the "multithreaded version," the "C++ version," and so on. Also, Dmalloc provides many configuration options that can be specified only in the C header files, thus requiring a rebuild.

Therefore, providing the binaries for all cases is not practical, and it's necessary to build Dmalloc in place to satisfy your specific needs.

Although the Solaris OS is listed on Dmalloc's web page among the platforms on which Dmalloc has been built and run successfully, in reality, building Dmalloc under the Solaris OS while using the Sun Studio compiler is far from trivial.

For one thing, Dmalloc uses the ./configure facility, which is a part of the GNU Autoconf suite frequently used to build open source programs on various platforms. The problem is that GNU Autoconf was designed for GNU compilers, tools, and operating systems, such as Linux or FreeBSD, certainly not for the Solaris OS. It has many hardwired dependencies on those operating system features and compilers.

Fixing GNU Autoconf and its ./configure scripts for the Solaris OS and Sun Studio compilers, and then fixing the way these tools are used by Dmalloc, are large tasks beyond the scope of this article. I had neither the time nor patience to do this the hard way. Therefore, I'll describe an ad hoc "solution" I used recently, while working on a memory-related bug in a very large application on a SPARC system running the Solaris 10 OS.

Here are the steps I used in this particular case.

1. Download and install the Dmalloc package.

I used the latest version, 5.5.0, which was released in February 2007. To unzip the package, use commands such as this:

% cd where_you_want_to_install_it
% gunzip -c /tmp/dmalloc-5.5.0.tgz | tar xvf -

This will create the subdirectory where_you_want_to_install_it/dmalloc-5.5.0 and place all Dmalloc files there.

For installation instructions, I generally used the instructions provided in How to Install the Library on the Dmalloc web site.

2. Modify file settings.dist.

For my case, I changed the following:

ALLOW_FREE_NULL_MESSAGE=0
FENCE_TOP_SIZE=16
FENCE_BOTTOM_SIZE=ALLOCATION_ALIGNMENT*2
LOG_REOPEN=0

The details of these settings are described in the Dmalloc docs, but the most important ones are FENCE_TOP_SIZE=16 and FENCE_BOTTOM_SIZE=ALLOCATION_ALIGNMENT*2, which set the "picket-fence" (also known as "redzone") areas to 16 bytes on each side of each malloc block allocation. During the run, Dmalloc will check the integrity of these areas and detect whether they have been overwritten.
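To make the idea concrete, here is a minimal sketch of a picket-fence check in C. It is not Dmalloc's implementation: the names fenced_malloc(), fenced_check(), and fenced_free(), the FENCE_BYTE fill pattern, and the single FENCE_SIZE used for both sides are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names and values for illustration; not Dmalloc's own. */
#define FENCE_SIZE 16    /* bytes of fence on each side of the block */
#define FENCE_BYTE 0xA5  /* arbitrary fill pattern */

/* Allocate user_size bytes surrounded by two filled fence areas. */
void *fenced_malloc(size_t user_size)
{
  unsigned char *raw = malloc(FENCE_SIZE + user_size + FENCE_SIZE);

  if (raw == NULL)
    return NULL;
  memset(raw, FENCE_BYTE, FENCE_SIZE);                          /* bottom */
  memset(raw + FENCE_SIZE + user_size, FENCE_BYTE, FENCE_SIZE); /* top */
  return raw + FENCE_SIZE;
}

/*
 * Return 0 if both fences are intact, -1 if the bottom fence was
 * overwritten (underrun), or 1 if the top fence was overwritten (overrun).
 */
int fenced_check(const void *user, size_t user_size)
{
  const unsigned char *bottom;
  const unsigned char *top;
  size_t i;

  assert(user != NULL);
  bottom = (const unsigned char *)user - FENCE_SIZE;
  top = (const unsigned char *)user + user_size;
  for (i = 0; i < FENCE_SIZE; i++) {
    if (bottom[i] != FENCE_BYTE)
      return -1;
    if (top[i] != FENCE_BYTE)
      return 1;
  }
  return 0;
}

/* Release a block obtained from fenced_malloc(). */
void fenced_free(void *user)
{
  if (user != NULL)
    free((unsigned char *)user - FENCE_SIZE);
}
```

With this sketch, a block obtained from fenced_malloc(7) still passes fenced_check() after 7 bytes are written into it, but writing an eighth byte makes fenced_check() return 1. A real debugging allocator also has to remember each block's size and decide when to run the check, which this sketch leaves to the caller.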

3. Run configure.

In this case, I didn't need any special C++ facilities. I did need a "threaded" version (meaning Dmalloc should use mutex locking in its special malloc, realloc, and free routines), and I needed a shared-library version so that I could use LD_PRELOAD_64. I also needed a 64-bit version of the library, so I could use it with 64-bit applications.

By default, configure assumes GCC as the compiler. To use the Sun Studio compiler, I set the CC environment variable, and I used the following commands.

% setenv CC "/opt/SUNWspro/bin/cc -KPIC -errfmt=error -Xc -xarch=v9"
% ./configure --disable-cxx --enable-threads --enable-shlib

It took some trial and error to get the compiler flags correct. Particularly, -Xc turned out to be necessary because Dmalloc does certain things correctly only if the compiler is ANSI-C compliant, which is accomplished with the -Xc flag with the Sun Studio compilers. The -KPIC setting is to generate position-independent code intended for a shared library.

The -xarch=v9 setting is for 64-bit Solaris applications for SPARC systems. For 64-bit Solaris x64 applications, I used -xtarget=opteron -xarch=amd64 instead.

Unfortunately, configure ran into a number of problems. Since it hasn't been taught how to handle the Sun Studio compiler, a number of its tests failed, and program conftest (both 32-bit and 64-bit versions of it) crashed, as I learned soon enough because all my machines have system-wide AppCrash (see references [2] and [3]) installed. So, I got emails with AppCrash output similar to this:

Output from runme_on_app_crash
Program: conftest
Process ID: 8389
Received signal: 6

Application Debugging Data
--------------------------

>> /bin/pstack 8389

8389:	./conftest
 ffffffff7f2ce8c4 _lwp_kill (6, 0, ffffffffffffffff, \
   ffffffff7f3e6000, 0, 0) + 8
 ffffffff7f248bcc abort (1, 1b8, ffffffff7f2ba83c, \
   19d540, 0, 0) + 118
 0000000100000de0 main (1, ffffffff7ffff1f8, ffffffff7ffff208, \
   ffffffff7f249408, ffffffff7f1000c0, ffffffff7f100100) + 28
 0000000100000a9c _start (0, 0, 0, 0, 0, 0) + 17c

...

>> /bin/ptree 8389
...
801   /bin/csh
 7086  /bin/bash ./configure --disable-cxx --enable-threads --enab
   8388  /bin/bash ./configure --disable-cxx --enable-threads --en
     8389  ./conftest
...

On the other hand, it appears that the crash in abort() may be intentional, a part of the tests performed by the configure script. I may have noticed this crash only because AppCrash detected it.

In any case, configure produced the required files: Makefile, conf.h, and settings.h. I examined them for sanity, as recommended in the Dmalloc document.

4. Run make.

This produced a few compiler warnings (seemingly harmless) and eventually the shared library I needed called libdmallocth.so. However, that library turned out to be wrong. It was 32-bit (not 64-bit as I needed):

% file libdmallocth.so
libdmallocth.so:  ELF 32-bit MSB dynamic lib SPARC Version 1, \
  dynamically linked, not stripped, no debugging \
  information available

Also, it had no function definitions at all:

% nm libdmallocth.so | grep FUNC
%

An examination of the make output showed the way the library was built:

ar cr libdmallocth.a arg_check.o compat.o dmalloc_rand.o \
  dmalloc_tab.o env.o heap.o chunk_th.o error_th.o malloc_th.o
ranlib libdmallocth.a
rm -f libdmallocth.so libdmallocth.so.t
ld -G -o libdmallocth.so.t libdmallocth.a # arg_check.o \
  compat.o dmalloc_rand.o dmalloc_tab.o env.o heap.o chunk_th.o \
  error_th.o malloc_th.o
mv libdmallocth.so.t libdmallocth.so

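This is all wrong. For one thing, the ranlib(1) command is a holdover from the ancient SunOS 4.x OS, and there is no need to create an archive library first in this case at all. Worse, the # on the ld line turns the list of object files into a shell comment, so ld is given only the archive libdmallocth.a. With no undefined symbols to resolve, ld apparently extracts nothing from the archive, which would explain the empty library, and with no input objects and no -64 flag it produces a 32-bit one.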

To correct it, I simply ran the cc command to do what I wanted:

% /opt/SUNWspro/bin/cc -xarch=v9 -G -o libdmallocth.so \
  arg_check.o compat.o dmalloc_rand.o dmalloc_tab.o env.o heap.o \
  chunk_th.o error_th.o malloc_th.o
% file libdmallocth.so
libdmallocth.so:   ELF 64-bit MSB dynamic lib SPARCV9 Version 1, \
  dynamically linked, not stripped
% nm libdmallocth.so | grep FUNC
[747]   |  61928|  140|FUNC |GLOB |0   |7  |_dmalloc_address_break
[751]   |  45936|  184|FUNC |GLOB |0   |7  |_dmalloc_atoi
...

Later, to make this easier, I created a script I called rebuild containing the make command followed by the cc command above, and ran rebuild whenever a Dmalloc rebuild was necessary.

5. Run a test program.

The test described in the Dmalloc documentation ("make light") didn't work for me.

On the Solaris 10 OS for SPARC platforms, the resulting dmalloc_t program attempted to consume all the memory available on the system and then hung. I had to kill the dmalloc_t process.

On the Solaris 10 OS for x64 platforms, the test program dmalloc_t produced the following error messages:

  % ./dmalloc_t -s -t 10000
  ERROR: Running special tests failed.  Last dmalloc error: no error (err 1)
  Random seed is 1173381022. Final dmalloc error: no error (err 1)

Running dmalloc_t without the -s flag (in the non-silent mode) produced the following error messages:

  ERROR: index overload failed
  ERROR: index overload failed

I'm not sure what all those error messages and conditions mean. Most likely, they indicate problems with the test program dmalloc_t.c. I decided not to debug it any further.

Using the Dmalloc Shared Library

To use the Dmalloc shared library, I performed the following steps.

1. Set the DMALLOC_OPTIONS environment variable.

Dmalloc has a lot of options. Including all of them in one command in a legible form would be impossible. Instead, Dmalloc has a utility program called dmalloc. There are many ways to use dmalloc. See the detailed information in the Description of the Debugging Tokens section of the Dmalloc web site.

For example, here's how I used it in this case.

In the dmalloc-5.5.0 directory, I created a file called .dmallocrc and copied the sample file dmallocrc into it. Then I modified .dmallocrc and created a section called greg that contained the Dmalloc options I wanted:

greg   log-bad-space, check-fence, check-heap, \
       check-funcs, print-messages, error-dump, \
       realloc-copy, check-blank

Then I ran dmalloc greg and got the following output (ignoring the various "feature has been disabled" warnings that don't seem to make much sense):

setenv DMALLOC_OPTIONS debug=0x42106d00

In addition to the hexadecimal debug value, I also needed a few more options, so my final setting of DMALLOC_OPTIONS became:

setenv DMALLOC_OPTIONS debug=0x42106d00,inter=1000,log=logfile.%p

In this command, inter=1000 means that I want the integrity of the heap checked every 1000th call to malloc, realloc, or free, as opposed to the default value of inter=1, meaning check the entire heap each time. The log=logfile.%p setting means create a log file called logfile.pid, where pid is replaced with the process ID.
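The setting is simply a comma-separated list of key=value pairs. The following sketch, which is not Dmalloc's own parser, shows one way to pull apart the three fields used here and to expand %p in the log name. The names struct dm_opts, parse_opts(), and expand_log() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three fields used in this article. */
struct dm_opts {
  unsigned long debug;     /* hexadecimal debug value */
  unsigned long interval;  /* check the heap every Nth call */
  char log[128];           /* log file name template */
};

/*
 * Parse a comma-separated "key=value" list.  Unknown keys are ignored.
 * Returns the number of recognized fields.  Uses strtok(), so this
 * sketch is not thread-safe.
 */
int parse_opts(const char *text, struct dm_opts *out)
{
  char copy[256];
  char *tok;
  int found = 0;

  assert(text != NULL && out != NULL);
  memset(out, 0, sizeof(*out));
  out->interval = 1;  /* the default value mentioned in the article */
  strncpy(copy, text, sizeof(copy) - 1);
  copy[sizeof(copy) - 1] = '\0';

  for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
    if (strncmp(tok, "debug=", 6) == 0) {
      out->debug = strtoul(tok + 6, NULL, 16);  /* accepts a 0x prefix */
      found++;
    } else if (strncmp(tok, "inter=", 6) == 0) {
      out->interval = strtoul(tok + 6, NULL, 10);
      found++;
    } else if (strncmp(tok, "log=", 4) == 0) {
      /* out->log was zeroed above, so it stays NUL-terminated. */
      strncpy(out->log, tok + 4, sizeof(out->log) - 1);
      found++;
    }
  }
  return found;
}

/* Expand the first "%p" in the template to the given process ID. */
void expand_log(const char *tmpl, long pid, char *out, size_t size)
{
  const char *mark = strstr(tmpl, "%p");

  if (mark == NULL)
    snprintf(out, size, "%s", tmpl);
  else
    snprintf(out, size, "%.*s%ld%s", (int)(mark - tmpl), tmpl, pid, mark + 2);
}
```

Parsing the string above with parse_opts() would yield debug 0x42106d00, interval 1000, and the template logfile.%p, which expand_log() turns into a name such as logfile.27537 for process 27537.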

2. Set the LD_PRELOAD_64 environment variable to the Dmalloc shared library, for example:

setenv LD_PRELOAD_64 /export/home/dmalloc-5.5.0/libdmallocth.so

3. Run your application.

Make sure the application is using a dynamically linked malloc package, such as the standard libc malloc(3C). Preloading will not work if malloc, realloc, and free are linked in statically.

Fixing a Bug Using Dmalloc

Let us consider a simple example. In this case, Dmalloc has detected two problems in the system libraries, specifically, in the OpenGL pipeline for the XVR-2500 graphics card installed in an Ultra 45 workstation. The problems themselves are almost trivial, but they are quite typical.

1. Memory overwrite:

1168279430: 27537:   pointer '0x10d0bf910' from 'unknown' \
  prev access 'unknown'
1168279430: 27537:   dump of proper fence-top bytes: \
  '\372\312\336i\372\312\336i\372\312\336i\372\312\336i'
1168279430: 27537:   dump of '0x10d0bf910'+3: \
  'v/fb\000\312\336i\372\312\336i\372\312\336i\372\312\336i'
1168279430: 27537:   next pointer '0x10d0bf940' (size 8) may \
  have run under from 'unknown'
1168279430: 27538: ERROR: _dmalloc_chunk_heap_check: failed \
  OVER picket-fence magic-number check (err 27)

The stack trace (as reported by AppCrash) was:

...
ffffffff7ee17470 dmalloc_error (ffffffff7ee1b3f0, \
  ffffffff7fffac48, 0, ffffffff76bf2044, 0, 0) + 140
ffffffff7ee11a7c log_error_info (0, 0, 0, 10d0b0a98, \
  ffffffff7ee1b3d8, ffffffff7ee1b3f0) + 3c
ffffffff7ee13ff0 _dmalloc_chunk_heap_check (0, ffffffff7f72fd5c, \
  ff000000, ff000000, 11e6f0, ffffffff7f72c448) + 5f8
ffffffff7ee1806c dmalloc_in (0, 0, 1, ffffffff7f61f068, 11e6a0, \
  ffffffff71602000) + 3ec
ffffffff7ee18370 dmalloc_malloc (0, 0, 20, a, 0, 0) + 40
ffffffff7ee18cd0 malloc (20, ffffffff6bacc238, 0, \
  ffffffff6b609538, 11e6f0, ffffffff7f72c448) + 28
ffffffff78b055d0 XextAddDisplay (ffffffff6c685190, 10d864010, \
  ffffffff6c67d078, ffffffff6c648728, 0, 0) + 20
ffffffff6b609538 ogl_kfb_XF86DRI_glx_QueryDirectRenderingCapable \
  (10d864010, 10d090e10, ffffffff7fffb854, 2238, ffffffff6c648718, \
  ffffffff7f61f068) + 78
ffffffff6b607964 ogl_kfb_XF86DRIQueryDirectRenderingCapable \
  (10d864010, ffffffff6b6094c0, ffffffff7fffb854, ffffffff6b5f6910, \
  4b93dc, 0) + 24
ffffffff6b5eed20 __driCreateScreen (10d864010, 0, 10d090e68, \
  b, 10d88f010, 0) + 20
ffffffff6b5f6960 ogl_kfb_create_screen (10ce5c810, 8000, \
  10d0bf3d0, 10daa7900, 1, 10d88f010) + 758
ffffffff79ab7080 __glxcLoadInitModule (10d091210, 0, 0, \
  230, 0, ffffffff79d19a40) + 640
ffffffff79ab71d4 cglxdCreateContext (10d864010, 10daa7690, 0, \
  10daaa010, 8014, 10ce5c810) + 94
ffffffff79af6e2c __glXCreateNewContext (10d864010, 10daa7690, \
  8014, 1, 0, 0) + 5ec
ffffffff7f1532b8 _glXCreateNewContext (10d864010, 10daa7690, 8014, \
  0, 1, ffffffff7f30b7b8) + 254
ffffffff7f131c9c glXCreateContext (10d873010, 10daa9810, 0, 1, \
  1, ffffffff7fffc190) + d94
...

According to the Dmalloc documentation:

27 (ERROR_OVER_FENCE)
  This indicates that a pointer had its upper bound
picket-fence magic space overwritten. If the 'check-fence'
token is enabled, the library writes magic values above and
below allocations to protect against overflow. ...

I'll describe more details about this error below.

2. Double free():

1168468189: 27858: ERROR: free: cannot locate pointer in \
  heap (err 22)
1168468194: 27858:   error details: finding address in heap
1168468194: 27858:   pointer '0x10d0c6c10' from 'unknown' prev \
  access 'unknown'

To get more information, I changed the code that prints these messages (dmalloc_error() routine in error.c file) to invoke pstack(1) instead of fork(). This modification is explained in more detail below.

The following stack trace was produced:

8977: /<path_to_executable>
-----------------  lwp# 1 / thread# 1  --------------------
  ffffffff76ccebb8 waitid   (0, 2316, ffffffff7fffae90, 3)
  ffffffff76cc0cec waitpid (2316, ffffffff7fffb110, 0, 0, \
    ffffffff6f713a40, 0) + 64
  ffffffff76cb44ac system (ffffffff7fffb2b8, 1988, 1800, 0, \
    ffffffff76de6000, ffffffff7fffb178) + 394
  ffffffff7ef172d8 dmalloc_error (ffffffff7ef1b3c8, \
    ffffffff76df0fc0, 0, 1000, 0, 0) + 140
  ffffffff7ef11a04 log_error_info (0, 0, 10d0c6c10, 0, \
    ffffffff7ef1b3b0, ffffffff7ef1b3c8) + 3c
  ffffffff7ef14ca4 _dmalloc_chunk_free (0, 0, 10d0c6c10, 11, \
    0, 0) + 1dc
  ffffffff7ef18910 dmalloc_free (0, 0, 10d0c6c10, 11, 4cd000, \
    30000) + 108
  ffffffff7ef18ed0 free (10d0c6c10, 10d0c6b10, 1, 10d0c6b10, \
    0, 10d0c6c10) + 20
  ffffffff6b5f4854 ogl_kfb_destroy_drawable (10db79890, \
    10d080e10, 10d0c6c10, 7800, 2, ffffffff6b5ee008) + 58
  ffffffff79ca84c4 cglxdDestroyGlxDrawable (10ce5c810, 0, 0, \
    10d0c6d10, ffffffff6b5f48f0, 0) + 324
  ffffffff79cafec0 __glXDestroyPbuffer (10d8c3010, b00010, \
    400, ffffffff7f214b54, 22f100, 420) + 40
  ffffffff7f214b54 glXDestroyPbuffer (10d8c3010, b00010, 0, \
    ffffffff7f3e5de0, ffffffff7f3e9ad3, ffffffff7f3e9a70) + 7c4
  ffffffff7f272a78 __1cHpbuffer2T5B6M_v_ (10d961780, 0, \
    ffffffff7f3e9a70, ffffffff7f3e9ace, ffffffff7f3e9ba8, \
    ffffffff7f3e5de0) + 480
  ffffffff7f2735bc __1cFpbwin2T5B6M_v_ (10d774af0, \
    10d774af8, 10d961780, ffffffff7f3e5de0, 0, 0) + 2c
  ffffffff7f26b4cc __1cHwinhashGdetach6MpnP__winhashstruct__v_ \
    (10d774af0, 10cf1e7e8, 0, 1000, 0, 6) + 34
  ffffffff7f264654 __1cI_winhashGremove6MpcLb_v_ (10ce744c0, \
    10d0af2d0, 5400002, 0, ffffffff7f3e5de0, 0) + 264
  ffffffff7f22d588 XDestroyWindow (10d8c2010, 5400002, \
    10ce744c0, ffffffff7f3eac68, ffffffff7f3e5de0, 0) + 7d8
...

It turned out there was a duplicate free() memory error in this code.
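Error 22 means that the pointer passed to free() is not one the library currently knows about, which is exactly what a second free() of the same block looks like. A toy version of that bookkeeping, assuming a small fixed-size table and the hypothetical names tracked_malloc() and tracked_free(), might look like this. Dmalloc's real data structures are considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 64  /* hypothetical capacity of the toy table */

/* Pointers that are currently allocated; NULL marks a free slot. */
static void *live[MAX_LIVE];

/* Allocate memory and remember the pointer in the table. */
void *tracked_malloc(size_t size)
{
  void *p;
  int i;

  assert(size > 0);
  p = malloc(size);
  if (p == NULL)
    return NULL;
  for (i = 0; i < MAX_LIVE; i++) {
    if (live[i] == NULL) {
      live[i] = p;
      return p;
    }
  }
  free(p);  /* table full: refuse rather than lose track of the block */
  return NULL;
}

/*
 * Free a tracked pointer.  Returns 0 on success, or -1 if the pointer
 * is not a live allocation (never allocated here, or already freed).
 */
int tracked_free(void *ptr)
{
  int i;

  if (ptr == NULL)
    return -1;
  for (i = 0; i < MAX_LIVE; i++) {
    if (live[i] == ptr) {
      live[i] = NULL;
      free(ptr);
      return 0;
    }
  }
  return -1;
}
```

In this sketch, the first tracked_free() of a block returns 0 and a second call with the same pointer returns -1 without touching the heap, which is the point at which a debugging library can log the error and print a stack trace.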

Such problems can be further debugged using a debugger such as Sun Studio dbx or its IDE. You can set a breakpoint in dmalloc_error() and examine the function arguments, the contents of the heap, and so on. Of course, this will be much easier if the application is compiled debuggable and its source code is available.

Both of these bugs were fixed in the next release of the Sun OpenGL patch.

How Dmalloc Can Be Improved

Here are a few Dmalloc improvements I can think of.

Make the Dmalloc Implementation for the Solaris OS Recognize Sun Studio Compilers

This should include Sun Studio compiler flags for 64-bit (for both SPARC and x86 platforms) and ANSI-C. Replace the ld command for linking with cc or CC, and so on.

The problems with ./configure described above don't seem to be related to the tools autoconf, automake, or libtool themselves so much, but rather to the way those tools are being used in Dmalloc.

Since Dmalloc needs to create a shared library (at least as an option), it could have been using libtool. However, the existing configure.ac file doesn't invoke the AC_PROG_LIBTOOL macro as it could. Instead, it seems to have its own macros for defining how to build shared libraries, and these are wrong for the Solaris OS and Sun Studio compilers. With the Sun Studio compilers, it should use the cc (or CC for C++) compiler driver to link the shared library, so that it is consistent about generating 64-bit or 32-bit libraries. Instead it is invoking ld directly, and it isn't setting -64 when a 64-bit library is desired.

There is no file Makefile.am either, so apparently it's not using automake. Instead, it is providing a manually written Makefile.in file.

It would be desirable for the project to move to using libtool. At least on the Solaris OS, libtool-1.5.22 or later should work correctly when building shared libraries with Sun compilers.

Add an Option to Build a 64-bit Version of the Dmalloc Shared Library

Modern applications are as likely to be 64-bit as 32-bit. Users need to have an easy way to choose between the two.

Fix mmap() Problems

I've run into the situation where Dmalloc started to use mmap(2) instead of malloc for the entire application. To work around this problem, I disabled the #if HAVE_MMAP && USE_MMAP section in the Dmalloc source file heap.c:

% diff heap.c.orig heap.c
97c97
< #if HAVE_MMAP && USE_MMAP
---
> #if HAVE_MMAP && USE_MMAP && 0

Improve Readability of the Dump of Overwritten Memory, and Improve the Documentation Explaining What That Dump Contains

Currently, the dumped data is mostly written in octal format. I think printing it in hexadecimal format would make it easier for the user to interpret.

Using the first example described above, the memory overwrite error message was shown as:

1168279430: 27537:   pointer '0x10d0bf910' from 'unknown' \
  prev access 'unknown'
1168279430: 27537:   dump of proper fence-top bytes: \
  '\372\312\336i\372\312\336i\372\312\336i\372\312\336i'
1168279430: 27537:   dump of '0x10d0bf910'+3: \
  'v/fb\000\312\336i\372\312\336i\372\312\336i\372\312\336i'
1168279430: 27537:   next pointer '0x10d0bf940' (size 8) \
  may have run under from 'unknown'
1168279430: 27538: ERROR: _dmalloc_chunk_heap_check: failed \
  OVER picket-fence magic-number check (err 27)

The "proper fence-top bytes" are initialized to 0xFACADE69 (as defined in Dmalloc file chunk_loc.h), four times in this case, since I configured Dmalloc to have 16-byte fences (also known as redzones), both for "bottom" and "top." Using the octal representation for characters that can't be displayed as ASCII, this indeed translates to:

\372\312\336i\372\312\336i\372\312\336i\372\312\336i
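The translation can be reproduced with a few lines of C. The sketch below is a hypothetical stand-in for Dmalloc's expand_chars() routine, not a copy of it: escape_bytes() prints printable characters as themselves and everything else as a three-digit octal escape. The magic bytes are listed explicitly in the big-endian order they have in memory on a SPARC system.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Render len bytes into out: printable characters as themselves, all
 * other bytes (and the backslash, to keep the output unambiguous) as
 * \ooo octal escapes.  Stops early if out would overflow.
 * Returns the number of characters written, not counting the NUL.
 */
size_t escape_bytes(const unsigned char *in, size_t len,
                    char *out, size_t out_size)
{
  size_t used = 0;
  size_t i;

  assert(in != NULL && out != NULL && out_size > 0);
  out[0] = '\0';
  /* Each byte needs at most 4 characters plus the terminating NUL. */
  for (i = 0; i < len && used + 5 <= out_size; i++) {
    if (isprint(in[i]) && in[i] != '\\')
      used += (size_t)snprintf(out + used, out_size - used, "%c", in[i]);
    else
      used += (size_t)snprintf(out + used, out_size - used, "\\%03o",
                               (unsigned int)in[i]);
  }
  return used;
}
```

Passing the four bytes 0xFA, 0xCA, 0xDE, 0x69 to escape_bytes() yields \372\312\336i, because 0xFA, 0xCA, and 0xDE are octal 372, 312, and 336, while 0x69 is the printable character i.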

The overwritten buffer is:

\000\312\336i\372\312\336i\372\312\336i\372\312\336i

However, this is not obvious from the Dmalloc message. It took me a while to realize it, and I had to look into what Dmalloc does exactly. From file chunk.c:

/*
 * The size includes the bottom fence post area.  We want it to
 * align with the start of the top fence post area.
 */
if (DUMP_SPACE > user_size + FENCE_OVERHEAD_SIZE) {
  dump_size = user_size + FENCE_OVERHEAD_SIZE;
  offset = -FENCE_BOTTOM_SIZE;
}
else {
  dump_size = DUMP_SPACE;
/* we will go backwards possibly up to FENCE_BOTTOM_SIZE offset */
  offset = user_size + FENCE_TOP_SIZE - DUMP_SPACE;
}
...
dump_pnt = (char *)start_user + offset;
if (IS_IN_HEAP(dump_pnt)) {
  out_len = expand_chars(dump_pnt, dump_size, out, sizeof(out));
  dmalloc_message("  dump of '%#lx'%+d: '%.*s'",
	  (unsigned long)start_user, offset, out_len, out);
}

In this case, the values are as follows.

DUMP_SPACE = 20
user_size = 7 (length of "/dev/fb")
FENCE_OVERHEAD_SIZE = 16
offset = user_size + FENCE_TOP_SIZE - DUMP_SPACE = 7 + 16 - 20 = 3
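The same arithmetic also tells us where the top fence begins inside the dumped bytes. The sketch below mirrors the two branches of the chunk.c excerpt; dump_offset() and fence_index_in_dump() are hypothetical helper names, and the fence overhead is passed in as a parameter rather than taken from Dmalloc's headers.

```c
#include <assert.h>

/* Values from this article's configuration. */
#define DUMP_SPACE        20
#define FENCE_TOP_SIZE    16
#define FENCE_BOTTOM_SIZE 16

/*
 * Offset of the first dumped byte relative to the user pointer,
 * following the two branches of the chunk.c excerpt.
 */
int dump_offset(int user_size, int fence_overhead)
{
  assert(user_size >= 0);
  if (DUMP_SPACE > user_size + fence_overhead)
    return -FENCE_BOTTOM_SIZE;  /* whole block fits: start at bottom fence */
  return user_size + FENCE_TOP_SIZE - DUMP_SPACE;
}

/* Index within the dumped bytes at which the top fence begins. */
int fence_index_in_dump(int user_size, int fence_overhead)
{
  return user_size - dump_offset(user_size, fence_overhead);
}
```

For the 7-byte allocation, dump_offset(7, 16) is 3 and fence_index_in_dump(7, 16) is 4: the dump starts at the fourth user byte, shows the four characters v/fb, and the fifth dumped byte is the first byte of the top fence, which is where \000 appears in place of \372.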

So, this malloc() was called requesting 7 bytes, but then 8 bytes were written into that buffer, including the trailing zero: /dev/fb\000. In this example, it was the result of OpenGL code like this (where devPath is a character string containing /dev/fb):

char *ptr;
...
ptr = malloc(strlen(devPath));
strcpy(ptr, devPath);

The author of this code forgot that strcpy() adds a trailing zero at the end of the copied string. This is a rather common error in C/C++. The correct way to call malloc() in the situation above is:

ptr = malloc(strlen(devPath)+1);
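One way to avoid repeating this mistake is to keep the length computation in a single helper. Here is a minimal sketch, with dup_string() as a hypothetical name; where it is available, the standard strdup() function does the same job.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate a C string, including its terminating '\0'.
 * Returns NULL if the allocation fails.  The caller frees the result.
 */
char *dup_string(const char *src)
{
  size_t len;
  char *copy;

  assert(src != NULL);
  len = strlen(src) + 1;  /* +1 for the trailing '\0' */
  copy = malloc(len);
  if (copy != NULL)
    memcpy(copy, src, len);
  return copy;
}
```

Calling dup_string("/dev/fb") allocates 8 bytes, not 7, so the terminator lands inside the block rather than in the fence area.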

To make it easier for the user to deal with errors like this, I think it's important to print more debugging information, such as the size of the current allocation (user_size) and the DUMP_SPACE value. Perhaps printing the damaged "fence" alone, in addition to what's printed now, would also help.

Add Checking for Calls to memcpy() With Overlapping Memory Regions

The latest version of Dmalloc (5.5.0) is supposed to have this check added already, but I haven't been able to make it work at all, at least on a Solaris machine with the Sun Studio compilers, even when I included dmalloc.h in my test program.

This issue is not directly related to malloc, but the additional check is useful and it's easy to do. A while ago, someone at the Dmalloc forum suggested implementing it. Valgrind performs this check.

The problem is that memcpy() is sometimes used when the source and destination memory buffers overlap. This is a bug, at least in the Solaris OS: the C standard leaves the behavior of memcpy() undefined for overlapping regions, and the copy can damage the memory buffers. It is safe to use memmove() in that case, usually at the price of some efficiency.
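The overlap test itself is short. As a minimal sketch, assuming the hypothetical helper names regions_overlap() and copy_bytes(), a caller that cannot rule out overlap could choose between the two functions like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if [a, a+n) and [b, b+n) share at least one byte, else 0. */
int regions_overlap(const void *a, const void *b, size_t n)
{
  uintptr_t pa = (uintptr_t)a;
  uintptr_t pb = (uintptr_t)b;
  uintptr_t gap = (pa > pb) ? pa - pb : pb - pa;

  return n > 0 && gap < n;
}

/* Copy n bytes, falling back to memmove() when the regions overlap. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
  assert(dst != NULL && src != NULL);
  if (regions_overlap(dst, src, n))
    return memmove(dst, src, n);
  return memcpy(dst, src, n);
}
```

In most code it is simpler to call memmove() directly whenever overlap is possible. The explicit test is mainly useful for diagnostics, since it shows which callers pass overlapping buffers.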

For now, I've created a special library interposer to perform this check separately from Dmalloc. See reference [4] for more information about library interposers. Here is this library interposer source code:

% cat memcpy_interp.c
/*
 * Interpose on memcpy() and check for overlapping memory
 * segments, like Valgrind.
 * By Greg Nakhimovsky, Sun Microsystems.
 * January 2007.
 *
 * Build and use this interposer as follows
 * (assuming a 64-bit application on Solaris/SPARC):
 * cc -g -errfmt=error -xarch=v9 -o memcpy_interp.so -G
 * -Kpic memcpy_interp.c
 * setenv LD_PRELOAD_64 /path/memcpy_interp.so
 * run the app
 * unsetenv LD_PRELOAD_64
 */

#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
#include <unistd.h>
#include <string.h>

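void *memcpy(void *restrict s1, const void *restrict s2, size_t n)
{
  /* Pointer to the real memcpy(), resolved on the first call. */
  static void *(*func)(void *restrict, const void *restrict, size_t);
  static char buffer[64];
  const char *cs1 = s1;
  const char *cs2 = s2;
  size_t x;

  if(!func)
  {
    func = (void *(*)(void *restrict, const void *restrict, size_t))
      dlsym(RTLD_NEXT, "memcpy");
    /* Clear LD_PRELOAD_64 so that pstack itself is not interposed. */
    sprintf(buffer, "LD_PRELOAD_64= /bin/pstack %ld\n", (long)getpid());
  }

  /* Distance between the two start addresses, as an unsigned value. */
  x = (cs2 > cs1) ? (size_t)(cs2 - cs1) : (size_t)(cs1 - cs2);
  if(x < n)
  {
     printf("\nmemcpy() called with overlapping segments:\n"
       "  s1=0x%p s2=0x%p n=%lu\n", s1, (void *)s2, (unsigned long)n);
     system(buffer);
  }

  return func(s1,s2,n);
}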
%

Interestingly, when I ran the application in question with this library interposer (but without Dmalloc), it detected an inefficiency in X/Motif routine GetResources(). I got a lot of output similar to the following.

memcpy() called with overlapping segments:
  s1=0x1253f0788 s2=0x1253f0788 n=8

19436: /path/app.exe
-----------------  lwp# 1 / thread# 1  --------------------
  ffffffff76dcebb8 waitid   (0, 4f3e, ffffffff7fff7d30, 3)
  ffffffff76dc0cec waitpid (4f3e, ffffffff7fff7fb0, 0, 0, \
    ffffffff6f818480, 0) + 64
  ffffffff76db44ac system (ffffffff7f300960, 1988, 1800, 0, \
    ffffffff76ee6000, ffffffff7fff8018) + 394
  ffffffff7f200580 memcpy (1253f0788, 1253f0788, 8, ffffff67, \
    5, 1253f0788) + f0
  ffffffff7a11f220 GetResources (1253f06f0, 1253f06f0, \
    ffffffffffffff67, 0, ffffffff7a264db0, 28) + e4c
  ffffffff7a11df38 _XtGetResources (146f8c, ffffffff7fffa650, \
    4, 0, ffffffff7fffa39c, ffffffff7fff9d50) + 120
  ffffffff7a11c644 xtCreate (1253f06f0, 0, 10ae0a468, \
    1254c4780, 10cf04f20, ffffffff7fffa650) + 154
  ffffffff7a125bc0 _XtCreateWidget (10db11a78, 10ae0a468, \
    1254c4780, ffffffff7fffa650, 4, 0) + 278
  ffffffff7a125918 XtCreateWidget (10db11a78, 10ae0a468, \
    1254c4780, ffffffff7fffa650, 4, 1) + d0
  ...

Note that GetResources() is calling memcpy() to copy 8 bytes from a given address to itself!

I've reported this inefficiency to Sun's X/Motif developers.

Change error-dump Functionality From fork(2) to pstack(1), at Least for the Solaris OS

Currently, error-dump results in Dmalloc calling fork(), attempting to dump core, and continuing. In my tests, this has caused recursive behavior, leading to a bad crash. Also, since I have AppCrash installed on all my Solaris 10 and later machines, this generated endless AppCrash reports.

Instead, generating a stack trace telling us where in the program the error occurred would be much more useful. For this purpose, I replaced the fork() code in the Dmalloc dmalloc_error() routine (error.c file) with this:

  char buf[128];
  sprintf(buf, "LD_PRELOAD_64= /bin/pstack %d", (int)getpid());
  system(buf);

The "LD_PRELOAD_64= " prefix clears that variable for the pstack(1) command only, which prevents the Dmalloc library from being preloaded recursively into pstack(1).

Dmalloc attempts to determine the "return address" of the caller of malloc() and other functions. See the GET_RET_ADDR() macro in the return.h file. However, that code is obsolete and ineffective (at least for the Solaris and Sun Studio platforms). This is why Dmalloc prints unknown in error messages such as this:

pointer '0x10d0bf910' from 'unknown' prev access 'unknown'

The pstack(1) technology is much more reliable than the assembly-level hacks in return.h.

If you include dmalloc.h and recompile your application, Dmalloc may be able to obtain the caller's address from the Dmalloc functions. This wasn't practical in my case, so I didn't test this feature.

Conclusion

Dmalloc is a valuable debugging tool for C, C++, and Fortran developers, supplementing other available debugging technologies. I've found it especially useful for large applications that the more powerful tools can't handle well.

With a few relatively minor adjustments, Dmalloc can become even more useful, particularly for the developers of Solaris applications using Sun Studio compilers and tools.

References

Acknowledgements

I'd like to thank Gray Watson for creating Dmalloc and for improving and maintaining it all these years. Also, thanks to my Sun colleague Richard Smith for his comments and advice regarding the use of the GNU Autoconf tools.

About the Author

Greg Nakhimovsky is a Sun engineer working with application software vendors to make sure their products run well on Sun systems.
