Debugging at the machine-instruction level in the dbx command-line debugger becomes very handy when a software bug cannot be found easily. Programs are usually written in high-level languages such as C, C++, or Fortran, and most software defects can be debugged in dbx at that same high level. However, some knowledge of the machine-instruction level of the system on which the program is running, combined with the right tool, such as dbx, can shorten the time needed to identify the culprit and arrive at the best fix for the defect.

This article describes how to use the dbx debugger efficiently on the AMD64 architecture. It describes how to display the contents of memory at specified addresses and how to display machine instructions. Use the regs command to print the contents of the machine registers, or the print command to print individual registers. Use the nexti, stepi, stopi, and tracei commands to debug at the AMD64 machine-instruction level.

The AMD64 architecture

First, let's briefly review the AMD64 architecture and see how it differs from the 32-bit x86 architecture. I describe only the material that is relevant to this article. For an in-depth understanding of the AMD64 architecture, refer to the AMD64 manuals (http://developer.amd.com) and the AMD64 Application Binary Interface (ABI) (http://www.x86-64.org).

The AMD64 architecture has sixteen 64-bit general purpose registers (GPRs): RAX, RBX, RCX, RDX, RBP, RSI, RDI, RSP, R8, R9, R10, R11, R12, R13, R14, and R15. Compared to the x86 architecture, the AMD64 architecture has eight new GPRs. The RAX, RBX, RCX, RDX, RBP, RSI, RDI, and RSP registers are used by both 32-bit and 64-bit binaries. However, 32-bit binaries can access only the low 32 bits of these registers. In the x86 architecture, these registers are named EAX, EBX, ECX, EDX, EBP, ESI, EDI, and ESP.
Table 1: General Purpose Registers

The AMD64 architecture provides sixteen 128-bit XMM registers. Registers XMM0 through XMM7 are used for passing float and double parameters. The long double type is passed in memory. A long double in the AMD64 architecture is 16 bytes long, compared to 12 bytes in the x86 architecture. The long double type is implemented based on the 80-bit extended (IEEE) standard.
Table 2: Media Registers

The AMD64 architecture also provides eight x87 floating-point registers, each 80 bits wide.
Table 3: Media / Floating-Point Registers

In contrast to the 32-bit architecture, in which function parameters are passed on the stack, the 64-bit architecture has six registers available for integer parameter passing. If there are more than six integer parameters, the remaining parameters are passed on the stack. The bool, char, short, int, long, long long, and pointer types are classified as integer class. For passing parameters of the integer class, the next available register of the sequence RDI, RSI, RDX, RCX, R8, and R9 is used. Registers RBP, RBX, and R12 through R15 belong to the calling function, and the called function is required to preserve their values.

The RIP register is the instruction pointer register. In 64-bit mode, the RIP register is extended to 64 bits to support 64-bit offsets. In the 32-bit x86 architecture, the instruction pointer register is the EIP register.

The return value of a function is classified based on the rules that are specified in the AMD64 ABI. For instance, if the return value needs to be passed in memory, the caller provides space for the return value and passes the address of this storage in the RDI register as if it were the first argument to the function. On return, the RAX register contains the address that the caller passed in the RDI register. Similarly, if the return type is integer, the next available register of the sequence RAX, RDX is used.

In addition to registers, each function has a frame on the run-time stack. The run-time stack grows downward from a high address. Table 4 shows the stack organization.
Table 4: Stack Frame With Base Pointer

The RSP register is the stack pointer register and the RBP register is the frame pointer register. Stack operations make implicit use of the RSP register and, in some cases, the RBP register. The RSP register is decremented when items are pushed onto the stack and incremented when they are popped off the stack. The RBP register points to the lowest address of the data structure that is passed from one function to another.

The 128-byte area beyond the location pointed to by the RSP register is known as the red zone and is considered reserved. Functions can use this area for temporary data that is not needed across function calls. In particular, leaf functions can use this area for their entire stack frames, rather than adjusting the stack pointer in the prologue (see Example 1) and the epilogue (see Example 2).
Example 1: Function Prologue

There is no need to adjust the RSP stack pointer register if the red zone area is used. In other words, the subq $48,%rsp instruction is not needed in the function prologue if the red zone area is used.
Example 2: Function Epilogue

The C++ language has its own Application Binary Interface (ABI). The C++ ABI has well-defined rules for function parameter passing and return values. The C++ ABI rules supplement the AMD64 ABI rules; the C++ compiler has to use the C++ ABI rules for function parameter passing in addition to the AMD64 ABI rules.

dbx Commands

The following commands for machine-instruction level debugging are documented in the Debugging a Program With dbx manual (http://docs.sun.com/doc/819-3683).
examine [ address ] [ / [ count ] [ format ] ]
stepi
nexti
listi
tracei
stopi
dis
print expression, ...
regs [-f] [-F]

The Problem Statement

To demonstrate machine-instruction level debugging, let's use a real bug report, including its testcase, that was filed against the 64-bit dbx on the AMD64 platform.
On AMD64 dbx prints hex values instead of letters after strchr call:

Here is the testcase:

main() {
There is nothing wrong with the program. The bug is in dbx.

The dbx Failure

First, let's observe the normal flow of the program in the dbx environment by stepping through the testcase code.

% dbx a.out

The print statement at line 5 calls the strchr function with two parameters. The strchr function searches the first parameter, hello, and returns a pointer to the first occurrence of the l character. Hence, the printf statement correctly displays the llo character string.

Now let's reproduce the failure by calling the strchr function directly from the dbx command line using the print command. The call command in dbx can also be used to call the strchr function from the command line.

% dbx a.out

dbx prints incorrect output when the strchr function is called by the print command. dbx should display the llo string instead of hex characters, because the call to the strchr function is supposed to return a pointer to the first occurrence of the l character in the string hello.

The Debugging Session

Let's run the debugger on the a.out executable and stop right before the printf statement. The strchr function is defined in the libc library and most likely is not compiled with the -g option, so there is no debugging information and we have to rely on the assembly code alone. The stopi command is used to set a breakpoint at the first machine instruction of the strchr function.

% dbx a.out

dbx stops at the first instruction of the strchr function after the strchr function is called from the dbx command line using the print command. The dis command can be used to display the first portion of the machine instructions for the strchr function.

(dbx) dis strchr
0xfffffd7fff307910: strchr       : movb (%rdi),%dl
0xfffffd7fff307912: strchr+0x0002: cmpb %dh,%dl

The first instruction of the strchr function is movb (%rdi),%dl, which moves the byte at the memory location pointed to by the %rdi register into %dl, the low eight bits of the %rdx register.
The first instruction is not the pushq %rbp instruction, which means the strchr function has no prologue. It is not a defect that the function does not have a prologue. The debugger is stopped at the first instruction, which is the right place in the program to verify whether the input parameters are being passed correctly to the strchr function. The strchr function has two parameters. The first parameter is a pointer to the memory location that contains the hello character string and the second parameter is the character l. Based on the AMD64 ABI, the first and second parameters are assigned to the %rdi and %rsi registers in sequence. There are two ways to display the content of the %rdi and %rsi registers.
The %rdi register contains a pointer to the memory location 0xfffffd7fffdff740, which is allocated on the stack. In the normal program flow, the %rdi register contains a pointer to a memory location in the data segment. However, when dbx is asked to call a function (strchr), dbx copies the contents of that data-segment location onto the stack and passes the stack address in the %rdi register.

The contents of the memory location 0xfffffd7fffdff740 can be verified by using the examine command. The memory location should contain the hello character string.

(dbx) examine 0xfffffd7fffdff740 / 2

By looking up the ASCII table, we can verify that the memory location 0xfffffd7fffdff740 does contain the hello character string. The hex number 68 stands for the character h, 65 stands for the character e, 6c stands for the character l, and 6f stands for the character o. You can also use the examine command to display the contents of the memory location 0xfffffd7fffdff740 directly as characters, without referring to the ASCII table.

(dbx) examine 0xfffffd7fffdff740 / 6c

The %rsi register contains the hex number 6c, which stands for the l character.

The other two important registers are %rsp (the stack pointer) and %rbp (the frame pointer). The %rsp register points to the top of the stack and its value is 0xfffffd7fffdff738. As you can see, this value is very close to the contents of the %rdi register, which points to the memory location on the stack that contains the hello character string. The %rbp register is the frame pointer and contains the value 0xfffffd7fffdff810. The %rbp register is not used in the strchr function.

The contents of the run-time stack can be displayed using the examine command.

(dbx) examine 0xfffffd7fffdff738 / 32 lx
0xfffffd7fffdff738: 0xfffffd7fff220004 0x0000006f6c6c6568

In fact, we can unwind the run-time stack by following the principles that we learned in the previous section (see Table 4) about the stack frame with the base pointer.
For instance, the hex number 0x40080c is the address of the instruction that follows the callq instruction. The main function is called from the _start function using the callq instruction, and 0x40080c is the return address that is pushed onto the stack by that call. The instruction at address 0x40080c, push %rax, will be executed upon the completion of the main function. In other words, the address 0x40080c will be loaded into the program counter, the %rip register, once the main function returns. You can use the objdump utility to dump the text section of an executable.

objdump -S a.out

00000000004007a0 <_start>:
  4007a0: 6a 00                   pushq  $0x0
  4007a2: 6a 00                   pushq  $0x0
  4007a4: 48 8b ec                mov    %rsp,%rbp
  4007a7: 48 8b fa                mov    %rdx,%rdi
  4007aa: 48 c7 c0 80 0a 41 00    mov    $0x410a80,%rax
  ...
  400806: 59                      pop    %rcx
  400807: e8 54 01 00 00          callq  400960 <main>
  40080c: 50                      push   %rax
  40080d: 50                      push   %rax
  ...

The first instruction of the main function is push %rbp. Hence, the previous frame pointer (0xfffffd7fffdff820) is pushed onto the stack right after the return address. Similarly, the return address (0x40099d) is pushed onto the stack when the strchr function is called from the command line.

0000000000400960 <main>:
  400960: 55                      push   %rbp
  400961: 48 8b ec                mov    %rsp,%rbp
  400964: 48 83 ec 40             sub    $0x40,%rsp
  ...
  40099d: b8 6c 00 00 00          mov    $0x6c,%eax
  4009a2: 0f be f0                movsbl %al,%esi
  4009a5: 48 c7 c7 68 0c 41 00    mov    $0x410c68,%rdi
  4009ac: b8 00 00 00 00          mov    $0x0,%eax

However, the strchr function does not have a function prologue, so the content of the %rbp register stays the same when the strchr function is called from the main function. The %rbp register contains the hex value 0xfffffd7fffdff810, and the memory at address 0xfffffd7fffdff810 in turn holds the previous frame pointer, 0xfffffd7fffdff820.
(dbx) examine 0xfffffd7fffdff810

Going forward, we single-step through the machine instructions using the nexti command until we get to the instruction that returns the return value in the %rax register. We can use the dis command to display the last portion of the machine instructions for the strchr function.

(dbx) dis

Based on its description, the strchr function is supposed to return a pointer to the first occurrence of the l character in the string hello. We can verify the correctness of the strchr function by examining the contents of the %rax register.

(dbx) examine $rax / 4c

Indeed, the value of the %rax register is a pointer to the memory location 0xfffffd7fffdff742, which is allocated on the stack and contains the llo character string.

We have verified that the strchr function works correctly and returns a pointer to the llo character string in the %rax register. So the problem must be with what dbx does internally after it finishes calling the strchr function.

Fast forward: after calling a user function, dbx always calls the fflush function to flush the output stream. The fflush function takes one parameter, which is a pointer to the FILE data structure.

fflush - flush a stream

You can use the dis command to display the machine instructions for the fflush function.

(dbx) dis fflush

Let's go over the fflush function prologue:

pushq %rbp
Store the previous frame pointer on the stack.

movq %rsp,%rbp
Store the value of the %rsp register, the previous stack pointer, into the %rbp register. This value is the new frame pointer for the fflush function.

movq %rbx,0xfffffffffffffff0(%rbp)
The %rbx register and the %r12 register are callee-saved registers. The fflush function must preserve the contents of these registers on the stack for the caller function so they can be restored later in the function epilogue, just before exiting the function.

sub $0x0000000000000010,%rsp
Adjust the stack pointer for the fflush function.
The stopi command is used to stop at the first instruction of the fflush function.

(dbx) stopi at fflush

dbx stops at the first instruction of the fflush function after the cont command is entered on the dbx command line. Let's display the %rdi, %rbp, and %rsp registers. The %rdi register contains a pointer to the FILE data structure.

(dbx) print -flx $rdi

We step through the function prologue and print the %rsp and %rbp registers again.

(dbx) stepi

Recall from the previous section that the run-time stack grows downward from a high address. Comparing the current value of the %rsp register (0xfffffd7fffdff730) with the last value of the %rsp register in the strchr function (0xfffffd7fffdff738) makes it obvious that the space allocated on the stack for the fflush function overlaps with the space used for the strchr function. The value 0xfffffd7fffdff738 lies right between the value of the %rbp register (0xfffffd7fffdff740) and the value of the %rsp register (0xfffffd7fffdff730) of the fflush function. Therefore, the fflush function overwrites the contents of the run-time stack for the strchr function, which explains why the print strchr("hello", 'l') command displays garbage instead of the llo character string.

The fix for the dbx debugger is to preserve the contents of the run-time stack just before the call to the fflush function and restore them just before returning to the print command.

Conclusion

In general, low-level debugging requires the user to have some knowledge of the system on which the program is executing. Once that knowledge is acquired, even the most difficult bugs can be detected using low-level debugging techniques and the right tool, such as dbx.

You can learn more about the x86 assembly language by referring to the article Assembly Language Techniques for the Solaris OS, x86 Platform Edition at http://developers.sun.com/solaris/articles/x86_assembly_lang.html.

Nasser Nouri is a staff software engineer currently working in the dbx debugger engineering group. For the last 9 years at Sun, Nasser has worked on a wide spectrum of projects, such as the Massively Parallel Hardware Verilog Simulation system, the Distributed Verilog Simulation over the Internet using Load Balancing software and Java Servlet technology, and Java Graphical User Interfaces for CAD tools. Before joining Sun, he worked on Logic, Fault, and VHDL hardware simulation systems.