Exploring Heap-Based Buffer Overflows with the Application Verifier
Isolating the root cause of a heap-based buffer overflow can be tricky. Thankfully, Microsoft provides a great tool called the Application Verifier, which makes the process significantly gentler.
In this post, we will look at how to use the Application Verifier to pinpoint the source of a heap overflow in a binary. Because it is difficult to find a publicly available and easy-to-trigger heap overflow vulnerability in an application whose EULA does not prevent reverse engineering, I have created a small sample application that contains a heap overflow for this purpose.
The sample application (contactsheap) simply parses a custom “contact” file (.ct) and displays it neatly. This trivial file format was designed for the specific purpose of this post and is not (to my knowledge) used anywhere.
The output below shows a sample run of the application on a contact file (phil.ct).
As you can see, the contact file in question contains the details for someone called Phil, age 35 from Austin, TX.
If we use the “xxd” utility (available from http://unxutils.sourceforge.net) to dump the contact file as hex, we can already see that the format is quite simple.
For the sake of this post, however, let’s pretend that we have run a fuzzer against phil.ct and triggered a crash when it is opened with contactsheap.exe. To investigate this crash we can begin by running the application within the cdb debugger. This debugger is part of the Debugging Tools for Windows package, and is basically the command line version of WinDbg. The Debugging Tools for Windows package is available at Microsoft.com.
As you can see, this application is crashing on a read instruction reading from the address 0x41414141 (“AAAA”). If we use the “k” command to determine the stack back trace for the running application, we can see that this occurred after ExitProcess() was called. From this we can probably make a guess that the heap was smashed with the repeating “A” character; however, we don’t know the reason why at this stage. This is where the Application Verifier helps us out.
The Application Verifier (appverif.exe) is a utility created by Microsoft to aid with the investigation of a variety of software bugs. It is available as a small download from the Microsoft website. It provides a variety of options for monitoring different aspects of an application at runtime. However, in order to limit the scope of this post, we will focus on the heap debugging functionality.
The options for heap debugging present in the Application Verifier are a combination of the gflags and pageheap functionality, accessible through one convenient user interface. Essentially, this functionality allows us to force an application to use the pageheap allocator instead of the default memory allocator on Windows. The pageheap allocator gives each requested chunk its own page (or pages), placing the chunk at the end so that it finishes right at the page boundary. It then makes sure that the page immediately after the allocation is inaccessible. This works as a guard page, and means that if a memory access runs past the end of the allocated chunk it will result in an instant access violation at the exact time of access.
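To make the guard-page idea concrete, here is a minimal sketch in C of the address arithmetic involved. It is not the pageheap implementation: the helper names are made up for illustration, and it ignores the alignment that the real allocator applies to the returned pointer. It simply assumes the block is placed so that it ends exactly where the inaccessible page begins.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Windows API): number of accessible pages
 * needed to hold a block of `size` bytes. Always at least one page. */
static size_t pages_needed(size_t size, size_t page_size)
{
    size_t pages = (size + page_size - 1) / page_size;
    return pages == 0 ? 1 : pages;
}

/* Offset from the start of the accessible region at which the block
 * must begin so that its last byte sits directly before the guard page.
 * Simplification: the real allocator also aligns the returned pointer. */
size_t guarded_block_offset(size_t size, size_t page_size)
{
    return pages_needed(size, page_size) * page_size - size;
}

/* Address of the first byte past the block, i.e. the first byte of the
 * guard page. `region_base` is assumed to be page aligned. */
uintptr_t first_guard_address(uintptr_t region_base, size_t size,
                              size_t page_size)
{
    return region_base + pages_needed(size, page_size) * page_size;
}
```

With a 4 KB page, guarded_block_offset(21, 4096) gives 4075: a 21-byte block starts 4075 bytes into its page, so the 22nd byte written lands on the next page boundary, which is always a page-aligned address.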
In order to begin using this functionality we start by running the appverif.exe application (typically C:\WINDOWS\SYSTEM32\APPVERIF.EXE), right-clicking on the Applications field and selecting “Add Application”.
We can then browse to our contactsheap.exe application and click OK. This adds contactsheap.exe to our Application textbox. The Tests field on the right-hand side of the window allows us to select the various run-time tests we wish to enable for our application. For the sake of this post though, we’re only interested in the “Basics -> Heaps” tests. If we right-click on these and select Properties we can fine tune our heap debugging options.
As you can see from the screenshot below, there are a variety of options for configuring our heap tests.
We will run through a few of the relevant options we need to debug the contactsheap vulnerability mentioned above. However, the rest of the options are explained in detail in the help file that ships with the Application Verifier.
The first option on our list, “Full”, toggles between the usage of “normal page heap” or “full page heap”. Full page heap is what was described above with an unmapped guard page after each allocation. For obvious reasons, this is a very slow process, and can cause some applications to be completely unresponsive. In contrast, Normal page heap simply uses “cookie” values before and after each allocated chunk. When a chunk is HeapFree()’ed or HeapAlloc()’ed, the integrity of the current heap is checked. This is clearly much less overhead than using the full pageheap method, although it will not be as accurate.
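The cookie approach can be sketched in a few lines of C. This is an illustration of the general technique only: the fill byte, the padding sizes and the function names below are invented for the example and do not reflect the actual values or layout used by normal page heap.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All three values are arbitrary choices for this sketch. */
#define PREFIX_LEN 16u   /* stored size followed by fill bytes */
#define SUFFIX_LEN 8u    /* fill bytes after the user data     */
#define FILL_BYTE  0xA0

/* Allocate `size` user bytes surrounded by known fill patterns. */
void *checked_alloc(size_t size)
{
    if (size > SIZE_MAX - PREFIX_LEN - SUFFIX_LEN)
        return NULL;                            /* total would wrap */
    unsigned char *raw = malloc(PREFIX_LEN + size + SUFFIX_LEN);
    if (raw == NULL)
        return NULL;
    memset(raw, FILL_BYTE, PREFIX_LEN);
    memcpy(raw, &size, sizeof size);            /* remember the size */
    memset(raw + PREFIX_LEN + size, FILL_BYTE, SUFFIX_LEN);
    return raw + PREFIX_LEN;
}

/* Returns 1 if both patterns are intact, 0 if either was overwritten. */
int checked_block_ok(const void *user)
{
    const unsigned char *raw = (const unsigned char *)user - PREFIX_LEN;
    size_t size;
    memcpy(&size, raw, sizeof size);
    for (size_t i = sizeof size; i < PREFIX_LEN; i++)
        if (raw[i] != FILL_BYTE)
            return 0;
    for (size_t i = 0; i < SUFFIX_LEN; i++)
        if (raw[PREFIX_LEN + size + i] != FILL_BYTE)
            return 0;
    return 1;
}

/* Verify the block, release it, and report whether it was intact. */
int checked_free(void *user)
{
    int ok = checked_block_ok(user);
    free((unsigned char *)user - PREFIX_LEN);
    return ok;
}
```

Note that the corruption is only noticed when the check runs, which may be long after the bad write happened. That delay is the accuracy trade-off compared with a guard page.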
Another option available if Full page heap is required but the overhead is too great is to specify a size range using the size fields shown above. These fields let you select a range of chunk sizes in which to use the page heap. The rest of the allocations will be allocated using the normal allocator. This results in a faster solution, but has the downside that the approximate size of the chunk you’re overflowing must be known prior to debugging.
The Windows memory allocator has been designed in such a way that a different “front end allocator” can be used in different situations. On Windows XP the default was to use a Look-aside list as the front end allocator; however, on Windows Vista and later the default is now to use the Low Fragmentation Heap (LFH). The option UseLFHGuardPages, shown at the bottom of the panel above, causes guard pages to be inserted in the case that the LFH front-end allocator is being used. This is turned on since I’m using Windows 7 for this test.
Once we have selected our options, we can click OK and then Save to apply our settings. This will create registry entries for the application with the settings so that they will be applied whenever the application is invoked. Now we are ready to once again run contactsheap.exe under the cdb debugger.
As you can see, once again we have a crash accessing an unmapped memory address. However, this time rather than it being a memory read instruction to 0x41414141 we have a crash on a write instruction, moving one byte into the location 0x06e42000. The fact that our address is aligned to a page boundary (a multiple of 0x1000) indicates already that we are probably accessing the start of one of our heap guard pages. If we use the “r” command to dump the contents of the eax register we can see that it contains the value 41. We can therefore assume that this instruction is smashing the heap with “A”s.
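The alignment observation is easy to check by hand, but for completeness here is a small C sketch of the test. The function names are made up for this example, and the 0x1000 page size is the usual x86 value rather than something queried from the system.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 0x1000u  /* assumed 4 KB x86 page */

/* Offset of an address within its page. */
uint32_t page_offset(uint32_t addr)
{
    return addr & (PAGE_SIZE_BYTES - 1u);
}

/* 1 if the address is the first byte of a page, 0 otherwise. */
int is_page_start(uint32_t addr)
{
    return page_offset(addr) == 0;
}
```

For the two faulting addresses seen in this post, is_page_start(0x06e42000) is true while is_page_start(0x41414141) is not, which fits the idea that the second crash happened on the very first byte of a guard page.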
If we once again use the ‘k’ command to dump the call stack for our application we can get a clearer picture of what is going on.
Luckily for us, we have debug symbols for this binary. But even if we didn’t, the MSVCR and kernel32 functions would still be named correctly. As you can see, after the C runtime startup code finished, the wmain() function was executed. From here, the read_record() function was called. This function called sprintf(), which is a known unsafe function, as it performs no bounds checking when it copies a string.
If we load the binary (contactsheap.exe) up in IDA Pro and jump to the read_record() function we can very clearly see the call to sprintf().
Here we can see that the sprintf() call used the format string “%s %s”, so it was concatenating two strings together. But before we can completely understand this vulnerability we must first track down where the destination string was allocated. In some cases, this can be quite difficult, but again the Application Verifier makes our job much easier.
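For comparison, here is a hedged sketch of how the same “%s %s” concatenation could be written with a bounds-checked call. It uses C99 snprintf(), which is not what the sample application does; the function name and its return convention are made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: join two strings with a space, refusing to
 * write more than dst_size bytes (including the terminating NUL).
 * Returns 0 on success, -1 if the result would not fit. */
int join_fields(char *dst, size_t dst_size, const char *a, const char *b)
{
    /* snprintf returns the length the full result would have had. */
    int n = snprintf(dst, dst_size, "%s %s", a, b);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;   /* encoding error or truncated output */
    return 0;
}
```

The important detail is the return value check: snprintf() silently truncates, so comparing the returned length against the buffer size is what tells us the destination was too small.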
When the pageheap functionality is enabled for an application, each memory allocation has its callstack logged at the time of allocation. This functionality makes it trivial to discover where an allocation took place at the time of crash.
This information is easily accessible from within windbg. First, however, we can look at how it’s stored. When an allocation takes place, pageheap populates a _DPH_BLOCK_INFORMATION structure and stores it directly before the chunk itself. The format of this structure is as follows:
As you can see, this structure is a treasure trove of information for us to use in further investigating our vulnerability. We can see what size was requested by the program, as well as what size was actually allocated after rounding takes place. We can recognize these structures in memory by their StartStamp and EndStamp values. For an allocated block under normal page heap, StartStamp is 0xabcdaaaa and EndStamp is 0xdcbaaaaa; under full page heap the corresponding values are 0xabcdbbbb and 0xdcbabbbb.
In order to locate the _DPH_BLOCK_INFORMATION structure for our particular crash, we can use the !heap WinDbg extension. The -x option will report information about a particular address. If we pass it the current value of edx minus four it will report the starting address of our structure.
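To show how this header can be used programmatically, here is a rough C sketch that reads the header sitting directly before a user pointer and validates its stamps. The struct below is my approximation of the 32-bit layout, written with fixed-width fields so it compiles anywhere; the type and function names are invented for this example, so check the field offsets against the “dt” output on your own system before relying on them.

```c
#include <stdint.h>
#include <string.h>

/* Assumed 32-bit layout of the page heap block header (32 bytes).
 * Pointers are stored as uint32_t because the target is 32-bit. */
typedef struct {
    uint32_t start_stamp;
    uint32_t heap;            /* owning heap                      */
    uint32_t requested_size;  /* bytes asked for by the program   */
    uint32_t actual_size;     /* bytes actually set aside         */
    uint32_t free_queue[2];   /* list entry, shared with an index */
    uint32_t stack_trace;     /* address of the allocation trace  */
    uint32_t end_stamp;
} dph_block_info32;

/* Copy the header that precedes `user` into *out. Returns 1 if both
 * stamps match the expected values passed by the caller, 0 otherwise. */
int read_dph_header(const unsigned char *user, dph_block_info32 *out,
                    uint32_t expected_start, uint32_t expected_end)
{
    memcpy(out, user - sizeof *out, sizeof *out);
    return out->start_stamp == expected_start &&
           out->end_stamp == expected_end;
}
```

The expected stamps are passed in rather than hard-coded so that the same check can be reused with whichever stamp values apply to the page heap mode in use.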
We can then use this address with the “dt” (dump type) command to display the bytes at this address in the form of our _DPH_BLOCK_INFORMATION structure.
From this information we can see that a 0x15 (21) byte allocation was requested. This was rounded to 0x35 (53) during the allocation process. We can also see that the stack trace information is stored at the address 0x04bcf79c.
To dump the stack trace in a readable fashion we can use the dds command. This command means “dump dwords with symbols”, and shows where each address is located.
The most interesting entries in this backtrace for us are those in the contactsheap module itself. We can see that the function directly before the call to RtlAllocateHeap took place was the “read_record” function. This means that the allocation took place in this function. To get some context on this we can use the “ub” (unassemble backwards) command in cdb to dump the previous 5 instructions before the call to HeapAlloc.
In order to do further investigation on this we will need to move to static analysis in IDA Pro. Before we go into this, however, I will just mention that exploring the pageheap metadata can also be done using the ‘!heap’ extension. To view the options for this, as well as information on the technique described above, you can use the ‘!heap -p -?’ command.
If we browse the section of the binary where our allocation takes place in IDA Pro, we can see each argument to HeapAlloc() labeled with its name.
We can see from this listing that the number of bytes allocated by HeapAlloc came from the edx register. Also you may notice that the size is a result of the calculation of ecx + eax + 5. It seems logical that this instruction might be responsible for an integer overflow, as there is no bounds checking performed on the values of eax and ecx prior to this being executed.
The final step in our exploration is to work out where the values of the variables var_50 and var_74 came from in order to determine the exact criteria that lead to our heap overflow condition. We can do this by investigating the cross references (places in the binary where the variable is used) for each variable in turn. To start this we can click on the var_50 variable and press the “X” key. This brings up a list of the x-refs for the variable.
Next we select each x-ref in turn and investigate them. Looking at the first x-ref we can see that the result of a function called “ReadString” is stored in it. We can see from the function prototype that the function takes two arguments, an integer and a (void **).
Since we know that the value we’re looking at is definitely an integer (we know this because it’s used as the number of bytes to allocate with HeapAlloc), we can make a guess that it’s probably the length/number of bytes read by the ReadString function. We could confirm this by reversing the function; for the sake of brevity, however, we will assume that it is true (since I wrote the vulnerable application, I’m pretty sure it’s a safe bet). ReadString reads a length from the file handle provided. It then reads that many bytes from the file and stores them in a string. The length that was read in first is then returned by the function.
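To illustrate the behaviour just described, here is a hypothetical C reconstruction of a ReadString-style function. It is not the code from contactsheap.exe: it takes a FILE * rather than an integer handle, and it assumes the length prefix is a 32-bit value in host byte order, neither of which is confirmed by the disassembly shown here.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: read a 32-bit length, then that many bytes.
 * On success *out points to a NUL-terminated copy and the length is
 * returned. On failure *out is NULL and 0 is returned. A genuine
 * empty string also returns 0, but leaves *out non-NULL. */
uint32_t read_string(FILE *fp, void **out)
{
    uint32_t len = 0;
    *out = NULL;

    if (fread(&len, sizeof len, 1, fp) != 1)
        return 0;                    /* no length prefix available */
    if (len == UINT32_MAX)
        return 0;                    /* len + 1 could wrap to zero */

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return 0;
    if (fread(buf, 1, len, fp) != len) {
        free(buf);                   /* file shorter than claimed  */
        return 0;
    }
    buf[len] = '\0';
    *out = buf;
    return len;
}
```

The key point for this bug is that the returned length comes straight from the file, so whoever crafts the file controls it completely.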
Looking at the second variable, var_74, we can see that it is used in exactly the same way, as a size value from ReadString. With this in mind, we can get a high-level overview of the vulnerability. Two length-encoded strings are read in. Their lengths are added together and the result is used to decide how many bytes to allocate. Then the strings are sprintf()’ed into the buffer. However, due to a wrap-around condition when calculating the length to allocate, the copy can go well out of the bounds of the allocated buffer.
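The wrap-around is easy to reproduce in isolation. The sketch below mirrors the size calculation described in this post (two lengths plus 5) using 32-bit unsigned arithmetic, alongside a checked variant that rejects sums that would wrap. The function names and the example lengths are my own; only the “ecx + eax + 5” shape comes from the disassembly.

```c
#include <stdint.h>

/* Mirrors the unchecked calculation: the sum wraps modulo 2^32. */
uint32_t alloc_size_unchecked(uint32_t len1, uint32_t len2)
{
    return (uint32_t)(len1 + len2 + 5u);
}

/* Checked variant. Returns 1 and stores the size in *out when the
 * sum fits in 32 bits; returns 0 and leaves *out alone otherwise. */
int alloc_size_checked(uint32_t len1, uint32_t len2, uint32_t *out)
{
    if (len1 > UINT32_MAX - 5u)
        return 0;                    /* len1 + 5 would wrap        */
    if (len2 > UINT32_MAX - 5u - len1)
        return 0;                    /* adding len2 would wrap     */
    *out = len1 + len2 + 5u;
    return 1;
}
```

As one possible example, lengths of 0xFFFFFFF0 and 0x20 give an unchecked result of 0x15, so a tiny buffer is allocated for what the file claims is several gigabytes of string data. The checked variant refuses that pair outright.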
With this information in mind we can begin the process of fixing (or exploiting) the vulnerability in question. Hopefully, if you’ve read up to here you’ve learned something from all this. If anyone is interested in receiving a copy of the binary mentioned in this post, just email firstname.lastname@example.org and let me know.
For those of you analyzing bugs under platforms other than Windows, similar functionality can be achieved using Valgrind on Linux/BSD/Mac OS X. Also on Mac OS X, a custom pageheap implementation (libgmalloc) is shipped by default that can be preloaded. This can be used by setting DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib.