Exploring Heap-Based Buffer Overflows with the Application Verifier

March 29, 2010 - 1 Comment

Isolating the root cause of a heap-based buffer overflow can be tricky. Thankfully, Microsoft provides a great tool called the Application Verifier, which makes the process significantly gentler.

In this post, we will look at how to use the Application Verifier to pinpoint the source of a heap overflow in a binary. Because it is difficult to find a publicly available and easy-to-trigger heap overflow vulnerability in an application whose EULA does not prevent reverse engineering, I have created a small sample application that contains a heap overflow for this purpose.

The sample application (contactsheap) simply parses a custom “contact” file (.ct) and displays it neatly. This trivial file format was designed for the specific purpose of this post and is not (to my knowledge) used anywhere.

The output below shows a sample run of the application on a contact file (phil.ct).

    C:\Users\user\Desktop\contactsheap\contactsheap\Debug>contactsheap.exe phil.ct
    -----[ contactsheap ]-------
    2010 Cisco Systems
    ----------------------------
    [+] Contact:
    Name:           Mr Phil Dangerfield
    Age:            35
    Location:       Austin, TX

As you can see, the contact file in question contains the details for someone called Phil, age 35 from Austin, TX.

If we use the “xxd” utility (available from http://unxutils.sourceforge.net) to produce a hex dump of the contact file, we can already see that the format is quite simple.

    C:\Users\user\Desktop\contactsheap\contactsheap\Debug>xxd phil.ct
    0000000: 1100 0000 5068 696c 2044 616e 6765 7266  ....Phil Dangerf
    0000010: 6965 6c64 0003 0000 0033 3500 0b00 0000  ield.....35.....
    0000020: 4175 7374 696e 2c20 5458 0003 0000 004d  Austin, TX.....M
    0000030: 7200 0d0a                                r....

For the sake of this post, however, let’s pretend that we have run a fuzzer against phil.ct and that it produced a mutated file, bad.ct, which crashes contactsheap.exe when opened. To investigate this crash we can begin by running the application within the cdb debugger. This debugger is part of the Debugging Tools for Windows package, and is basically the command-line version of WinDbg. The Debugging Tools for Windows package is available at Microsoft.com.

    C:\Users\user\Desktop\contactsheap\contactsheap\Debug>cdb contactsheap.exe bad.ct
    Microsoft (R) Windows Debugger Version 6.11.0001.404 AMD64
    Copyright (c) Microsoft Corporation. All rights reserved.
    CommandLine: contactsheap.exe bad.ct
    0:000:x86> g
    -----[ contactsheap ]-------
    2010 Cisco Systems
    ----------------------------
    [+] Contact: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[additional text removed]
    Phil Dangerfield
    Age:            35
    Location:       Austin, TX
    (23a0.1c4): Access violation - code c0000005 (first chance)
    First chance exceptions are reported before any exception handling.
    This exception may be expected and handled.
    ntdll32!RtlImageNtHeader+0x92f:
    77163913 8b12            mov     edx,dword ptr [edx]  ds:002b:41414141=????????
    0:000:x86>

As you can see, this application is crashing on a read instruction reading from the address 0x41414141 (“AAAA”). If we use the “k” command to display the stack backtrace for the running application, we can see that this occurred after ExitProcess() was called. From this we can guess that the heap was smashed with the repeating “A” character; however, we don’t yet know where or why. This is where the Application Verifier helps us out.

The Application Verifier (appverif.exe) is a utility created by Microsoft to aid with the investigation of a variety of software bugs. It is available as a small download from the Microsoft website. It provides a variety of options for monitoring different aspects of an application at runtime. However, in order to limit the scope of this post, we will focus on the heap debugging functionality.

The options for heap debugging present in the Application Verifier are a combination of the gflags and pageheap functionality, accessible through one convenient user interface. Essentially, this functionality allows us to force an application to use the pageheap allocator instead of the default memory allocator on Windows. The pageheap allocator gives each requested chunk its own page (or pages) and places the chunk at the end, so that the page immediately after the allocation is left unmapped. This works as a guard page, and means that any memory access past the end of the allocated chunk results in an instant access violation at the exact time of access.
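To make the guard page idea concrete, here is a minimal sketch in C of the placement arithmetic only. The names `guarded_layout` and `plan_guarded_alloc` are hypothetical, and the sketch ignores the alignment rules and metadata that the real pageheap allocator adds; it simply shows how a chunk can be positioned so that its last byte sits directly in front of the guard page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one guarded allocation (illustration only). */
struct guarded_layout {
    size_t data_pages;   /* accessible pages that hold the user data      */
    size_t user_offset;  /* offset of the user pointer from region base   */
    size_t guard_offset; /* offset of the inaccessible guard page         */
};

/* Place `size` bytes so that they end exactly where the guard page begins.
 * Assumption: no alignment padding and no allocator metadata. */
struct guarded_layout plan_guarded_alloc(size_t size, size_t page_size)
{
    struct guarded_layout l;

    assert(size > 0 && page_size > 0);
    l.data_pages   = (size + page_size - 1) / page_size; /* round up */
    l.guard_offset = l.data_pages * page_size;
    l.user_offset  = l.guard_offset - size;
    return l;
}
```

With a 0x1000-byte page, a 0x15-byte request would land at offset 0xfeb, so the first byte written past the end of the chunk falls on a page-aligned address inside the guard page.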

In order to begin using this functionality we start by running the appverif.exe application (typically C:\WINDOWS\SYSTEM32\APPVERIF.EXE), right-clicking on the Applications field and selecting “Add Application”.

We can then browse to our contactsheap.exe application and click OK. This adds contactsheap.exe to our Application textbox. The Tests field on the right-hand side of the window allows us to select the various run-time tests we wish to enable for our application. For the sake of this post though, we’re only interested in the “Basics -> Heaps” tests. If we right-click on these and select Properties we can fine tune our heap debugging options.

As you can see from the screenshot below, there are a variety of options for configuring our heap tests.

We will run through a few of the relevant options we need to debug the contactsheap vulnerability mentioned above. However, the rest of the options are explained in detail in the help file that ships with the Application Verifier.

The first option on our list, “Full”, toggles between the usage of “normal page heap” or “full page heap”. Full page heap is what was described above with an unmapped guard page after each allocation. For obvious reasons, this is a very slow process, and can cause some applications to be completely unresponsive. In contrast, Normal page heap simply uses “cookie” values before and after each allocated chunk. When a chunk is HeapFree()’ed or HeapAlloc()’ed, the integrity of the current heap is checked. This is clearly much less overhead than using the full pageheap method, although it will not be as accurate.
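The cookie approach can be sketched in a few lines of C. This is not how the Application Verifier implements it; the wrapper names, the padding length and the fill byte are all hypothetical, and the sketch ignores alignment. It only illustrates why corruption is noticed when the chunk is checked rather than at the moment of the overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define COOKIE_LEN  8     /* hypothetical padding size on each side */
#define COOKIE_BYTE 0xC5  /* hypothetical fill value                */

/* Layout: [size_t size][prefix cookie][user data][suffix cookie] */
void *cookie_alloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - sizeof(size_t) - 2 * COOKIE_LEN)
        return NULL;                      /* total size would overflow */
    raw = malloc(sizeof(size_t) + 2 * COOKIE_LEN + size);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);
    memset(raw + sizeof(size_t), COOKIE_BYTE, COOKIE_LEN);
    memset(raw + sizeof(size_t) + COOKIE_LEN + size, COOKIE_BYTE, COOKIE_LEN);
    return raw + sizeof(size_t) + COOKIE_LEN;
}

/* Returns 1 if both cookies are intact, 0 if either was overwritten. */
int cookie_check(const void *user)
{
    const unsigned char *p = user;
    const unsigned char *prefix;
    size_t size, i;

    assert(user != NULL);
    prefix = p - COOKIE_LEN;
    memcpy(&size, prefix - sizeof(size_t), sizeof size);
    for (i = 0; i < COOKIE_LEN; i++)
        if (prefix[i] != COOKIE_BYTE || p[size + i] != COOKIE_BYTE)
            return 0;
    return 1;
}

/* Checks the cookies, releases the block, and returns the check result. */
int cookie_free(void *user)
{
    int ok = cookie_check(user);

    free((unsigned char *)user - COOKIE_LEN - sizeof(size_t));
    return ok;
}
```

A one-byte overflow of a 21-byte chunk would go unnoticed until `cookie_check()` or `cookie_free()` runs, which is the accuracy trade-off described above: we learn that the heap was corrupted, but not which instruction did it.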

Another option available if Full page heap is required but the overhead is too great is to specify a size range using the size fields shown above. These fields let you select a range of chunk sizes in which to use the page heap. The rest of the allocations will be allocated using the normal allocator. This results in a faster solution, but has the downside that the approximate size of the chunk you’re overflowing must be known prior to debugging.

The Windows memory allocator has been designed in such a way that a different “front end allocator” can be used in different situations. On Windows XP the default was to use a Look-aside list as the front end allocator; however, on Windows Vista and later the default is now to use the Low Fragmentation Heap (LFH). The option UseLFHGuardPages, shown at the bottom of the panel above, causes guard pages to be inserted in the case that the LFH front-end allocator is being used. This is turned on since I’m using Windows 7 for this test.

Once we have selected our options, we can click OK and then Save to apply our settings. This will create registry entries for the application with the settings so that they will be applied whenever the application is invoked. Now we are ready to once again run contactsheap.exe under the cdb debugger.

    0:000:x86> g
    -----[ contactsheap ]-------
    2010 Cisco Systems
    ----------------------------
    [+] Contact:
    (3b9c.3a28): Access violation - code c0000005 (first chance)
    First chance exceptions are reported before any exception handling.
    This exception may be expected and handled.
    MSVCR90D.dll -
    MSVCR90D!getc_nolock+0x13c9:
    6944a189 8802            mov     byte ptr [edx],al  ds:002b:06e42000=??

As you can see, once again we have a crash accessing an unmapped memory address. However, this time, rather than a read from 0x41414141, we have a crash on a write instruction that moves one byte into the location 0x06e42000. The fact that the address is page aligned (a multiple of 0x1000) already suggests that we are probably touching the start of one of our heap guard pages. If we use the “r” command to dump the contents of the al register (the low byte of eax, and the source operand of the faulting instruction), we can see that it contains the value 0x41. We can therefore assume that this instruction is smashing the heap with “A”s.

    0:000:x86> r al
    al=41

If we once again use the ‘k’ command to dump the call stack for our application we can get a clearer picture of what is going on.

    0:000:x86> k
    ChildEBP RetAddr
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0039f3a8 6944a283 MSVCR90D!getc_nolock+0x13c9
    0039f3c0 69449fba MSVCR90D!getc_nolock+0x14c3
    0039f6f0 6940ca94 MSVCR90D!getc_nolock+0x11fa
    0039f740 011917f5 MSVCR90D!sprintf+0x114
    0039f89c 01191a94 contactsheap!read_record+0x135
    0039fa2c 01192148 contactsheap!wmain+0x144
    0039fa7c 01191f8f contactsheap!__tmainCRTStartup+0x1a8
    0039fa84 75213677 contactsheap!wmainCRTStartup+0xf
    0039fa90 77169d72 kernel32!BaseThreadInitThunk+0x12
    0039fad0 77169d45 ntdll32!RtlInitializeExceptionChain+0x63

Luckily for us, we have debug symbols for this binary. But even if we didn’t, the MSVCR and kernel32 functions would still be named correctly. As you can see, after the C runtime startup code finished, the wmain() function was executed. From here, the read_record() function was called. This function called sprintf(), which is a known unsafe function, as it performs no bounds checking when it writes to the destination buffer.

If we load the binary (contactsheap.exe) up in IDA Pro and jump to the read_record() function we can very clearly see the call to sprintf().

    mov esi, esp
    mov eax, [ebp+var_20]
    push eax
    mov ecx, [ebp+var_44]
    push ecx
    push offset aSS ; "%s %s"
    mov edx, [ebp+var_14]
    push edx ; char *
    call ds:__imp__sprintf
    add esp, 10h
    cmp esi, esp

Here we can see that the sprintf() call used the format string “%s %s”, so it was concatenating two strings together. But before we can completely understand this vulnerability we must first track down where the destination string was allocated. In some cases, this can be quite difficult, but again the Application Verifier makes our job much easier.

When the pageheap functionality is enabled for an application, each memory allocation has its callstack logged at the time of allocation. This functionality makes it trivial to discover where an allocation took place at the time of crash.

This information is easily accessible from within the debugger (cdb or WinDbg). First, however, we can look at how it’s stored. When an allocation takes place, pageheap populates a _DPH_BLOCK_INFORMATION structure and stores it directly before the chunk itself. The format of this structure is as follows:

    typedef struct _DPH_BLOCK_INFORMATION
    {
        ULONG StartStamp;
        PVOID Heap;
        ULONG RequestedSize;
        ULONG ActualSize;
        union
        {
            LIST_ENTRY FreeQueue;
            SINGLE_LIST_ENTRY FreePushList;
            WORD TraceIndex;
        };
        PVOID StackTrace;
        ULONG EndStamp;
    } DPH_BLOCK_INFORMATION, *PDPH_BLOCK_INFORMATION;

As you can see, this structure is a treasure trove of information for us to use in further investigating our vulnerability. We can see what size was requested by the program, as well as what size was actually allocated after rounding takes place. We can recognize these structures in memory by their StartStamp and EndStamp values. For an allocated block of the kind we examine below, StartStamp is initialized to the static value 0xabcdaaaa and EndStamp to 0xdcbaaaaa.

In order to locate the _DPH_BLOCK_INFORMATION structure for our particular crash, we can use the !heap debugger extension. The -x option reports information about a particular address. If we pass it the current value of edx minus four, it will report the starting address of our structure.

    0:000> !heap -x edx-4
    Entry     User      Heap      Segment       Size  PrevSize  Unused    Flags
    06e41fc0  06e41fc8  05d80000  05de1768        40      -          b  LFH;busy

We can then use this address with the “dt” (dump type) command to display the bytes at this address in the form of our _DPH_BLOCK_INFORMATION structure.

    0:000> dt _DPH_BLOCK_INFORMATION 6e41fc8
    verifier!_DPH_BLOCK_INFORMATION
       +0x000 StartStamp       : 0xabcdaaaa
       +0x004 Heap             : 0x85bb1000
       +0x008 RequestedSize    : 0x15
       +0x00c ActualSize       : 0x35
       +0x010 Internal         : _DPH_BLOCK_INTERNAL_INFORMATION
       +0x018 StackTrace       : 0x04bcf79c
       +0x01c EndStamp         : 0xdcbaaaaa

From this information we can see that a 0x15 (21) byte allocation was requested. This was rounded to 0x35 (53) during the allocation process. We can also see that the stack trace information is stored at the address 0x04bcf79c.
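If we only have a raw memory dump rather than a live debugger session, the same fields can be pulled out by hand. The sketch below is a hypothetical helper, not part of any Microsoft tool. It assumes the 32-bit layout shown in the dt output above (0x20 bytes, little-endian, stamps at offsets 0x00 and 0x1c); other Windows versions and 64-bit processes may use a different layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DPH_HEADER_SIZE 0x20u        /* assumed from the dt offsets above */
#define DPH_START_STAMP 0xabcdaaaau
#define DPH_END_STAMP   0xdcbaaaaau

/* Hypothetical view of the fields we care about. */
struct dph_header_view {
    uint32_t requested_size;
    uint32_t actual_size;
    uint32_t stack_trace;  /* 32-bit address of the saved allocation trace */
};

/* Read a little-endian 32-bit value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills `out` if both stamps match, 0 otherwise. */
int dph_decode(const uint8_t *hdr, size_t len, struct dph_header_view *out)
{
    assert(hdr != NULL && out != NULL);
    if (len < DPH_HEADER_SIZE)
        return 0;
    if (read_le32(hdr + 0x00) != DPH_START_STAMP)
        return 0;
    if (read_le32(hdr + 0x1c) != DPH_END_STAMP)
        return 0;
    out->requested_size = read_le32(hdr + 0x08);
    out->actual_size    = read_le32(hdr + 0x0c);
    out->stack_trace    = read_le32(hdr + 0x18);
    return 1;
}
```

Checking both stamps before trusting the other fields is a cheap way to avoid misreading arbitrary heap data as a header.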

To dump the stack trace in a readable fashion we can use the dds command. This command means “dump dwords with symbols”, and shows where each address is located.

    0:000> dds 04bcf79c
    04bcf79c 00000000
    04bcf7a0 00006001
    04bcf7a4 000d0000
    04bcf7a8 6655a6a7 verifier!AVrfpDphNormalHeapAllocate+0xd7
    04bcf7ac 66558f6e verifier!AVrfDebugPageHeapAllocate+0x30e
    04bcf7b0 772002fe ntdll!RtlDebugAllocateHeap+0x30
    04bcf7b4 771bac4b ntdll!RtlpAllocateHeap+0xc4
    04bcf7b8 77163b4e ntdll!RtlAllocateHeap+0x23a
    04bcf7bc 665bfd2c vfbasics!AVrfpRtlAllocateHeap+0xb1
    04bcf7c0 011917a4 contactsheap!read_record+0xe4
    04bcf7c4 01191a94 contactsheap!wmain+0x144
    04bcf7c8 01192148 contactsheap!__tmainCRTStartup+0x1a8
    04bcf7cc 01191f8f contactsheap!wmainCRTStartup+0xf
    04bcf7d0 75213677 kernel32!BaseThreadInitThunk+0xe
    04bcf7d4 77169d72 ntdll!__RtlUserThreadStart+0x70
    04bcf7d8 77169d45 ntdll!_RtlUserThreadStart+0x1b

The most interesting entries in this backtrace for us are those in the contactsheap module itself. We can see that the frame directly below the call to RtlAllocateHeap is the “read_record” function. This means that the allocation took place in this function. To get some context on this we can use the “ub” (unassemble backwards) command in cdb to dump the 5 instructions leading up to and including the call to HeapAlloc.

    0:000> ub 011917a4 L5
    contactsheap!read_record+0xd7:
    01191797 52              push    edx
    01191798 6a08            push    8
    0119179a 8b45f8          mov     eax,dword ptr [ebp-8]
    0119179d 50              push    eax
    0119179e ff15c0811901    call    dword ptr [contactsheap!_imp__HeapAlloc (011981c0)]

In order to investigate further we will need to move to static analysis in IDA Pro. Before we go into this, however, I will just mention that exploring the pageheap metadata can also be done using the ‘!heap’ extension. To view the options for this, as well as information on the technique described above, you can use the ‘!heap -p -?’ command.

If we browse the section of the binary where our allocation takes place in IDA Pro, we can see each argument to HeapAlloc() labeled with its name.

    .text:0041178B loc_41178B: ; CODE XREF: read_record(int)+C1j
    .text:0041178B mov eax, [ebp+var_74]
    .text:0041178E mov ecx, [ebp+var_50]
    .text:00411791 lea edx, [ecx+eax+5]
    .text:00411795 mov esi, esp
    .text:00411797 push edx ; dwBytes
    .text:00411798 push 8 ; dwFlags
    .text:0041179A mov eax, [ebp+hHeap]
    .text:0041179D push eax ; hHeap
    .text:0041179E call ds:__imp__HeapAlloc@12 ; HeapAlloc(x,x,x)
    .text:004117A4 cmp esi, esp

We can see from this listing that the number of bytes allocated by HeapAlloc came from the edx register. Also you may notice that the size is a result of the calculation of ecx + eax + 5. It seems logical that this instruction might be responsible for an integer overflow, as there is no bounds checking performed on the values of eax and ecx prior to this being executed.
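The wrap-around is easy to reproduce in C, since unsigned 32-bit addition behaves exactly like the lea instruction here. The function name below is hypothetical, and the input pair is chosen purely for illustration (the actual length values in bad.ct are not shown in this post), although 0xFFFFFFFF combined with a 0x11-byte string does happen to produce the 0x15-byte request we saw in the pageheap header.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors `lea edx, [ecx+eax+5]`: 32-bit addition that silently wraps. */
uint32_t alloc_size_unchecked(uint32_t len_a, uint32_t len_b)
{
    return len_a + len_b + 5u;
}

/* Worked examples of the arithmetic. */
void demo_wraparound(void)
{
    /* Well-formed input: 3 + 17 + 5 = 25 bytes. */
    assert(alloc_size_unchecked(0x03u, 0x11u) == 0x19u);

    /* Hostile input: the sum exceeds 32 bits and wraps to 21 bytes. */
    assert(alloc_size_unchecked(0xFFFFFFFFu, 0x11u) == 0x15u);
}
```

The resulting buffer is far smaller than the data that will later be copied into it, which is the precondition for the overflow.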

The final step in our exploration is to work out where the values of the variables var_50 and var_74 came from in order to determine the exact criteria that lead to our heap overflow condition. We can do this by investigating the cross references (places in the binary where the variable is used) for each variable in turn. To start this we can click on the var_50 variable and press the “X” key. This brings up a list of the x-refs for the variable.

Next we select each x-ref in turn and investigate them. Looking at the first x-ref we can see that the result of a function called “ReadString” is stored in it. We can see from the function prototype that the function takes two arguments, an integer and a (void **).

    .text:0041170B mov ecx, [ebp+arg_0]
    .text:0041170E push ecx ; int
    .text:0041170F call j_?ReadString@@YAKHPAPAX@Z ; ReadString(int,void * *)
    .text:00411714 add esp, 8
    .text:00411717 mov [ebp+var_50], eax
    .text:0041171A cmp [ebp+var_50], 0
    .text:0041171E jnz short loc_411728

Since we know that the value we’re looking at is definitely an integer (we know this because it’s used as the number of bytes to allocate with HeapAlloc), we can make a guess that it’s probably the length/number of bytes read by the ReadString function. We could confirm this by reversing the function. For the sake of conciseness, however, we can assume that this is true (since I wrote the vulnerable application, I’m pretty sure it’s a safe bet). ReadString reads a length from the file handle provided. It then reads that many bytes from the file and stores them in a string. The length that was read in first is then returned by the function.
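As a rough illustration of that behaviour, here is a C sketch of a ReadString-like routine. It is a reconstruction under assumptions, not the code from contactsheap: it uses a FILE * instead of the integer handle in the real prototype, it assumes the 4-byte little-endian length prefix suggested by the xxd dump earlier, and it adds a hypothetical MAX_FIELD_LEN cap that the original presumably lacks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical upper bound on one field; not part of the file format. */
#define MAX_FIELD_LEN 4096u

/* Reads a 4-byte little-endian length, then that many bytes into a newly
 * allocated buffer. Returns the length, or 0 on any error (*out is NULL). */
uint32_t read_string(FILE *fp, void **out)
{
    unsigned char hdr[4];
    unsigned char *buf;
    uint32_t len;

    assert(fp != NULL && out != NULL);
    *out = NULL;

    if (fread(hdr, 1, sizeof hdr, fp) != sizeof hdr)
        return 0;
    len = (uint32_t)hdr[0] | ((uint32_t)hdr[1] << 8) |
          ((uint32_t)hdr[2] << 16) | ((uint32_t)hdr[3] << 24);
    if (len == 0 || len > MAX_FIELD_LEN)
        return 0;                /* reject empty or implausible lengths */

    buf = malloc(len);
    if (buf == NULL)
        return 0;
    if (fread(buf, 1, len, fp) != len) {
        free(buf);               /* file ended before the field did */
        return 0;
    }
    *out = buf;
    return len;
}
```

The important detail for this bug is that the returned length comes straight from the file, so any caller that does arithmetic on it is working with attacker-controlled values.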

Looking at the second variable, var_74, we can see that it is used in exactly the same way, as a size value from ReadString. With this in mind, we can get a high-level overview of the vulnerability. Two length-encoded strings are read in. Their lengths are added together and the result is used to decide how many bytes to allocate. Then the strings are sprintf()’ed into the buffer. However, due to a wrap-around condition when calculating the length to allocate, the copy can go well out of the bounds of the allocated buffer.
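One possible shape for a fix is sketched below in C. It is not the actual patch for contactsheap; the function name and parameters are hypothetical. It rejects length pairs whose sum would wrap, and it replaces sprintf() with snprintf() so that the copy can never exceed the allocated size even if the lengths from the file are wrong.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Builds "<title> <name>" in a freshly allocated buffer.
 * title_len and name_len are the (untrusted) lengths read from the file.
 * Returns NULL if the size would wrap, the allocation fails, or the
 * formatted text does not fit in the computed size. */
char *join_title_and_name(const char *title, uint32_t title_len,
                          const char *name, uint32_t name_len)
{
    uint32_t total;
    char *buf;
    int written;

    assert(title != NULL && name != NULL);

    /* Refuse any pair of lengths for which len + len + 5 would wrap. */
    if (title_len > UINT32_MAX - 5u ||
        name_len > UINT32_MAX - 5u - title_len)
        return NULL;
    total = title_len + name_len + 5u;

    buf = malloc(total);
    if (buf == NULL)
        return NULL;

    /* snprintf never writes more than `total` bytes, terminator included. */
    written = snprintf(buf, total, "%s %s", title, name);
    if (written < 0 || (size_t)written >= total) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Either check alone would stop this particular crash, but keeping both means that a mistake in the size calculation cannot turn into a memory write outside the buffer.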

With this information in mind we can begin the process of fixing (or exploiting) the vulnerability in question. Hopefully, if you’ve read up to here you’ve learned something from all this. If anyone is interested in receiving a copy of the binary mentioned in this post, just email nearchib@cisco.com and let me know.

For those of you analyzing bugs on platforms other than Windows, similar functionality can be achieved using Valgrind on Linux/BSD/Mac OS X. Mac OS X also ships a guard-page allocator by default, libgmalloc, that can be preloaded by setting DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib.



  1. Instead of !heap -x, you can use !heap -p -a. Check out the other commands by doing “!heap -p -?”.