Simple Stack Overflow Example.

Last time we took a look at the memory of a running process. We focused on the stack in particular.  This time we are going to create a simple buffer overflow and use gdb to take a look at how the overflow works in memory.

The Program.
We are going to use the overflow_example.c program from Hacking: The Art of Exploitation. Note that this program uses the strcpy function, which is inherently unsafe because it never checks that the input fits in the destination buffer. This is desirable in a program meant to demonstrate the danger of unchecked input, but not in general.  If we were trying to be safe we would use a bounded copy such as strncpy (taking care to null-terminate the result ourselves) or snprintf. That said, here’s the code:

//A program to demonstrate a buffer overflow.
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int value = 5;
    char buffer_one[8], buffer_two[8];

    strcpy(buffer_one, "one"); //Put "one" in buffer one
    strcpy(buffer_two, "two"); //Put "two" in buffer two

    printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[BEFORE] value is at %p and is %d (0x%08x)\n", &value, value, value);

    printf("\n[STRCPY] copying %d bytes into buffer_two\n\n", strlen(argv[1]));
    strcpy(buffer_two, argv[1]); //copy first command line argument into buffer two.

    printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[AFTER] value is at %p and is %d (0x%08x)\n", &value, value, value);
    
}

We are declaring an int and two buffers of size 8 bytes.  Then we copy some values into each.  When we execute the program we give a command line argument that is then copied into buffer_two and if we make it large enough we see it overwriting buffer_one.
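For contrast, here is a minimal sketch of what a bounded copy could look like. The helper name copy_bounded is hypothetical and is not part of the book's program; it relies on snprintf, which writes at most the given number of bytes and always null-terminates when that number is nonzero.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the original program): copy src into dst
 * without ever writing more than dst_size bytes.
 * Returns 1 if src fit completely, 0 if it had to be truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;

    /* snprintf writes at most dst_size bytes, terminator included,
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed >= 0 && (size_t)needed < dst_size;
}
```

Calling copy_bounded(buffer_two, sizeof buffer_two, argv[1]) in place of the second strcpy would cut a 14 character argument down to 7 characters and leave buffer_one alone, which is exactly why the demonstration program does not do it.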

This program is not much more complicated than the example program from last time in terms of functionality.  We call a function that moves some data and then print some lines.  However, disassembling main in gdb shows that things got a whole lot more complicated from the assembly standpoint. The output is the following:

(gdb) disass main
Dump of assembler code for function main:
   0x0804845b <+0>:     lea    ecx,[esp+0x4]
   0x0804845f <+4>:     and    esp,0xfffffff0
   0x08048462 <+7>:     push   DWORD PTR [ecx-0x4]
   0x08048465 <+10>:    push   ebp
   0x08048466 <+11>:    mov    ebp,esp
   0x08048468 <+13>:    push   ebx
   0x08048469 <+14>:    push   ecx
   0x0804846a <+15>:    sub    esp,0x20
   0x0804846d <+18>:    mov    ebx,ecx
   0x0804846f <+20>:    mov    DWORD PTR [ebp-0xc],0x5
   0x08048476 <+27>:    lea    eax,[ebp-0x14]
   0x08048479 <+30>:    mov    DWORD PTR [eax],0x656e6f
   0x0804847f <+36>:    lea    eax,[ebp-0x1c]
   0x08048482 <+39>:    mov    DWORD PTR [eax],0x6f7774
   0x08048488 <+45>:    sub    esp,0x4
   0x0804848b <+48>:    lea    eax,[ebp-0x1c]
   0x0804848e <+51>:    push   eax
   0x0804848f <+52>:    lea    eax,[ebp-0x1c]
   0x08048492 <+55>:    push   eax
   0x08048493 <+56>:    push   0x8048600
   0x08048498 <+61>:    call   0x8048310 <printf@plt>
   0x0804849d <+66>:    add    esp,0x10
   0x080484a0 <+69>:    sub    esp,0x4
   0x080484a3 <+72>:    lea    eax,[ebp-0x14]
   0x080484a6 <+75>:    push   eax
   0x080484a7 <+76>:    lea    eax,[ebp-0x14]
   0x080484aa <+79>:    push   eax
   0x080484ab <+80>:    push   0x8048630
   0x080484b0 <+85>:    call   0x8048310 <printf@plt>
   0x080484b5 <+90>:    add    esp,0x10
   0x080484b8 <+93>:    mov    edx,DWORD PTR [ebp-0xc]
   0x080484bb <+96>:    mov    eax,DWORD PTR [ebp-0xc]
   0x080484be <+99>:    push   edx
   0x080484bf <+100>:   push   eax
   0x080484c0 <+101>:   lea    eax,[ebp-0xc]
   0x080484c3 <+104>:   push   eax
   0x080484c4 <+105>:   push   0x8048660
   0x080484c9 <+110>:   call   0x8048310 <printf@plt>
   0x080484ce <+115>:   add    esp,0x10
   0x080484d1 <+118>:   mov    eax,DWORD PTR [ebx+0x4]
   0x080484d4 <+121>:   add    eax,0x4
   0x080484d7 <+124>:   mov    eax,DWORD PTR [eax]
   0x080484d9 <+126>:   sub    esp,0xc
   0x080484dc <+129>:   push   eax
   0x080484dd <+130>:   call   0x8048340 <strlen@plt>
---Type <return> to continue, or q <return> to quit---
   0x080484e2 <+135>:   add    esp,0x10
   0x080484e5 <+138>:   sub    esp,0x8
   0x080484e8 <+141>:   push   eax
   0x080484e9 <+142>:   push   0x804868c
   0x080484ee <+147>:   call   0x8048310 <printf@plt>
   0x080484f3 <+152>:   add    esp,0x10
   0x080484f6 <+155>:   mov    eax,DWORD PTR [ebx+0x4]
   0x080484f9 <+158>:   add    eax,0x4
   0x080484fc <+161>:   mov    eax,DWORD PTR [eax]
   0x080484fe <+163>:   sub    esp,0x8
   0x08048501 <+166>:   push   eax
   0x08048502 <+167>:   lea    eax,[ebp-0x1c]
   0x08048505 <+170>:   push   eax
   0x08048506 <+171>:   call   0x8048320 <strcpy@plt>
   0x0804850b <+176>:   add    esp,0x10
   0x0804850e <+179>:   sub    esp,0x4
   0x08048511 <+182>:   lea    eax,[ebp-0x1c]
   0x08048514 <+185>:   push   eax
   0x08048515 <+186>:   lea    eax,[ebp-0x1c]
   0x08048518 <+189>:   push   eax
   0x08048519 <+190>:   push   0x80486bc
   0x0804851e <+195>:   call   0x8048310 <printf@plt>
   0x08048523 <+200>:   add    esp,0x10
   0x08048526 <+203>:   sub    esp,0x4
   0x08048529 <+206>:   lea    eax,[ebp-0x14]
   0x0804852c <+209>:   push   eax
   0x0804852d <+210>:   lea    eax,[ebp-0x14]
   0x08048530 <+213>:   push   eax
   0x08048531 <+214>:   push   0x80486ec
   0x08048536 <+219>:   call   0x8048310 <printf@plt>
   0x0804853b <+224>:   add    esp,0x10
   0x0804853e <+227>:   mov    edx,DWORD PTR [ebp-0xc]
   0x08048541 <+230>:   mov    eax,DWORD PTR [ebp-0xc]
   0x08048544 <+233>:   push   edx
   0x08048545 <+234>:   push   eax
   0x08048546 <+235>:   lea    eax,[ebp-0xc]
   0x08048549 <+238>:   push   eax
   0x0804854a <+239>:   push   0x804871c
   0x0804854f <+244>:   call   0x8048310 <printf@plt>
   0x08048554 <+249>:   add    esp,0x10
   0x08048557 <+252>:   lea    esp,[ebp-0x8]
   0x0804855a <+255>:   pop    ecx
   0x0804855b <+256>:   pop    ebx
   0x0804855c <+257>:   pop    ebp
   0x0804855d <+258>:   lea    esp,[ecx-0x4]
   0x08048560 <+261>:   ret    
---Type <return> to continue, or q <return> to quit---
End of assembler dump.

There are also some registers we didn’t see last time: ECX, EBX, and EDX.  EBX is called the base register and is traditionally used as a base pointer for memory access.  ECX is the counter register, used for loops and shifts.  EDX is the data register, used for I/O port access, arithmetic, and some interrupts.  In this listing the compiler simply uses them as general purpose registers: ecx and ebx hold a pointer to main’s arguments, and edx holds a copy of value.  You can find out more information about registers here, which is where I got my information from.

Some new assembly instructions in this listing are lea, and, push, and pop.  lea stands for ‘load effective address’: it computes the address described by its source operand and stores that address, not the memory contents, in the destination.  In my case the address esp+0x4 is computed and stored in ecx.  and is the bitwise logical AND; here and esp,0xfffffff0 clears the low four bits of esp, which aligns the stack to a 16 byte boundary.  push and pop are stack operations.  push places a value on top of the stack, and pop removes the value on top of the stack and stores it in the register given as its operand.  So if the value on top of the stack is 12 the instruction pop ebx will store the value 12 in the ebx register.  Now that we have some more information about our program internals we can look at what happens when we run it.
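To make these four instructions a little more concrete, here is a toy model in C. It is only a sketch under my own assumptions: the ToyStack type and the toy_ functions are hypothetical names, a small byte array stands in for stack memory, and nothing here touches the real esp.

```c
#include <stdint.h>
#include <string.h>

/* Toy model for illustration only: mem stands in for stack memory and
 * sp plays the role of esp as an index into it. There is no bounds
 * checking, so do not push more than 16 values. */
typedef struct {
    uint8_t  mem[64];
    uint32_t sp;
} ToyStack;

void toy_init(ToyStack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = sizeof s->mem;          /* the stack starts at the top */
}

/* push: move sp down by 4 bytes, then store the value there. */
void toy_push(ToyStack *s, uint32_t value)
{
    s->sp -= 4;
    memcpy(&s->mem[s->sp], &value, 4);
}

/* pop: load the value at sp, then move sp back up by 4 bytes. */
uint32_t toy_pop(ToyStack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->sp], 4);
    s->sp += 4;
    return value;
}

/* lea dst,[base+disp]: only the address is computed, no memory is read. */
uint32_t toy_lea(uint32_t base, int32_t disp)
{
    return base + (uint32_t)disp;
}

/* and esp,0xfffffff0: clear the low four bits, rounding down to 16. */
uint32_t toy_align16(uint32_t esp)
{
    return esp & 0xfffffff0u;
}
```

Pushing 12 and popping it again returns 12 and leaves sp where it started, and toy_align16 never raises an address, it only rounds it down.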

Execution.

I ran the program the first time with a command line argument chosen to be less than buffer_two’s memory allocation.  I wanted to see what the intended set up for a program like this looks like.  The output is as follows:

$ ./overflow_example abcdefg
[BEFORE] buffer_two is at 0xbffff2bc and contains 'two'
[BEFORE] buffer_one is at 0xbffff2c4 and contains 'one'
[BEFORE] value is at 0xbffff2cc and is 5 (0x00000005)

[STRCPY] copying 7 bytes into buffer_two

[AFTER] buffer_two is at 0xbffff2bc and contains 'abcdefg'
[AFTER] buffer_one is at 0xbffff2c4 and contains 'one'
[AFTER] value is at 0xbffff2cc and is 5 (0x00000005)

The output is self-explanatory.  Everything looks like it should.  I ran this through gdb and did a similar exploration to the last post.  There are two main events that we should be concerned with.  The first is when we initialize the variables and the second is when we copy the command line argument into buffer_two.  Everything else is printing data or declaring variables.

Reading symbols from ./overflow_example...done.
(gdb) list
1       //A program to demonstrate a buffer overflow.
2       #include <stdio.h>
3       #include <string.h>
4
5       int main(int argc, char *argv[])
6       {
7           int value = 5;
8           char buffer_one[8], buffer_two[8];
9
10          strcpy(buffer_one, "one"); //Put "one" in buffer one
(gdb) 
11          strcpy(buffer_two, "two"); //Put "two" in buffer two
12
13          printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
14          printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
15          printf("[BEFORE] value is at %p and is %d (0x%08x)\n", &value, value, value);
16
17          printf("\n[STRCPY] copying %d bytes into buffer_two\n\n", strlen(argv[1]));
18          strcpy(buffer_two, argv[1]); //copy first command line argument into buffer two.
19
20          printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) 
21          printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
22          printf("[AFTER] value is at %p and is %d (0x%08x)\n", &value, value, value);
23          
24      }
(gdb) break 12
Breakpoint 1 at 0x8048488: file overflow_example.c, line 12.
(gdb) break 19
Breakpoint 2 at 0x804850e: file overflow_example.c, line 19.
(gdb) 

I put break points right after each of these actions so we can examine the pointers and memory.  Running the program and examining the registers esp, ebp, and eip we get:

Breakpoint 1, main (argc=2, argv=0xbffff324) at overflow_example.c:13
13          printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff250       0xbffff250
ebp            0xbffff278       0xbffff278
eip            0x8048488        0x8048488 <main+45>
(gdb) x/16xw $esp
0xbffff250:     0xbffff4b8      0x0000002f      0x08049930      0x006f7774
0xbffff260:     0x00000002      0x00656e6f      0xbffff330      0x00000005 (1)
0xbffff270:     0xbffff290      0xb7fbd000      0x00000000      0xb7e2da63
0xbffff280:     0x08048570      0x00000000      0x00000000      0xb7e2da63

We can see the location of value at (1).  Note that value sits at a higher address than both buffers (ebp-0xc, versus ebp-0x14 for buffer_one and ebp-0x1c for buffer_two in the disassembly); this is demonstrated very nicely later.  We can also print what is contained in each buffer and examine their memory.

(gdb) print buffer_two
$1 = "two\000\002\000\000"
(gdb) print buffer_one
$2 = "one\000\060\363\377\277"
(gdb) x/8xw &buffer_two
0xbffff25c: 0x006f7774 0x00000002 0x00656e6f 0xbffff330
0xbffff26c: 0x00000005 0xbffff290 0xb7fbd000 0x00000000
(gdb) x/8xw &buffer_one
0xbffff264: 0x00656e6f 0xbffff330 0x00000005 0xbffff290
0xbffff274: 0xb7fbd000 0x00000000 0xb7e2da63 0x08048570

It is interesting to see this in memory.  Remember that 8 bytes are allocated to each of buffer_one and buffer_two.  If we subtract the two addresses (0xbffff264 - 0xbffff25c) we see the 8 byte difference.  If you look carefully at the x/8xw &buffer_two output you can see buffer_one stacked on top of buffer_two and value stacked on top of buffer_one.  Note that the bytes within each word look backwards; that’s because x86 is little endian.  For example, 0x006f7774 is the bytes 0x74 0x77 0x6f 0x00 in memory, which is ‘t’, ‘w’, ‘o’ and a null.  When we hit continue we go to the next break point.

Breakpoint 2, main (argc=2, argv=0xbffff324) at overflow_example.c:20
20          printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff250       0xbffff250
ebp            0xbffff278       0xbffff278
eip            0x804850e        0x804850e <main+179>
(gdb) x/16xw $esp
0xbffff250:     0xbffff4b8      0x0000002f      0x08049930      0x64636261
0xbffff260:     0x00676665   (2)0x00656e6f      0xbffff330      0x00000005 (1)
0xbffff270:     0xbffff290      0xb7fbd000      0x00000000      0xb7e2da63
0xbffff280:     0x08048570      0x00000000      0x00000000      0xb7e2da63

Note that esp and ebp have the same values as at the first break point, which should be expected because the number and structure of the frames on the stack haven’t changed. We can see the contents of buffer_two have changed though: the words at 0xbffff25c and 0xbffff260 are now 0x64636261 and 0x00676665, which is ‘abcdefg’ plus its null terminator in little endian order.  This reflects copying the command line argument into buffer_two.  As we should expect, buffer_one (2) and value (1) are still the same.  Nothing all that exciting is happening right now, but I like how this demonstrates how we are changing the values in memory and how the stack is laid out in memory.
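If reading the words in the dump by eye gets tedious, the unpacking can be done in a few lines of C. This is just a sketch; words_to_bytes is a hypothetical helper name of my own, and it assumes the words were printed by x/xw on a little endian machine like the one used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Unpack 32-bit words, as printed by gdb's x/xw, into bytes in memory
 * order (lowest address first), the way a little endian x86 stores them.
 * out must have room for 4 * count bytes. */
void words_to_bytes(const uint32_t *words, size_t count, unsigned char *out)
{
    for (size_t i = 0; i < count; i++) {
        for (int b = 0; b < 4; b++) {
            /* Byte b of the word lives at (address of the word) + b. */
            out[4 * i + b] = (unsigned char)((words[i] >> (8 * b)) & 0xff);
        }
    }
}
```

Feeding it the two words that make up buffer_two after the copy, 0x64636261 and 0x00676665, gives back the bytes a, b, c, d, e, f, g and a null, in that order.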

Let’s Break It.

So let’s execute the buffer overflow using the same break points as before.  Running the program normally we get the following output:

$ ./overflow_example AAAAAAAAAAAAAA
[BEFORE] buffer_two is at 0xbffff2ac and contains 'two'
[BEFORE] buffer_one is at 0xbffff2b4 and contains 'one'
[BEFORE] value is at 0xbffff2bc and is 5 (0x00000005)

[STRCPY] copying 14 bytes into buffer_two

[AFTER] buffer_two is at 0xbffff2ac and contains 'AAAAAAAAAAAAAA'
[AFTER] buffer_one is at 0xbffff2b4 and contains 'AAAAAA'
[AFTER] value is at 0xbffff2bc and is 5 (0x00000005)

This time we input 14 ‘A’s as a command line argument, which is 6 characters more than buffer_two has room for. Since strcpy does no bounds checking, it copies all 14 characters plus the terminating null byte, 15 bytes in total, into the 8 byte buffer. Since buffer_one starts 8 bytes after buffer_two, everything past the first 8 bytes lands in buffer_one: six ‘A’s and the null byte. Let’s look at what happens in gdb.

(gdb) run AAAAAAAAAAAAAA

Starting program: /home/exploit/Hacking/Exploits/overflow_example AAAAAAAAAAAAAA

Breakpoint 1, main (argc=2, argv=0xbffff324) at overflow_example.c:13
13          printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff250       0xbffff250
ebp            0xbffff278       0xbffff278
eip            0x8048488        0x8048488 <main+45>
(gdb) x/16xw $esp
0xbffff250:     0xbffff4b1      0x0000002f      0x08049930      0x006f7774
0xbffff260:     0x00000002      0x00656e6f      0xbffff330      0x00000005
0xbffff270:     0xbffff290      0xb7fbd000      0x00000000      0xb7e2da63
0xbffff280:     0x08048570      0x00000000      0x00000000      0xb7e2da63
(gdb) print buffer_two
$7 = "two\000\002\000\000"
(gdb) print buffer_one
$8 = "one\000\060\363\377\277"
(gdb) x/8xw buffer_two
0xbffff25c:     0x006f7774      0x00000002      0x00656e6f      0xbffff330
0xbffff26c:     0x00000005      0xbffff290      0xb7fbd000      0x00000000
(gdb) x/8xw buffer_one
0xbffff264:     0x00656e6f      0xbffff330      0x00000005      0xbffff290
0xbffff274:     0xb7fbd000      0x00000000      0xb7e2da63      0x08048570

Same setup as before at the first break point. A quick glance allows us to again identify the locations of buffer_two, buffer_one, and value in the stack. I’ve already talked about this above so let’s examine what happens at break point two.

(gdb) cont
Continuing.
[BEFORE] buffer_two is at 0xbffff25c and contains 'two'
[BEFORE] buffer_one is at 0xbffff264 and contains 'one'
[BEFORE] value is at 0xbffff26c and is 5 (0x00000005)

[STRCPY] copying 14 bytes into buffer_two


Breakpoint 2, main (argc=2, argv=0xbffff324) at overflow_example.c:20
20          printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff250       0xbffff250
ebp            0xbffff278       0xbffff278
eip            0x804850e        0x804850e <main+179>
(gdb) x/16xw $esp
0xbffff250:     0xbffff4b1      0x0000002f      0x08049930      0x41414141
0xbffff260:     0x41414141      0x41414141      0xbf004141      0x00000005
0xbffff270:     0xbffff290      0xb7fbd000      0x00000000      0xb7e2da63
0xbffff280:     0x08048570      0x00000000      0x00000000      0xb7e2da63
(gdb) print buffer_two
$9 = "AAAAAAAA"
(gdb) print buffer_one
$10 = "AAAAAA\000\277"

Here’s where we see something happening. We can see the ‘A’s filling up everything starting at buffer_two and continuing into buffer_one. It’s interesting to point out though that the program still treats the same address ranges as buffer_two and buffer_one; only their contents changed. We can easily break the program even more. Let’s push a lot more into buffer_two:

$ ./overflow_example AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[BEFORE] buffer_two is at 0xbffff29c and contains 'two'
[BEFORE] buffer_one is at 0xbffff2a4 and contains 'one'
[BEFORE] value is at 0xbffff2ac and is 5 (0x00000005)

[STRCPY] copying 30 bytes into buffer_two

[AFTER] buffer_two is at 0xbffff29c and contains 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
[AFTER] buffer_one is at 0xbffff2a4 and contains 'AAAAAAAAAAAAAAAAAAAAAA'
[AFTER] value is at 0xbffff2ac and is 1094795585 (0x41414141)
Segmentation fault

Our program crashed.  This time I will skip the first break point because it’s the same as the previous two situations.  The interesting portion is at break point two.

Breakpoint 2, main (
    argc=<error reading variable: Cannot access memory at address 0x41414141> 
    argv=<error reading variable: Cannot access memory at address 0x41414145>
    at overflow_example.c:20
20          printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff240       0xbffff240
ebp            0xbffff268       0xbffff268
eip            0x804850e        0x804850e <main+179>

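So we already see something new. gdb cannot read argc and argv because the pointer it uses to find them (apparently the copy of ecx saved on the stack) now holds 0x41414141, which is ‘AAAA’. The program itself has not crashed yet at this break point. The segmentation fault above comes at the end of main, where that same overwritten value is popped back into ecx and used to restore esp, so ret tries to read a return address from memory the process doesn’t have access to.  The other thing that jumps out here is the values of esp and ebp.  In the two previous runs of our program those registers were the same, but now both are 16 bytes lower.  The overflow is not what moved them: they are set in main’s prologue, before strcpy runs.  The longer command line argument takes up more room at the top of the stack, and main aligns esp to a 16 byte boundary, so the whole frame starts lower. Let’s look at the stack.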

(gdb) x/16xw $esp
0xbffff240:     0xbffff4a1      0x0000002f      0x08049930   (3)0x41414141
0xbffff250:     0x41414141   (2)0x41414141      0x41414141   (1)0x41414141
0xbffff260:     0x41414141      0x41414141      0x00004141      0xb7e2da63
0xbffff270:     0x08048570      0x00000000      0x00000000      0xb7e2da63

That’s a big change. We knew that buffer_two (3) and buffer_one (2) would be overwritten. Since value (1) is after buffer_one we should have expected it to be overwritten as well. But it doesn’t stop there. The data went past value and started overwriting other information that was stored on the stack.  Let’s look at the memory that ebp is pointing to:

(gdb) x/xw $ebp
0xbffff268:     0x00004141

This is actually a nice outcome because the location of ebp is obvious: the last two ‘A’s and the null terminator landed in the word ebp points to.  We could have found the same thing by counting bytes as well though. Just to see what happens I decided to break the program even more and pushed even more into buffer_two.

(gdb) cont
Continuing.
[BEFORE] buffer_two is at 0xbffff22c and contains 'two'
[BEFORE] buffer_one is at 0xbffff234 and contains 'one'
[BEFORE] value is at 0xbffff23c and is 5 (0x00000005)

[STRCPY] copying 60 bytes into buffer_two


Breakpoint 2, main (
    argc=, 
    argv=)
    at overflow_example.c:20
20          printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
(gdb) i r esp ebp eip
esp            0xbffff220       0xbffff220
ebp            0xbffff248       0xbffff248
eip            0x804850e        0x804850e <main+179>

We are now copying 60 bytes into the 8 byte buffer.  Notice the difference in the stack pointer: esp is 32 bytes lower than in the 30 byte run.  Let’s take a look at the memory around esp and ebp:

(gdb) x/16xw $esp
0xbffff220:     0xbffff483      0x0000002f      0x08049930      0x41414141
0xbffff230:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff240:     0x41414141      0x41414141      0x41414141(ebp) 0x41414141
0xbffff250:     0x41414141      0x41414141      0x41414141      0x41414141
(gdb) x/16xw $ebp
0xbffff248:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff258:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff268:     0xbffff300      0xb7fed79a      0x00000002      0xbffff2f4
0xbffff278:     0xbffff294      0x0804994c      0x0804821c      0xb7fbd000

We have overwritten quite a bit of information now. It’s the same as before, just more of it.
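To put a name on how far each run got, here is a small sketch that maps an argument length to the last stack slot strcpy writes. The Slot names and last_slot_written are hypothetical names of my own, and the offsets are read off the disassembly of this particular build, so a different compiler or different flags would give a different layout.

```c
#include <stddef.h>

/* Slots starting at buffer_two, in increasing address order. The layout
 * is taken from the disassembly in this post (offsets from ebp) and is
 * specific to this build. */
typedef enum {
    IN_BUFFER_TWO,    /* ebp-0x1c .. ebp-0x15 */
    IN_BUFFER_ONE,    /* ebp-0x14 .. ebp-0x0d */
    IN_VALUE,         /* ebp-0x0c .. ebp-0x09 */
    IN_SAVED_ECX,     /* ebp-0x08 .. ebp-0x05 */
    IN_SAVED_EBX,     /* ebp-0x04 .. ebp-0x01 */
    IN_SAVED_EBP,     /* ebp+0x00 .. ebp+0x03 */
    BEYOND_SAVED_EBP  /* ebp+0x04 and above   */
} Slot;

/* strcpy writes arg_len characters plus one terminating null byte,
 * starting at buffer_two. Return the slot that holds the last byte
 * written, which is the null terminator. */
Slot last_slot_written(size_t arg_len)
{
    size_t last = arg_len;   /* offset of the terminator from buffer_two */

    if (last < 8)  return IN_BUFFER_TWO;
    if (last < 16) return IN_BUFFER_ONE;
    if (last < 20) return IN_VALUE;
    if (last < 24) return IN_SAVED_ECX;
    if (last < 28) return IN_SAVED_EBX;
    if (last < 32) return IN_SAVED_EBP;
    return BEYOND_SAVED_EBP;
}
```

With the lengths used above, 7 stays inside buffer_two, 14 ends in buffer_one, 30 ends in the word ebp points to (matching the 0x00004141 we saw there), and 60 runs well past it.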

Conclusion.

In this program we got a nice demonstration of what happens when we push more into a buffer than was intended.  It was a contrived example but I found it to be informative to look at how the stack changed.  There are a few implications that we can think about from this example.

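We saw that as we pushed more into buffer_two, esp and ebp moved to lower addresses, which at first looked to me like the overflow was pushing against something.  It isn’t: the distance between ebp and esp is 0x28 bytes in every run, and the [BEFORE] lines already print the lower addresses before strcpy is called.  What changes is where the frame starts, because a longer command line argument takes up more space at the top of the stack and main aligns esp to 16 bytes.  With 30 bytes the frame sat 16 bytes lower than in the first two runs; with 60 bytes it sat another 32 bytes lower.  What the overflow itself changes is how far past value we write.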
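One way to check this observation is to compare ebp and esp across the runs, using the register values from the gdb sessions above. The two function names below are hypothetical and the sketch does nothing more than subtraction, but it shows that the span between ebp and esp is 0x28 bytes every time, which suggests that only the starting address of the frame moved and not its size.

```c
#include <stdint.h>

/* Distance between ebp and esp inside main, from a gdb register dump. */
uint32_t frame_span(uint32_t ebp, uint32_t esp)
{
    return ebp - esp;
}

/* How many bytes lower one run's esp sits compared with an earlier run. */
uint32_t shift_below(uint32_t earlier_esp, uint32_t later_esp)
{
    return earlier_esp - later_esp;
}
```

For example, frame_span(0xbffff278, 0xbffff250) and frame_span(0xbffff248, 0xbffff220) both give 0x28, while shift_below(0xbffff250, 0xbffff240) gives 16 and shift_below(0xbffff240, 0xbffff220) gives 32.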

How far do we want to overwrite in a buffer overflow?  Crashing the program is accomplishing something but it would be nicer if we could crash the program by executing some code we put there instead.  How would we do this blindly and still control what is overwritten?  Can we figure out where a return address is and point that to something we want?  Can we overwrite the whole stack frame and possibly grab eip and point it somewhere?  As we continue I will go over the answers to some of these questions (hopefully all) and more.
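One hint toward the return address question is already in the disassembly. The last instructions of main pop the saved ecx from ebp-0x8 and then execute lea esp,[ecx-0x4] before ret, so in this particular build the place ret reads from depends on that saved word. Here is a minimal sketch of that arithmetic; return_slot_address is a hypothetical name of my own.

```c
#include <stdint.h>

/* Model of the end of main in the disassembly above:
 *   lea esp,[ebp-0x8]; pop ecx; pop ebx; pop ebp; lea esp,[ecx-0x4]; ret
 * Given the word stored at ebp-0x8 (the saved ecx), return the address
 * from which ret will fetch the return address. */
uint32_t return_slot_address(uint32_t saved_ecx)
{
    return saved_ecx - 4u;
}
```

In the untouched runs the word at ebp-0x8 is 0xbffff290, so ret reads from 0xbffff28c, which holds 0xb7e2da63 in the dumps. Once that word has been overwritten with 0x41414141, ret would read from 0x4141413d instead.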

I had a lot of fun going through all this information and I hope readers will find something useful here.  Next time I will experiment with this same program with all of the security features left on in my Debian 32-bit distro.  Then I will switch to a Debian 64-bit system and compare what occurs in the processes.
