Shellcode is defined as a small piece of code used as the payload in the exploitation of a software vulnerability. You can read a much more detailed analysis on Wikipedia here, which is where I got the definition from.
I bring up shellcode at this point because the exploit program from HTAE, exploit_notesearch.c, contains our first example of shellcode. We will go over the code when we get to actually exploiting our programs.
By now you should have been working in a Linux environment for at least the last few posts. If you are totally new to Linux then you might not be familiar with shells. If you have heard of them but aren’t really familiar with them this might be a worthwhile read for some new information.
The thing you type all your commands into on a Linux box is called a shell. Unless you have gone in and changed it, you’re most likely using bash, which stands for Bourne Again Shell. If you’re running BSD, the default shell has traditionally been tcsh on FreeBSD, and most likely on FreeBSD-based systems as well. Windows uses PowerShell. It’s good to get familiar with PowerShell as well, but that will come down the road.
The shell is where we get command prompts. On Linux systems like Ubuntu Server and Fedora Server, a shell is the only interface you get by default for using the OS, so it’s a very good idea to get comfortable with shells. The shell is an ordinary user space program, and it is the main way we drive the computer as a normal user. (User space and kernel space are important concepts to understand as we learn about exploiting operating systems; you can learn about them briefly here.) The root user, on the other hand, is another thing entirely. The root user can execute any command on the system, including commands that will crash it. That is why I said it is generally a bad idea to set a program to run as the root user on execution. If you give a program root privileges, it can do anything it wants on your system, which is exactly what an attacker would like to have.
Interacting With the Shell
What makes a shell useful is what we can do with it. Two things make it easier to accomplish specific tasks in a shell: Perl and shell scripting. Perl is a programming language that lets us generate and manipulate command line input more easily. For example, if you didn’t find it fun to count out 30 A’s in our previous exploits, we could have used this command (full disclosure: this example is straight from HTAE):
$ perl -e 'print "A" x 30;'
We could have then used this as our program input in the form of:
$ ./auth_overflow $(perl -e 'print "A"x30')
This is a much more precise way of passing exactly what we want to our program.
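If you are more comfortable in Python than Perl, the same repetition trick can be sketched there too. This is only an illustration and not something from HTAE; the function name repeat_payload is my own.

```python
import sys


def repeat_payload(char: str = "A", count: int = 30) -> bytes:
    """Return `count` copies of a single ASCII character as raw bytes."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    return char.encode("ascii") * count


if __name__ == "__main__":
    # Write raw bytes so nothing (such as a trailing newline) is added.
    sys.stdout.buffer.write(repeat_payload())
```

The one-liner form, python3 -c 'print("A" * 30)', can be dropped into the same $( ) command substitution as the Perl version, since the shell strips the trailing newline.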
Shell scripts are just text files that allow us to combine multiple shell commands and instructions into a single file we can call easily. We will be looking at them in more detail later. The important part to know right now is that you specify the interpreter for the script at the top of the file (the #! line), followed by the commands. To invoke it, you run it just like any other executable file after you have set its permissions.
Hexadecimal and ASCII
You should be familiar with hexadecimal numbers. The memory addresses we have been seeing so far have all been in hex. Using ASCII, we can represent each letter of the alphabet as a hexadecimal number. We already saw an example of this in the overflow_example.c program: when the value variable was overwritten, we saw it fill up with 41s. That happened because the ASCII code for A is 0x41. We can use the ASCII code of a character to pass it into a program as part of our shellcode. The trick is that we have to write \x41 to pass A to our program. Here’s an example.
$ perl -e 'print "\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21";'
Hello World!
I simply looked up each character in the ASCII table and entered that character’s hex value to get the output.
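Looking up every character by hand gets old quickly, so here is a small Python sketch that goes in both directions. It is just an illustration; the names HELLO and to_hex_escapes are my own.

```python
# The same twelve bytes used in the Perl example, written as hex escapes.
HELLO = b"\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21"


def to_hex_escapes(text: str) -> str:
    """Turn ASCII text into a string of \\xNN escapes, one per character."""
    return "".join(f"\\x{byte:02x}" for byte in text.encode("ascii"))


if __name__ == "__main__":
    print(HELLO.decode("ascii"))           # Hello World!
    print(to_hex_escapes("Hello World!"))  # the escape string from the Perl example
```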
Here is an example of using ASCII and Perl with the overflow_example.c program.
$ ./overflow_example $(perl -e 'print "A"x16 . "\xad\xaa\xba"')
[BEFORE] buffer_two is at 0xbffff28c and contains 'two'
[BEFORE] buffer_one is at 0xbffff294 and contains 'one'
[BEFORE] value is at 0xbffff29c and is 5 (0x00000005)
[STRCPY] copying 19 bytes into buffer_two
[AFTER] buffer_two is at 0xbffff28c and contains 'AAAAAAAAAAAAAAAA���'
[AFTER] buffer_one is at 0xbffff294 and contains 'AAAAAAAA���'
[AFTER] value is at 0xbffff29c and is 12233389 (0x00baaaad)
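The first item to point out is that the bytes were entered in reverse order: we typed \xad\xaa\xba, but value reads 0x00baaaad. This is because x86 is little-endian, meaning it stores the least significant byte of a value at the lowest memory address. That’s a discussion that can be found here.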
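The reversed byte order can be checked without running the vulnerable program. Here is a minimal Python sketch using the standard struct module, where "<I" means a little-endian unsigned 32-bit integer. The helper names pack_le32 and unpack_le32 are my own.

```python
import struct


def pack_le32(value: int) -> bytes:
    """Pack an unsigned 32-bit integer with the least significant byte first."""
    return struct.pack("<I", value)


def unpack_le32(raw: bytes) -> int:
    """Read four little-endian bytes back into an integer."""
    return struct.unpack("<I", raw)[0]


if __name__ == "__main__":
    stored = pack_le32(0x00BAAAAD)
    print(stored.hex())              # adaaba00
    print(hex(unpack_le32(stored)))  # 0xbaaaad
```

The bytes come out as ad, aa, ba, 00, which is exactly the order they had to be typed in on the command line.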
The second item to point out is that we see the word baaaad stored in value. Don’t mistake baaaad for a string of letters, though: these are just hexadecimal digits that happen to spell a word. If we change the input, we get a different result.
$ ./overflow_example $(perl -e 'print "A"x16 . "\xoh\xno"')
[BEFORE] buffer_two is at 0xbffff28c and contains 'two'
[BEFORE] buffer_one is at 0xbffff294 and contains 'one'
[BEFORE] value is at 0xbffff29c and is 5 (0x00000005)
[STRCPY] copying 20 bytes into buffer_two
[AFTER] buffer_two is at 0xbffff28c and contains 'AAAAAAAAAAAAAAAAohno'
[AFTER] buffer_one is at 0xbffff294 and contains 'AAAAAAAAohno'
[AFTER] value is at 0xbffff29c and is 1869506671 (0x6f6e686f)
Segmentation fault
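Here the escapes were not valid hex (o, h, and n are not hexadecimal digits), so, as the output shows, what reached the program was just the four characters ohno. The buffers are printed as strings, so we see the characters themselves. The value variable is printed as a number, so we see the ASCII codes of the same four bytes instead: 0x6f is o, 0x6e is n, and 0x68 is h, in reverse order because of little-endian storage.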
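To make both runs easier to follow, here is a rough Python model of what strcpy does to the three variables. It is a sketch under assumptions: the offsets (buffer_two at 0, buffer_one at 8, value at 16) are inferred from the addresses printed above, the 32-byte memory block is arbitrary, and simulate_overflow is my own name rather than anything from overflow_example.c.

```python
import struct

# Offsets inferred from the printed addresses 0xbffff28c, 0xbffff294, 0xbffff29c.
BUFFER_TWO, BUFFER_ONE, VALUE = 0, 8, 16
MEMORY_SIZE = 32  # arbitrary; just enough room for these examples


def c_string(memory: bytearray, offset: int) -> bytes:
    """Read bytes from offset up to (not including) the next NUL, like %s."""
    end = memory.index(0, offset)
    return bytes(memory[offset:end])


def simulate_overflow(argument: bytes) -> dict:
    """Model strcpy(buffer_two, argument) with no bounds checking."""
    memory = bytearray(MEMORY_SIZE)
    memory[BUFFER_TWO:BUFFER_TWO + 4] = b"two\x00"
    memory[BUFFER_ONE:BUFFER_ONE + 4] = b"one\x00"
    memory[VALUE:VALUE + 4] = struct.pack("<I", 5)

    copied = argument + b"\x00"  # strcpy also copies the terminating NUL
    if len(copied) > MEMORY_SIZE:
        raise ValueError("argument too long for this toy model")
    memory[BUFFER_TWO:BUFFER_TWO + len(copied)] = copied

    return {
        "buffer_two": c_string(memory, BUFFER_TWO),
        "buffer_one": c_string(memory, BUFFER_ONE),
        "value": struct.unpack_from("<I", memory, VALUE)[0],
    }


if __name__ == "__main__":
    for arg in (b"A" * 16 + b"\xad\xaa\xba", b"A" * 16 + b"ohno"):
        result = simulate_overflow(arg)
        print(result["buffer_two"], result["buffer_one"], hex(result["value"]))
```

The model does not crash the way the real program did in the second run. It only shows that the same bytes are read once as a string and once as a little-endian integer.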
The last thing we are going to talk about this time is the nop sled. The term nop stands for no operation, an instruction that does nothing except take up memory. A nop sled is a long series of these instructions placed in front of the payload. The point is to fill up a section of memory so that if execution lands anywhere in that section, the processor will keep executing nop instructions until it reaches the memory containing the executable payload, essentially making the attack target larger.
Imagine trying to hit the exact center of a target with an arrow, but we aren’t that good of a shot. So we reshape the target into a cone, so that if we hit anywhere in the cone, the arrow is forced to the center of the target (as long as we ignore some physics, but you get the idea). Now all we need to do is hit anywhere on the cone of the much larger target and the arrow will reach the center.
The nop sled structure can be used in an attempt to counteract address randomization. It can also be used to help when we don’t know the exact location of the address we need. It has been used so frequently that some intrusion detection methods search for nop sleds.
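As a rough picture of how such a buffer is laid out, here is a Python sketch that glues together a nop sled, a payload, and a repeated return address. Everything specific in it is an assumption for illustration: the payload is four placeholder bytes (0xcc, the x86 breakpoint instruction) and not real shellcode, the return address is simply borrowed from the earlier output, and build_payload is my own name. The one hard fact is that the single-byte x86 nop instruction is 0x90.

```python
import struct

NOP = b"\x90"  # single-byte x86 nop instruction


def build_payload(shellcode: bytes, sled_length: int,
                  return_address: int, repeats: int) -> bytes:
    """Lay out: [nop sled][shellcode][return address repeated]."""
    sled = NOP * sled_length
    # The address is packed little-endian, as x86 expects it in memory.
    address_block = struct.pack("<I", return_address) * repeats
    return sled + shellcode + address_block


if __name__ == "__main__":
    placeholder = b"\xcc" * 4   # stand-in bytes, NOT working shellcode
    guess = 0xBFFFF29C          # hypothetical address, for illustration only
    payload = build_payload(placeholder, sled_length=60,
                            return_address=guess, repeats=10)
    print(len(payload), payload[:4].hex(), payload[-4:].hex())
```

The longer the sled, the less precise the guessed address has to be, since any address that lands inside the sled slides into the payload.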
In the next post we are going to explore the exploit_notesearch.c program from HTAE. We will introduce a nop sled and a piece of shellcode to the notesearch.c program and use them to exploit a vulnerability.
We are also starting to get into some discussions about key operating system concepts, such as user space, kernel space, shells, and little-endianness. I would suggest at the very least looking up these terms and, if possible, getting hold of an operating systems textbook or course notes. Here is a free operating systems textbook. It doesn’t go into much detail, but from what I’ve skimmed, it at least gives an overview of what’s going on.