This article is about stack-based buffer overflows on the Win32 platform.
What is a Stack-Based Buffer Overflow on the Win32 Platform?
Buffer overflow is a very common and widely known software security vulnerability. It occurs when more data is written into a block of memory (a buffer) than the buffer was allocated to hold. Because a buffer is created to hold a fixed amount of data, the excess spills over into adjacent memory, overwriting or corrupting valid data that is already stored there.
In order to exploit buffer overflow, one should have basic knowledge about topics like stacks, CPU registers, memory allocation, etc.
As it is a very vast topic in itself, in this article we will try to understand the basics of the Stack Based Buffer Overflow.
First of all, we will create a simple C program and cover the basics: how the program runs in memory, how a function call takes place, what the return address is, and so on. So let’s start with the basics.
A stack is a contiguous block of memory used to store temporary data within a program. The stack works on a Last In, First Out (LIFO) basis. PUSH and POP are its two basic operations: PUSH puts data onto the stack and POP removes data from it. On Intel-based systems the stack grows downwards, from higher memory addresses towards lower memory addresses.
On the 32-bit Intel architecture, each PUSH and POP operation normally moves 4 bytes (32 bits) of data. Basically, the stack holds the following types of program data:
- Argument to the Function
- Calling Function Address
- Return Address
- Local Variable
and a couple of other things.
We will see each of them in detail later in this article. Before that, let us install the tools needed for the practical session. Here, we are using the following setup:
- Windows XP Service Pack 2, configured on a virtual machine.
- Immunity Debugger (We can download it by searching Google or by clicking on the URL given in the references at the end of the article.)
- Dev-C++ (We can download the tool using the form below.)
Note: We are assuming that you have configured the required tools and have basic knowledge about assembly language.
First of all, we will look at how a function call, the return address, and related details appear in memory, starting from the very beginning. We have already written a simple C program that defines a function and calls it from the main program.
The EXE file and the source code of the program can be downloaded from the URL given at the end of the article. Let’s have a look at the source code so that we can understand the basic concepts.
As seen in the image above, we wrote a simple program in which we defined some local variables and then called a function with one argument. The function prints the message “Function is called” and then returns the value 1 to the main program.
After compiling the program, we open the program using the Immunity Debugger. We can open the program by clicking on the file menu or by dragging and dropping the .exe file into the debugger. Then we will see the following type of code on the screen.
Now we see four sections (windows) on the screen. Each one shows a different view of the program's state:
- Window 1 (disassembly): here we can see the disassembled code of the .exe file.
- Window 2 (registers): this section shows the CPU registers and their current values.
- Window 3 (memory dump): here we can see what data the program has written to memory. We can also change these values dynamically as required.
- Window 4 (stack): this is the most important part. It shows the current state of the program stack.
If we look closely at the screen, we can see the word “Paused” in the right corner of the window. This means that the debugger paused the program right after loading it, so we have to start the program manually.
Now let’s start our analysis.
The first step is to identify the main function and the other functions we defined in the program, as they appear once loaded into memory. If we scroll down in the first window, which shows the disassembled program, we can see the main function, labelled with an ASCII string and the name of the EXE, followed by the other function in assembly language. In our case, the EXE file is named First.exe, so that name appears on the screen with some extra value appended.
We have pointed out some numbers in the screenshot above. These numbers are defined in the section below for better understanding.
- This is the main function in assembly language. We can see it in the screenshot labelled “First.xxxxx ASCII “Main Function””. Here, First.xxx is the name of the .exe file we loaded into the debugger. On the left side we see the memory address of each assembly instruction. This is the address at which the instruction is located in the program's (virtual) memory. In our case, 00401290 is the address where the main function starts, and 004012DA is the address where it ends.
- It is a function call from the main program. This line basically calls the function we defined in the C program.
If we take a closer look at the function call in main, we can see that it calls “First.004012D3”, where the eight digits after the dot are the address of the function that will be called.
Finally, we can see the function itself loaded into memory. It prints the message “Function is called” using printf. The function begins by pushing the EBP register onto the stack. In our case, the starting address of the function is “004012DE” and the ending address is “004012FE”. Both addresses can be seen in the screenshot above.
Now comes the most important part: setting a breakpoint in the program. A breakpoint freezes the program at a chosen instruction and lets us inspect things like the register values, the stack, and the frame pointer.
It also allows us to change values dynamically. So let’s create a breakpoint just before the function call. To set a breakpoint, select the line by clicking on it and press F2.
As shown in the above screenshot, the row is highlighted after the breakpoint is created. Now we need to start the program, which we can do by pressing the F9 key. We will see some of the values on the screen change.
In the above screenshot, we can see that the stack contents and the register values have changed. Another point of interest is the EIP register, which now holds the address of the instruction on which we set the breakpoint.
From here we have two options for continuing execution:
- Step Into: executes the next instruction and, if that instruction is a function call, follows it into the function. We use it when we want to go through the code one instruction at a time after hitting a breakpoint. The keyboard shortcut for Step Into is F7.
- Step Over: sometimes we don’t want to go into the details of a function call. In such situations we can use Step Over, which runs the whole call as a single step. The keyboard shortcut for Step Over is F8.
Now we need to move to the next instruction, so we press F7 (Step Into) as described above. Let’s now analyze the stack.
As shown in the above screenshot, nothing interesting appears in window no 4, but we can see in window no 1 that the next instruction has been executed, and in window no 2 that the EIP register now holds the address of the following instruction.
We execute the next instruction by pressing the F7 key again. After this step, the screen will look like this:
We see that another instruction has been executed, and there are a lot of changes on the screen. First, we notice that execution has moved to the address specified in the function call. This can be seen in window #1.
If we look closely at window #4, we see the address 004012D4 at the top of the stack. This is the return address into the main program. It means that once the function finishes executing, control returns to the main program at this address, and the main program then runs to completion.
So in this article we learned…
- Analysing the Stack
- Creating the Breakpoint
- Analysing the Function Call and Register Status