In this article we will learn about buffer overflows and format string attacks.
In the previous article, we learned about the basics of buffer overflows, how attackers exploit this vulnerability, and the various defenses that can be used against buffer overflows, such as canaries and the non-executable stack. In this installment of the series, we will learn about a very well-known but insidious form of attack known as a format string attack.
This attack is also the result of insecure or careless programming. We will work through an example in which I explain the vulnerability using stack diagrams, and finally see what defenses can be used against it. The vulnerability allows attackers to abuse the formatted output functions and use them to read and modify information in memory. Let’s discuss this vulnerability in detail below.
Format string attacks
These vulnerabilities are associated with the “printf” family of functions. Yes, you read that right: ‘printf’. Suppose a programmer wants to print a string held in a variable called output. The programmer can write either printf(output); or printf("%s", output);.
Now you may ask, “What is the difference between the two, since they both compile without any errors?” Imagine that output is set to “%d” in the first printf. printf will dutifully interpret the contents of output as a format string that asks for a decimal integer, and will go to the stack to fetch that integer. Since no such argument was passed, printf prints whatever value happens to be there. No error is reported, because as far as printf is concerned it has simply printed an integer.
Let’s dig into the stack
To understand this attack better, we will walk through an example and trace it on the stack.
Reading from the stack
Consider the following snprintf call:
“snprintf(buffer, buffer_size, input);”
The user input is passed in the position where the format string belongs, so whoever controls the input controls the format string. Suppose an attacker enters “%x%x%x” as the input. The call then becomes:
snprintf(buffer, buffer_size, "%x%x%x");
The attacker's input is now interpreted as a format string, so snprintf fetches the next three values from the stack, formats them as hexadecimal, and stores the result in buffer. If we now print buffer, we will see those three stack values in hexadecimal.
Writing to the stack
So far, we have read the contents of the stack by passing a format string as user input, but we can also write to the stack. Let’s see how.
The “%n” conversion stores the number of characters written so far, that is, before %n is encountered, into the integer that the corresponding argument points to. For example, consider a printf call whose format string has six characters before %n and whose argument is the address of a variable ‘test’.
This call stores the number 6 in test. Note that we have just written to the memory location of the variable ‘test’ using printf. Now let’s try a more complex example. Suppose there is some value on the stack that the attacker wants to change, and the program contains the following call:
snprintf(buffer, buffer_size, input);
The corresponding stack holds, in order, the arguments to snprintf, then a local integer “a” that is set to 1, and then the buffer itself.
Now let’s say the value the attacker wants to change is at address 0xaffbfca0. An attacker can obtain this address by studying the source code or by printing the contents of the stack. So in this example, let’s say the user input is:
“\xa0\xfc\xfb\xaf%d%n” so the snprintf call becomes
snprintf(buffer, buffer_size, "\xa0\xfc\xfb\xaf%d%n");
Let’s understand this user input. Note the “\”, which is used for escaping, and the x, which indicates a hexadecimal number. Two hexadecimal digits make up one byte, that is, one character, so the first part of the input is 4 characters long. Next the attacker types %d, which means print a decimal integer, but which integer does it print? The next value on the stack after the input is the integer “a”, which is set to 1, so 1 is printed and there are now five characters in total in the buffer (4 address bytes + 1 decimal digit).
Next in the input comes “%n”. As mentioned earlier, this conversion stores the number of characters written so far, which in this case is 5, into the memory location given by the next argument. Which memory location, one might ask? snprintf looks for the next argument. If one had been provided it would be used, but in this case there is no such argument, so snprintf takes the next entry on the stack. That entry is the start of the buffer, which now holds “\xa0\xfc\xfb\xaf”. These four bytes are read as the address 0xaffbfca0 (because the machine is little endian), and thus the value 5 is written to that address.
So we have seen how an attacker can write a number to a chosen memory location. That location could be where a return address is stored, so by overwriting it attackers can take control of the program.
Detecting and defending against format string attacks
The following section describes how we can detect and defend against format string attacks.
- Whenever user input contains escape sequences (“\”) or conversions such as %x, %d, or %n, it is likely that a format string attack is being attempted.
- The best way to defend against a format string attack is to ensure that the programmer always supplies a format string in calls to the printf, sprintf, fprintf, and snprintf functions, and never passes user input as the format string itself.
- Apply patches for affected software as soon as they become available.
In this article we have seen how a small function like printf can lead to serious problems if it is not used safely.