When an application receives external input, it must allocate memory to store that input. Many high-level programming languages do this behind the scenes, but some languages (like C and C++) let the programmer allocate memory directly, using functions like malloc.
A buffer overflow vulnerability occurs when an application tries to store more data in the allocated memory than there is room for. This can happen for a variety of reasons, including:
- Failure to check input length while reading
- Failure to allocate space for the null terminator
- Input lengths that cause an integer to overflow
Regardless of the reason, if an application writes to memory outside the bounds of its allocated buffer, it is writing to memory that the application uses for other purposes. Because of how memory is laid out in a running program, this can be extremely useful to an attacker, as it can allow them to control the execution of the program.
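As a minimal sketch of the null terminator case from the list above, the helper below copies an input string into freshly allocated memory. The name duplicate_input is hypothetical and used only for illustration. The point is that the allocation needs strlen(input) + 1 bytes, because strlen does not count the terminator.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (illustrative name): returns a heap copy of `input`,
// or nullptr on failure. The caller releases the copy with std::free.
char *duplicate_input(const char *input) {
    if (input == nullptr) {
        return nullptr;
    }
    std::size_t length = std::strlen(input);  // does not count the '\0'

    // Allocating only `length` bytes would leave no room for the terminator,
    // and the copy below would then write one byte past the buffer.
    char *copy = static_cast<char *>(std::malloc(length + 1));
    if (copy == nullptr) {
        return nullptr;
    }
    std::memcpy(copy, input, length + 1);     // copies the terminator as well
    return copy;
}
```

Calling duplicate_input("hello") under these assumptions yields a 6-byte allocation: five characters plus the terminator.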
Exploiting a buffer overflow
Exploiting the buffer overflow vulnerability is fairly simple. A buffer overflow vulnerability exists if a program incorrectly allocates memory for user input or insecurely reads data into that memory space. A hacker can exploit this vulnerability simply by providing an application with more input than the allocated buffer can hold.
A buffer overflow with nonsensical or random input is likely to cause a segmentation fault or program error. However, the structure of the stack means that a well-designed buffer overflow exploit can do much more, allowing an attacker to control the flow of execution and run malicious code on the system.
An application can allocate memory on the stack or on the heap. The stack is commonly used for function arguments and local variables, and the heap holds dynamically allocated memory (obtained with malloc in C or the new operator in C++). Both the stack and the heap can be exploited with a buffer overflow attack, but the structure of the stack makes it especially vulnerable.
As the name suggests, the stack is organized as a last-in, first-out structure. On most common architectures, the stack grows “down” from high memory addresses to lower ones. The current position in the stack is tracked by a register (the stack pointer) that points to the current top of the stack. As data is added to or removed from the stack, the stack pointer is updated accordingly.
As shown in the figure above, the stack contains several different types of variables. When a function is called by another function, information is put on the stack to give that function the data it needs to execute. This data is pushed onto the stack in the following order:
- Arguments to the called function (in reverse order)
- The return address (the address of the instruction to execute once the called function returns)
- Local variables of the called function
User input to a function will usually be stored in a local variable, which places it at lower addresses than the return address on the stack (directly above it in a diagram where the stack grows down). A buffer is filled from lower addresses toward higher ones, so writing past its end moves toward the return address. This is useful for an attacker performing a buffer overflow, as the memory that gets overwritten includes the pointer to the next instruction to be executed after the function returns.
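The effect can be sketched with a simplified model of a stack frame. The code below does not touch the real stack: it represents a frame as a plain byte array with an 8-byte buffer followed by an 8-byte saved return address. The sizes, the names (Frame, unchecked_copy and so on) and the layout are assumptions chosen for illustration, since real layouts vary by compiler and platform.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified frame model: an 8-byte local buffer at the lower offsets,
// followed by an 8-byte saved return address. Sizes are illustrative.
constexpr std::size_t kBufferSize = 8;
constexpr std::size_t kFrameSize = kBufferSize + sizeof(std::uint64_t);
using Frame = std::array<unsigned char, kFrameSize>;

// Stores a return address in the slot that follows the buffer.
void set_return_address(Frame &frame, std::uint64_t address) {
    std::memcpy(frame.data() + kBufferSize, &address, sizeof(address));
}

// Reads the return address back out of the frame.
std::uint64_t get_return_address(const Frame &frame) {
    std::uint64_t address = 0;
    std::memcpy(&address, frame.data() + kBufferSize, sizeof(address));
    return address;
}

// Models a copy that ignores the 8-byte buffer limit. It is clamped to the
// frame size only so that the demonstration itself stays memory-safe.
void unchecked_copy(Frame &frame, const unsigned char *input, std::size_t length) {
    std::size_t count = length < kFrameSize ? length : kFrameSize;
    std::memcpy(frame.data(), input, count);
}
```

In this model, a 4-byte input leaves the stored return address untouched, while 16 bytes of 0x41 ('A') replace it with 0x4141414141414141, a value chosen entirely by whoever supplied the input.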
Return-oriented programming (ROP)
The fact that an attacker can overwrite the return address of a function on the stack is the basis for return-oriented programming (ROP). In ROP, the attacker exploits a buffer overflow so that the vulnerable function returns to an area of memory that the attacker controls.
This area can be the same buffer overflowed during the attack or another area controlled by the user. If successful, the attacker may be able to convince the application to interpret the provided input as program instructions, allowing the attacker to execute malicious shellcode.
One of the main challenges of ROP is developing code that does what the attacker wants in a limited space. Because of this, shellcode normally tries to call library functions that are already loaded in the process’s memory space, which shortens the code the attacker has to supply. Some mitigations against ROP focus on making these functions more difficult for shellcode to find and run.
Buffer overflow mitigation
A buffer overflow exploit can be a serious security threat because the code the attacker runs executes with the same privileges as the exploited application. However, there are several means of preventing or mitigating buffer overflow attacks.
The primary goal of a buffer overflow exploit is to allow an attacker to execute arbitrary code through return-oriented programming. Several different solutions have been implemented to help protect against ROP.
Stack canaries
To carry out a ROP attack on the stack, an attacker must be able to overwrite the function’s return address so that it points to an area of memory under their control. The stack canary concept was invented to help detect and prevent this.
A stack canary is a value known to the program that is placed between the local variables and the return address on the stack. An overflow that reaches the return address has to overwrite the canary first. Before the function returns, the value of the canary is checked, and if it is not correct, an error is raised (indicating that a buffer overflow has occurred).
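The check can be sketched with a simplified frame model rather than a real stack. In the code below a frame is a byte array laid out as buffer, then canary, then return address. The sizes, the fixed canary value used in the example and all names are assumptions for illustration. Real compilers insert this logic automatically and pick the canary at run time.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified frame model: [ buffer (8) | canary (8) | return address (8) ].
// Sizes are illustrative; real layouts depend on compiler and platform.
constexpr std::size_t kBufSize = 8;
constexpr std::size_t kCanaryOffset = kBufSize;
constexpr std::size_t kGuardedFrameSize = kBufSize + 2 * sizeof(std::uint64_t);
using GuardedFrame = std::array<unsigned char, kGuardedFrameSize>;

// Function entry: place the canary between the buffer and the return address.
void place_canary(GuardedFrame &frame, std::uint64_t canary) {
    std::memcpy(frame.data() + kCanaryOffset, &canary, sizeof(canary));
}

// Function exit: returns true if the canary still holds the expected value.
bool canary_intact(const GuardedFrame &frame, std::uint64_t expected) {
    std::uint64_t current = 0;
    std::memcpy(&current, frame.data() + kCanaryOffset, sizeof(current));
    return current == expected;
}

// Models an unbounded write into the buffer. It is clamped to the frame
// size only so that the demonstration itself stays memory-safe.
void write_input(GuardedFrame &frame, const unsigned char *input, std::size_t length) {
    std::size_t count = length < kGuardedFrameSize ? length : kGuardedFrameSize;
    std::memcpy(frame.data(), input, count);
}
```

With a canary of, say, 0x1122334455667788, an 8-byte input passes the check, while a 24-byte input that runs through to the return address slot changes the canary and fails it.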
Data execution prevention
Return-oriented programming relies on user input, which the program intends to treat as data, being interpreted as code. This is possible because data and control information are often interwoven in the stack without clear boundaries.
Data execution prevention (DEP) marks certain areas of memory as non-executable. This helps protect against buffer overflows, because even if an attacker can modify the return address to point to their shellcode, the processor will refuse to run it. However, data execution prevention can be circumvented by a return-to-libc attack, which reuses code that is already executable, so address space layout randomization (ASLR) is also needed.
Address space layout randomization
Most applications are designed to be modular, making heavy use of shared libraries that they load into their memory space. While the functions in these shared libraries are useful for legitimate code, they are also useful for ROP.
Address space layout randomization (ASLR) is designed to make it harder for an attacker to find the library functions they need. Instead of loading a library at the same fixed address in every process, ASLR randomly determines where a particular library will be loaded. This makes ROP difficult because an attacker needs to find the library in memory before being able to use its functions.
Buffer overflow attacks become possible when an attacker can write more data to a block of memory than the application has allocated for that data. This can happen for many reasons, but the most common is the use of unbounded reads, which keep reading until a null terminator is found in the input. By using fixed-length reads sized to fit the allocated buffer, an application can rule out this cause of buffer overflows.
Integer overflow check
A buffer overflow can also be enabled by an integer overflow vulnerability. This occurs when a value, such as an input length, is larger than the variable that stores it can hold, so the most significant bits that do not fit are lost. As a result, a very large input can be recorded as having a much shorter length, causing the allocated buffer to be undersized.
Checking for integer overflow in input lengths is important to protect against buffer overflow attacks.
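Two such checks are sketched below. Both helper names are hypothetical. The first rejects an element count whose total size would wrap around size_t, and the second refuses to store a length in a 16-bit field if it does not fit. The 16-bit width is only an assumption for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: computes count * element_size into *total.
// Returns false if the multiplication would wrap around size_t.
bool checked_buffer_size(std::size_t count, std::size_t element_size,
                         std::size_t *total) {
    if (total == nullptr) {
        return false;
    }
    if (element_size != 0 &&
        count > std::numeric_limits<std::size_t>::max() / element_size) {
        return false;  // the product does not fit in size_t
    }
    *total = count * element_size;
    return true;
}

// Hypothetical helper: stores `length` in a 16-bit field only if it fits.
bool to_length16(std::size_t length, std::uint16_t *out) {
    if (out == nullptr ||
        length > std::numeric_limits<std::uint16_t>::max()) {
        return false;  // would lose the most significant bits
    }
    *out = static_cast<std::uint16_t>(length);
    return true;
}
```

Without the second check, a length of 65546 converted to a 16-bit unsigned value becomes 10, so a buffer sized from that field would be far too small for the actual input.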
While C++ allows the developer to manually allocate memory for user input, that doesn’t mean it’s always a good idea. The C++ standard library (often still called the STL) has classes, like std::string, that handle memory management properly behind the scenes. Switching from C-strings to std::string is an easy way to mitigate the threat of a buffer overflow vulnerability.
Conclusion: Buffer Overflow for Ethical Hacking
A buffer overflow is a simple vulnerability that is easy to exploit and easy to fix. Even today, however, software still ships with exploitable buffer overflow vulnerabilities. In October 2018, a buffer overflow vulnerability was discovered in WhatsApp that could be exploited as soon as a user answered a malicious voice or video call.
These vulnerabilities are definitely worth checking when doing an ethical hack and should be patched in the code as soon as possible.
- Function Calls, Part 3 (Frame Pointer and Local Variables), Codeguru
- Getting around non-executable stack (and fix), SecLists
- CVE-2019-3568 Detail, NIST