This article covers the basics of shellcode analysis: what shellcode is, what it is used for, and the patterns it commonly follows. Please note that writing shellcode is beyond the scope of this article.
Introduction [Shellcode Analysis]:
Shellcode is a sequence of bytes that represents assembly instructions. Please note that the bytes are not the assembly mnemonics themselves, just another way of writing them. For example, \x90 is the hexadecimal representation of the “nop” instruction. Shellcode and malware have a long shared history, and historically shellcode was used to spawn a shell on an infected system. Over time, however, the capabilities of shellcode have grown drastically, and malware authors have found new ways to obfuscate code and increase impact by including shellcode. A basic shellcode example is as below, which targeted a popular IRC client. The shellcode below is a hexadecimal representation of the assembly instructions. More details about this exploit can be found here.
Since shellcode usually has only a very small region of memory to fit into and run from, its main purpose is very often just to download another, fully functional malware component. Shellcode thus frequently acts as the stage that drops or downloads the next malicious component in the infection chain. Shellcode also needs to learn about its environment, such as which Windows API calls it can make.
Fortunately, we don’t need to pick up an Intel (or other architecture) manual and map each value to its instruction by hand, because debuggers and disassemblers know how to interpret these opcodes. For example, below is the above shellcode displayed in radare2:
As mentioned above, shellcode has a very small buffer to live in, and it also needs to know about its environment, including the addresses of its own data and variables, in order to use them. But how can an attacker know where the shellcode will be located in the memory of the target process?
Shellcode can determine its own address from the EIP register, because EIP (the instruction pointer) holds the address of the next instruction to be executed. That raises another question: how can the shellcode use this register value when EIP cannot be read directly? Shellcode developers can use techniques like:
E8 00000000 CALL $+5 ; zero displacement, so the target is the next instruction
58 POP EAX
This instruction pair issues a CALL with a displacement of zero, so the call target is simply the next instruction. What does that achieve? When the CALL executes, it pushes the return address (the value of EIP, which points at the instruction after the CALL) onto the top of the stack. The POP then reads the top of the stack (the saved EIP in this case) and places it in EAX.
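Because the byte sequence is so fixed, spotting it in a buffer takes very little code. The following is a minimal sketch, assuming we only care about the exact form E8 00000000 followed directly by a one-byte POP r32 (opcodes 58 to 5F); the function name find_call_pop is our own, not part of any tool.

```python
import re

# E8 00 00 00 00 is CALL with a zero displacement; 58..5F are POP EAX..POP EDI.
GETPC_CALL_POP = re.compile(rb"\xE8\x00\x00\x00\x00[\x58-\x5F]", re.DOTALL)

# Register encoded in the low three bits of the one-byte POP opcode.
POP_REGISTERS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def find_call_pop(buf):
    """Return (offset, register) for each CALL $+5 / POP r32 pair in buf."""
    hits = []
    for match in GETPC_CALL_POP.finditer(buf):
        pop_opcode = match.group()[-1]
        hits.append((match.start(), POP_REGISTERS[pop_opcode - 0x58]))
    return hits


# Two NOPs, the CALL/POP EAX pair, then another NOP.
sample = bytes.fromhex("9090e8000000005890")
print(find_call_pop(sample))  # [(2, 'EAX')]
```

A raw byte match like this can also fire on data that merely happens to contain the same bytes, so a hit is a lead to confirm in a disassembler rather than proof.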
This pattern of a CALL immediately followed by a POP is easy for an anti-malware solution to detect, so shellcode developers usually modify the logic above by adding extra instructions. For example, look at the instructions below:
00000017 EB 03 JMP SHORT 0000001C
00000019 5E POP ESI
0000001A EB 05 JMP SHORT 00000021
0000001C E8 F8FFFFFF CALL 00000019
00000021 ADD ESI,3
The first instruction is a JMP SHORT with opcode EB, followed by an operand giving the relative byte offset to add to EIP. Which instruction do you think executes next? The one at 00000019? No, it is the instruction at 0000001C, because EIP already holds 00000019 (the address right after the two-byte JMP) and adding 3 bytes brings it to 0000001C. The instruction at 0000001C calls 00000019, but before that it pushes EIP, which now points at 00000021, onto the top of the stack. The instruction at 00000019 pops the top of the stack (the saved EIP) into the ESI register. The next instruction, at 0000001A, then jumps to 00000021, where the shellcode continues to execute.
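The address arithmetic in this walkthrough can be checked with a few lines of code. This is a small sketch, with rel_target as a helper name of our own; the CALL displacement of -8 (bytes F8 FF FF FF) is inferred from the addresses in the listing, since a five-byte CALL at 0000001C that lands on 00000019 leaves no other value.

```python
def rel_target(address, length, displacement, width):
    """Target of a relative JMP/CALL.

    address      -- address of the instruction itself
    length       -- total instruction length in bytes
    displacement -- raw displacement value as encoded
    width        -- displacement size in bytes (1 for rel8, 4 for rel32)
    """
    sign_bit = 1 << (width * 8 - 1)
    signed = (displacement ^ sign_bit) - sign_bit  # sign-extend the raw value
    # Relative branches are measured from the address of the next instruction.
    return (address + length + signed) & 0xFFFFFFFF


print(hex(rel_target(0x17, 2, 0x03, 1)))        # JMP SHORT at 0x17 -> 0x1c
print(hex(rel_target(0x1A, 2, 0x05, 1)))        # JMP SHORT at 0x1A -> 0x21
print(hex(rel_target(0x1C, 5, 0xFFFFFFF8, 4)))  # CALL at 0x1C      -> 0x19
```

The value pushed by the CALL is the address of the instruction after it, which is 0x1C + 5 = 0x21 relative to the start of the listing, and that is what ends up in ESI.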
In other scenarios, the first-stage shellcode has to search for the second-stage shellcode. This technique is called egg hunting. It is useful to attackers when the buffer available for the initial shellcode is very small and can only hold enough code to locate second-stage code that was placed in a larger buffer elsewhere.
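The search itself is easy to picture with a simulation over a byte buffer. The sketch below assumes a made-up 4-byte tag (w00t) and assumes the tag is written twice in a row in front of the second stage, a common convention so that the hunter does not match the single copy of the tag it carries itself. A real egg hunter also has to skip unreadable memory pages, which this sketch does not model.

```python
EGG = b"w00t"  # hypothetical 4-byte tag, chosen only for illustration


def hunt_egg(memory, egg=EGG):
    """Return the offset just past the doubled egg, or -1 if it is absent."""
    marker = egg * 2
    position = memory.find(marker)
    return -1 if position < 0 else position + len(marker)


# Simulated memory: a lone copy of the tag (as the hunter itself would carry),
# then the doubled tag followed by a placeholder for the second stage.
memory = b"\x00" * 16 + EGG + b"\x00" * 8 + EGG * 2 + b"STAGE2"
start = hunt_egg(memory)
print(start, memory[start:])  # 36 b'STAGE2'
```

The same idea works in reverse for an analyst: once the tag is recovered from the first stage, searching a memory dump for the doubled tag locates the second stage.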
During execution, shellcode needs certain Windows APIs to do its work, but the libraries that export them are not necessarily loaded into memory. Fortunately for the shellcode, one common DLL, kernel32.dll, is usually loaded, and the shellcode can use its LoadLibraryA to load other libraries and GetProcAddress to find the address of an exported function. But how does the shellcode find kernel32.dll in the first place? The answer lies in a structure called the Process Environment Block (PEB), which holds details about a process, including information about its loaded DLLs. That leads to the next question: how do we find the PEB? I hope the way we keep backtracking to find kernel32.dll is not confusing. On 32-bit Windows, the FS segment register points to the Thread Information Block (TIB), and the TIB holds a pointer to the PEB at offset 0x30. This is the address we were looking for. To get the PEB, all we have to do is load FS:[0x30] into a register. Once the instruction below executes, EAX holds the address of the PEB, which can then be parsed to find kernel32.dll. The instruction can be as simple as:
MOV EAX,DWORD PTR FS:[30h]
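To make the pointer chase concrete, here is a rough simulation in a flat byte array. All base addresses are invented for the demo. The 0x30 offset comes from the text above; the 0x0C offset of the loader data pointer (PEB.Ldr) is taken from the 32-bit PEB layout and is our addition, included only to show where parsing continues.

```python
import struct


def read_dword(memory, address):
    """Read a little-endian 32-bit value, as MOV r32, DWORD PTR [address] would."""
    return struct.unpack_from("<I", memory, address)[0]


# Hypothetical addresses inside a simulated address space (not real values).
TIB_BASE = 0x100  # what FS points at in this simulation
PEB_BASE = 0x200
LDR_BASE = 0x300

memory = bytearray(0x400)
struct.pack_into("<I", memory, TIB_BASE + 0x30, PEB_BASE)  # TIB+0x30 -> PEB
struct.pack_into("<I", memory, PEB_BASE + 0x0C, LDR_BASE)  # PEB+0x0C -> Ldr (32-bit layout)

peb = read_dword(memory, TIB_BASE + 0x30)  # MOV EAX, DWORD PTR FS:[30h]
ldr = read_dword(memory, peb + 0x0C)       # next hop toward the loaded-module lists
print(hex(peb), hex(ldr))  # 0x200 0x300
```

In a debugger, seeing a read from FS:[30h] followed by reads at small fixed offsets from the result is a strong hint that the code is walking the PEB to enumerate loaded modules.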
Once kernel32.dll is found, the shellcode uses GetProcAddress to find the address of each API function it needs. There is also a way to resolve APIs without calling GetProcAddress: parse the DLL’s export directory table and compare the ASCII names of the exported functions against the ones the shellcode needs (usually a hash of the name is compared instead, to save even more buffer space).
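The hash comparison can be sketched as follows. The article does not say which hash is used, so this example assumes the rotate-right-by-13 additive hash that is often seen in Windows shellcode. The export list is a stand-in for names read from a real export directory, and the function names are our own.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Assumed scheme: rotate the running hash right by 13, then add the next byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def resolve(target_hash, export_names):
    """Return the first export whose hash matches, or None."""
    for name in export_names:
        if ror13_hash(name) == target_hash:
            return name
    return None


# Stand-in for the names found in kernel32.dll's export directory.
exports = ["ExitProcess", "GetProcAddress", "LoadLibraryA"]
wanted = ror13_hash("LoadLibraryA")  # shellcode would embed only this 4-byte value
print(hex(wanted), resolve(wanted, exports))
```

For analysis this is useful in the other direction: hashing every export name of the usual DLLs gives a lookup table that turns the 4-byte constants found in shellcode back into API names.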