CS 105

Code Injection Attacks

In this part of the lab, you will attack the ctarget program using several approaches that trigger buffer overflows. ctarget is set up so that its stack positions are consistent from one run to the next, and so that data on the stack can be treated as executable code. These choices make the program vulnerable to attacks where the exploit strings contain byte encodings of executable code.

Understanding and Exploiting ctarget

The ctarget program reads strings from standard input using the function getbuf:

unsigned getbuf()
{
    char buf[BUFFER_SIZE];
    Gets(buf);
    return 1;
}

The Gets function here is similar to the standard library's gets function: it reads a string from standard input (terminated by \n or end-of-file) and stores it (along with a null terminator) at the specified destination. In this code, you can see that the destination is an array buf, declared as having BUFFER_SIZE bytes. At the time your targets were generated, BUFFER_SIZE was a compile-time constant that is specific to your executable.

See the deeper-dive section below for more!

The Gets() and gets() functions have no way to determine whether their destination buffers are large enough to store the string they read. They simply copy sequences of bytes. If the data being copied is larger than the memory allocated for the destination, they keep writing the remaining data into the adjacent memory, creating a memory overrun, also known as a buffer overflow. If nothing else was using that memory, the program may appear to continue running normally. But if that memory was in use, or if the programmer later allocates memory at the overwritten location (perhaps without zeroing it out), you have memory corruption, which leads to our old frenemy, undefined behavior.
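To make the missing bounds check concrete, here is a minimal sketch in C. It is not the lab's actual source: unbounded_read and bounded_read are hypothetical names, and the sketch only assumes that Gets behaves as described above (copy bytes until \n or end-of-file, then append a null terminator).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for Gets: copies bytes until '\n' or EOF.
 * It is never told how large dest is, so it cannot stop early. */
char *unbounded_read(char *dest, FILE *in)
{
    assert(dest != NULL && in != NULL);
    char *p = dest;
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        *p++ = (char)c;            /* no bounds check: may run past dest */
    *p = '\0';
    return dest;
}

/* Bounded variant: the caller passes the capacity of dest.
 * At most cap - 1 bytes are stored; the rest of the line is discarded.
 * Returns the number of bytes stored, not counting the terminator. */
size_t bounded_read(char *dest, size_t cap, FILE *in)
{
    assert(dest != NULL && in != NULL && cap > 0);
    size_t n = 0;
    int c;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (n + 1 < cap)           /* keep one slot for '\0' */
            dest[n++] = (char)c;
    }
    dest[n] = '\0';
    return n;
}
```

The only difference between the two functions is the extra cap parameter. Without it, the loop has nothing to compare against, which is exactly the situation getbuf is in.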

Looking at our getbuf function, we can see that if the string typed by the user and read by getbuf is sufficiently short, getbuf will return 1, as shown by the following execution example (the value of your program's Cookie field will differ):

./ctarget
Cookie: 0x1a7dd803
Type string:Keep it short!
No exploit.  Getbuf returned 0x1
Normal return

But an error occurs if you type a long string:

./ctarget
Cookie: 0x1a7dd803
Type string:This is not a very interesting string, but it has the property ...
Ouch!: You caused a segmentation fault!
Better luck next time

As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory-access error. Your task is to be more clever about crafting the strings you feed ctarget so that they do more interesting things. The strings that cause such behaviors are called exploit strings.
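One common way an overrun corrupts program state is by overwriting the saved return address that sits above the buffer on the stack. The sketch below simulates that with a plain byte array, so the demonstration itself stays well-defined. Everything in it is an assumption made for illustration: the 8-byte buffer, the absence of padding between the buffer and the return address, the sim_ names, and the address 0x401000. The real layout depends on your BUFFER_SIZE and on how your ctarget was compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: bytes [0, SIM_BUF) are the buffer, and the next
 * 8 bytes hold the saved return address in little-endian order. */
enum { SIM_BUF = 8, SIM_FRAME = SIM_BUF + 8 };

typedef struct {
    unsigned char bytes[SIM_FRAME];
} sim_frame;

/* Clear the buffer and store return_addr just above it. */
void sim_init(sim_frame *f, uint64_t return_addr)
{
    assert(f != NULL);
    memset(f->bytes, 0, SIM_BUF);
    for (int i = 0; i < 8; i++)
        f->bytes[SIM_BUF + i] = (unsigned char)(return_addr >> (8 * i));
}

/* Copy input starting at the buffer with no check against SIM_BUF.
 * The copy is clamped to the simulated frame so the demo is safe. */
void sim_unchecked_write(sim_frame *f, const unsigned char *input, size_t len)
{
    assert(f != NULL && input != NULL);
    if (len > SIM_FRAME)
        len = SIM_FRAME;
    memcpy(f->bytes, input, len);
}

/* Read back the (possibly overwritten) saved return address. */
uint64_t sim_return_addr(const sim_frame *f)
{
    assert(f != NULL);
    uint64_t addr = 0;
    for (int i = 0; i < 8; i++)
        addr |= (uint64_t)f->bytes[SIM_BUF + i] << (8 * i);
    return addr;
}
```

In this simulation, writing 8 bytes or fewer leaves the stored address untouched, while writing 16 bytes of 'A' (0x41) changes it to 0x4141414141414141. A real program that returned to such an address would most likely fault, which matches the segmentation fault shown above.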

Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The (provided) program hex2raw will enable you to generate these raw strings. You'll probably want to keep the documentation for hex2raw open in a browser window for reference.

Steps

Rules for This Assignment

  • You must do the assignment on wilkes.

  • Your solutions may not use attacks to circumvent the validation code in the programs.

    Specifically, any address you incorporate into an attack string for use by a ret instruction should be to one of the following destinations:

    • The addresses for functions touch1, touch2, or touch3.
    • The address of your injected code.
  • Your exploit string must not contain byte value 0x0a (ASCII newline or \n) because the Gets function uses 0x0a to detect the end of a string.

  • hex2raw expects two-digit hex values separated by one or more whitespace characters.

    So if you want to create a byte with a hex value of 0, you must write it as 00. To create the word 0xDEADBEEF you should pass ef be ad de to hex2raw (note the reversal required for little-endian byte ordering).

  • The hex2raw page also contains examples of using shell I/O redirection (< and >) and pipes (|) on the command line.
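Two of the rules above are easy to get wrong by hand: the little-endian reversal and the ban on byte 0x0a. The following C sketch shows one way to check both. The helper names format_le32 and has_newline_byte are hypothetical and are not part of the lab's tools; the output format simply follows the two-digit, space-separated convention described for hex2raw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write a 32-bit word as hex2raw-style text, least significant
 * byte first. out needs 12 bytes: "xx xx xx xx" plus '\0'. */
void format_le32(uint32_t value, char out[12])
{
    assert(out != NULL);
    snprintf(out, 12, "%02x %02x %02x %02x",
             (unsigned)(value & 0xff),
             (unsigned)((value >> 8) & 0xff),
             (unsigned)((value >> 16) & 0xff),
             (unsigned)((value >> 24) & 0xff));
}

/* Return 1 if any byte is 0x0a, which Gets would treat as the
 * end of the string; return 0 otherwise. */
int has_newline_byte(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] == 0x0a)
            return 1;
    }
    return 0;
}
```

For example, format_le32(0xDEADBEEF, text) fills text with ef be ad de, matching the rule above. Running has_newline_byte over the raw bytes of a candidate exploit string flags any 0x0a before you feed the string to ctarget.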

Exploiting Memory Overflows

In most cases, a memory overrun will cause the program to crash, but a clever attacker can use the overrun to inject machine instructions that the program then executes, altering its behavior.

A classic attack of this type takes advantage of memory corruption to cause a targeted program to spawn a shell, giving the attacker the same access as the original program; if that program was, say, running as root, the attacker will have the same powers as root, and be able to do almost anything on the machine.

Many defenses have been developed against these sorts of vulnerabilities. They range from having the operating system segment memory, which limits what memory a particular process can access; to randomizing the address space (ASLR), so attackers can't know where any particular routine resides, making it hard to target; to controlling what resources a process can access and what actions it can take (i.e., “capabilities”; you might be familiar with SELinux).

Despite all these measures, new vulnerabilities are still being found all the time, both in code that has been in production for many years and in newly written code. Although some languages make memory overruns more difficult or impossible to write (e.g., Rust), they typically have (or are seen as having) drawbacks, so potentially less-secure languages (e.g., C, C++) are still used for many projects.

To Complete This Part of the Assignment

You'll know you're done with this part of the assignment when you've done all of the following:
