Security fundamentals: buffer overflow

Ilya Markevich — Mon, 01 Apr 2024 22:04:38 +0000

Preface
Basic concepts
- RAM and registers
- Stack
- Program decompilation
Buffer overflow
Buffer overflow sample
Protection against buffer overflow
- Safe code
- High-level languages
- Stack canaries
- Non-executable memory
- Address space layout randomization (ASLR)
Conclusion
References

Preface

Buffer overflow is one of the oldest and best-documented security vulnerabilities. Despite that, it still occurs quite often, not only in old legacy code but also in relatively new products.

Studying the attack was interesting: while exploring how buffer overflow works, I had to dive into some core concepts of how a computer works (which I will also try to explain in this article).

I hope that after reading the article you will have a basic understanding of buffer overflow: how it works, why it's dangerous, and what the defense mechanisms are.

P.S. For code samples, I will be using C with the gcc compiler, and Ghidra for program decompilation. I'm on Linux 6.2.0-36-generic with an Intel Core i7-7700HQ (x64). Keep in mind that with a different processor or operating system, the compilation/decompilation results might differ. And if you want to experiment, I recommend using a VM so you don't accidentally break your system.

Ok, let's start!

Basic concepts

Before studying the attack, let's go through some basic concepts that are essential for its understanding.

RAM and registers

To understand how a buffer overflow attack works, we need a basic understanding of how a program is represented in RAM and how the CPU executes program instructions.

To put it simply, RAM is just a sequence of bytes, where each byte can be referenced by its address. Addresses are usually written as hexadecimal numbers.

Registers are small, very fast storage locations inside the CPU that hold information important for program execution. The CPU can access registers much faster than RAM. We will explore some registers later in the article.

The CPU sequentially reads and executes program instructions from RAM. Registers are used to quickly access some data - for example, the address of the next instruction to execute (the RIP register on x64 processors).

Each process in the operating system has its own memory, split into 4 segments:

Code segment - stores the executable code. The instruction pointer register (RIP) points to an instruction in this segment.
Globals/static variables segment - stores globally declared and static variables, so all parts of the program can reuse them by referencing a single address in memory.
Heap segment - used for memory allocated dynamically during program execution (for example, by calling the malloc function in C). The segment is controlled by the programmer, so it's the programmer's responsibility to release memory when it's no longer needed (by calling the free function in C). The segment grows up, meaning new objects are allocated at higher addresses.
Stack segment - stores the program's stack. It's managed by compiler-generated code, so the programmer doesn't have to clean up the memory on function exit. The segment grows down, meaning new stack frames are allocated at lower addresses.

The stack is the most important part in understanding how a buffer overflow attack works, so it's worth discussing further.

Stack

A stack is an abstract data type whose main property is that the last item placed on it is the first one removed. This property is usually implemented with PUSH (adding data to the stack) and POP (removing data from the stack) operations.

How is it used in the context of a program? Each function call creates a new stack frame, which stores local variables, parameters passed to the function (typically on 32-bit x86; on x64, the first few parameters are passed in registers and only the remaining ones go on the stack), and the return address - the address of the instruction to execute after the function returns. Stack frames are placed on top of each other.

Two dedicated registers are used by the CPU to work with the stack:

RSP register contains the address of the top of the stack. This address is also known as the stack pointer.
RBP register contains some fixed address inside the active stack frame. This address is also known as the base pointer and is used for referencing local variables and function parameters inside the stack frame.

These are the x64 register names; on 32-bit x86, the equivalents are ESP and EBP.

Program decompilation

Let's decompile a simple C program to show how the core concepts described above work together. To do that, I will use Ghidra.

Here's the program code:

// hello.c
#include <stdio.h>

int multiply(int num1, int num2) {
 return num1 * num2;
}

int doubleNum(int number) {
 int base = 2;
 int result = multiply(base, number);

 return result;
}

void main() {
 int num = 5;
 int result = doubleNum(num);

 printf("%d\n", result);
}

I compiled the program with gcc hello.c -fno-stack-protector -o hello (more about the -fno-stack-protector flag later) and loaded the compiled program into Ghidra.

Let's set a breakpoint before entering the doubleNum function and see what memory/stack/registers look like.

There are 3 windows shown in Ghidra (I put some tags for easier referencing):

Window 1 shows the compiled listing of the program.
Window 2 shows how the program instructions are represented in memory.
Window 3 shows the state of the registers.

The program is paused on 1a (2d in the compiled window), which is the call of the doubleNum function.

From the picture we can see interesting details:

RSP register contains the address of the top of the stack (stack pointer). The address is 7fffffffdc80.
RBP register contains the address of the base pointer, representing a fixed address in the current stack frame. The address is 7fffffffdc90.
Comparing RSP and RBP values, we can see that the stack grows towards lower addresses (the address of the top of the stack 7fffffffdc80 is lower than the base pointer address 7fffffffdc90).
RIP register contains the address of the next instruction to execute (5555555551a5). This matches the address of the instruction we are paused on (2d from the screenshot).
Note the difference between instruction and stack addresses - stack addresses start with 7ffff while instruction addresses start with 55555. That's because of the process memory organization described previously.
At the start of the main function, 16 bytes are allocated for the active stack frame (2a instruction). Then, the local variable num is written at the address RBP - 8 (7fffffffdc88).
From the instruction 2c we can see that the parameter for doubleNum function is passed through the register EDI.

Here's the stack frame for the main function:

Let's jump into the doubleNum function call. Here's the state of the program after entering the function:

We can see that the RIP value has changed to the address of the 2a instruction. More importantly, the RSP value has also changed. This happened because the CALL instruction pushes the return address onto the stack (the address the program should jump to when the doubleNum function finishes).

Here's the stack after entering the doubleNum function:

Let's put a breakpoint before the call to the multiply function and check the program state:

Instructions a, b and c are called the function prologue. They prepare a new stack frame, so let's break them down:

PUSH RBP - pushes the previous base pointer onto the stack. It's used later to restore the caller's base pointer when the active function finishes.
MOV RBP,RSP - sets the base pointer for the new stack frame.
SUB RSP,0x18 - allocates 24 bytes in the new stack frame.

Instructions d and e store values on the stack - instruction d stores the function argument, e stores the local variable base. Then instructions f and g write the multiply function parameters to the ESI and EDI registers.

Here's what the stack looks like after all these instructions are executed:

Let's put a breakpoint in the multiply function. I put it at the end because its instructions are similar to those of doubleNum. Here's the program state:

You can see that the stack pointer has shifted again. Now the stack contains the old RBP value for the doubleNum function. Note that there's no SUB RSP, N instruction. That's most likely because multiply is a leaf function (it calls no other functions), and on x64 Linux a leaf function may use the 128-byte red zone below the stack pointer without reserving space explicitly.
Also, the result of the multiplication is placed in the EAX register, meaning the return value is passed through a register.

Here's the stack after execution of all the instructions:

The last 2 instructions of the function, POP RBP and RET, are called the function epilogue. Let's break them down:

POP RBP - takes the value from the top of the stack and puts it into the RBP register. In our example, the top of the stack holds the base pointer of the doubleNum function (7fffffffdc70). The stack pointer moves to 7fffffffdc50 (as POP removes a value from the stack, the stack pointer shifts as well).
RET - takes the address from the top of the stack and puts it into the RIP register, meaning the program jumps to that address. The top of the stack holds the return address of the multiply function, which is the instruction in doubleNum right after the call to multiply. The stack pointer moves to 7fffffffdc58 (under the hood, RET effectively executes POP RIP).

Here's what the stack looks like after these instructions are executed:

Next, the epilogue of the doubleNum function is executed, which removes the doubleNum stack frame. Then the epilogue of the main function removes the last stack frame, and the program exits.

I hope you now have a basic understanding of how a program is organized in memory and how instructions are executed. With this knowledge, we can dive deeper into the buffer overflow attack.

Buffer overflow

Imagine a C program that declares a fixed-length buffer either on the stack (e.g. char buffer[10];) or on the heap (char* buffer = malloc(10)). A buffer overflow happens when an attacker is able to put more data into the buffer than it can hold. Depending on where the buffer is placed, the attack can target either the stack or the heap.

If the overflow happens on the heap, it can overwrite other data located near the buffer, such as adjacent heap objects or the allocator's bookkeeping data.

If the overflow happens on the stack, there are multiple attack options:

the simplest is overflowing the buffer with arbitrary data. This will most likely crash the program, because the function's return address gets overwritten with a value that points to unmapped memory or invalid instructions.

overwrite local variable(s) (or a function pointer) to change the program's behaviour.

overwrite the return address of the function with an address containing specific instructions. The simplest implementation of this attack changes the return address to the address of the overflowed buffer and places malicious code in the buffer itself.

Let's look at buffer overflow in detail with a code sample.

Buffer overflow sample

To explore the attack, I will use the following code sample written in C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void printGreetings(char nameInput[]) {
 char name[10];

 strcpy(name, nameInput);
 printf("Hello, %s\n", name);
}

void askQuestion() {
 printf("How are you?\n");
}

void finish() {
 exit(0);
}

void main() {
 char name[] = ""; // input

 printGreetings(name);
 askQuestion();
 finish();
}

I compiled it with gcc -fno-stack-protector hello.c -o hello and disabled ASLR protection (more about the protection later) by running echo 0 > /proc/sys/kernel/randomize_va_space as root.

The program is simple: it takes a name, prints greetings, prints a question and exits. For simplicity, I will set the name directly in the program.

Let's start with setting the name to "Ilya". Here's the program output:

Hello, Ilya
How are you?

Ok, it seems the program works. Let's try to hack it with a buffer overflow!

As you might have already noticed, the potential issue is in the line strcpy(name, nameInput);. Since strcpy doesn't check the bounds of the name buffer, we can pass nameInput of any size and it will overflow the name buffer.

The simplest way to execute a buffer overflow attack is to pass an arbitrarily long name, for example aaaaaaaaaaaaaaaaaaaaaaaaaaa. After trying the value, I get:

Hello, aaaaaaaaaaaaaaaaaaaaaaaaaaa
Segmentation fault

This is exactly the case of buffer overflow with a randomly overwritten return address:

The program crashes after the printGreetings function executes because it tries to jump to a memory address that doesn't contain valid instructions.

Ok, this was simple. Let's try something more interesting.
How about changing the program's flow? We can overwrite the return address and execute real instructions instead of jumping to some random address.

Changing the return address is not easy, and today's processors and operating systems have many defense mechanisms in place that make it difficult to guess the return address and execute malicious instructions. However, it's still possible to overcome the defenses.

For simplicity, I will find the needed address in Ghidra. Let's decompile the program and check the state of memory before calling strcpy(name, nameInput):

The stack frame has a standard structure. From lower to higher memory addresses:

local variables: name[10], 10 bytes.
base pointer: 8 bytes, of which only the lower 6 are non-zero.
return address: 8 bytes, of which only the lower 6 are non-zero.

Also, note that the base pointer and return address values are written from the least significant byte to the most significant one because x64 is little-endian (for example, the base pointer value 7fffffffdc90 is stored in memory as 90dcffffff7f).

What we want is to overwrite the return address of the function using the name[10] buffer. That means we have to construct an input that overwrites all the bytes from the start of name[10], through the base pointer, up to and including the return address, where the return address bytes must form the real address we want to jump to.

For our exploit, let's jump to the call of the finish function, skipping the askQuestion call. Let's find the address of the finish call instruction using Ghidra:

The address is 55555555526b. Now we have all the information to construct the malicious input: 10 arbitrary bytes to fill name[10] + 8 arbitrary bytes to overwrite the base pointer + 6 bytes of the return address written in little-endian order. Six bytes are enough because the two most significant bytes of the saved return address are already zero, and strcpy stops at the first zero byte anyway. Here's what the resulting input looks like: 111111111122222222\x6b\x52\x55\x55\x55\x55.

Let's try the input:

Hello, 111111111122222222kRUUUU

It works! By exploiting the buffer overflow, we were able to change the program flow (skip the askQuestion function call) and exit without any errors.

That was a very simplified version of buffer overflow, but I hope you can now see how dangerous the attack can be - an attacker can potentially execute any code, for example run commands as the root/admin user.

Let's explore what defense mechanisms are used to prevent the attack.

Protection against buffer overflow

Safe code

Some C functions don't perform bounds checking on buffers, so buffer overflow attacks can target them (strcat, strcpy, gets etc.). It's recommended to use bounded equivalents instead (strncat, strncpy, fgets etc.), keeping in mind that they have pitfalls of their own: strncpy, for example, does not null-terminate the destination when the source doesn't fit. Some toolchains also warn when the unsafe functions are used (for example, gcc warns about gets). These warnings shouldn't be ignored and should be fixed as soon as possible.

Although there are static code analyzers that can indicate potentially insecure code, unfortunately it's not possible to completely eliminate buffer overflow attacks with this defense. There is legacy code and libraries that programmers can't control, as well as runtime code that is impossible to analyze.

High-level languages

As you could see previously, buffer overflow attacks mostly target C/C++ code because these languages expose low-level memory management without bounds checks.
Using high-level languages with bounds-checked memory access, such as Java/C#, can prevent such attacks. However, a lot of software (including operating systems and web servers) is written in C/C++ for performance and other reasons.

Stack canaries

Stack canaries are a technique used by compilers to prevent buffer overflow attacks. The idea is simple: on function entry, the compiler-generated code places a random value (the canary) on the stack between the local variables and the return address. Before the function returns, the generated code checks whether the canary value has changed. If it has (likely due to a buffer overflow), the program is aborted without jumping to the return address.

If I compile the buffer overflow example without the -fno-stack-protector parameter, the overflow is detected at run time and the program is aborted:

Hello, 000000000011111111kRUUUU
*** stack smashing detected ***: terminated
Aborted

The technique is simple yet effective, making the buffer overflow attack difficult to execute. However, it's not a silver bullet. The canary value can be brute-forced if it remains unchanged between attempts. Although this sounds like a rare case, it does occur in practice. For example, a process created with the fork call (Linux) inherits the canary value from its parent process. This is the case for some web server implementations: when a worker process crashes, the main web server process forks a new worker with the same canary value. An attacker can then brute-force the canary byte by byte, using worker crashes as an indicator that the guessed byte is incorrect.

Stack canaries are enabled by default in the gcc builds shipped by many Linux distributions today (for example, Ubuntu).

Non-executable memory

The basic buffer overflow attack overwrites the function's return address with the address of the overflowed buffer, which contains malicious instructions. Non-executable memory prevents these instructions from running by marking the stack memory segment as non-executable.

While this defense prevents classical buffer overflow attacks, it can still be bypassed. Sometimes executable memory is required to generate code on the fly, for example with JIT compilation, and an exploit can potentially be injected during this phase. Another example is the ROP (return-oriented programming) attack, which chains instructions that already exist in executable memory outside the stack segment.

A non-executable stack is the default for programs built with gcc today. You can disable it with the -z execstack parameter during compilation (do it only for study purposes).

Address space layout randomization (ASLR)

To exploit a buffer overflow, the attacker needs to know the exact address of the instructions to jump to. ASLR makes this hard by randomizing the stack/heap/code addresses, so the addresses are different every time the process starts.

ASLR is enabled by default on all modern operating systems (Windows/Linux/Mac). On Linux, the setting can be found in /proc/sys/kernel/randomize_va_space (the default value is 2, meaning full randomization). With this defense in place, the buffer overflow sample I used will fail with a segmentation fault because the return address hardcoded in the program will no longer be valid.

Conclusion

I tried to show why buffer overflow is still dangerous, the problems it can create and the defense mechanisms that operating systems and compilers use to prevent the attack.

I hope you found the article interesting. Thank you for reading!
