Linux Buffer Overflows x86 – Part 3 (Shellcoding): Shellcode Basics and Advanced Topics

humphryschoenwette
Aug 19, 2023
9 min read

Several new techniques were explored in this part of the series, focusing mainly on ROP and return-to-libc to overcome the non-executable stack protection. The NX-bit only provides a small amount of security when it comes to exploiting a buffer overflow. It prevents the attacker from executing his or her own instructions written directly into the program's memory space. However, if the instructions are already loaded into the process, this protection is ineffective.

The exploits have shown that this protection is only a minor hindrance to gaining full control of a process via a buffer overflow. The next article of the series, which will focus on Address Space Layout Randomization, will show how ASLR in combination with the NX-bit makes exploitation much harder. The final article in the series will look into StackGuard and the stack canary. By the end of the series the reader should be convinced that buffer overflows can still be exploited on a modern system. Ultimately the programmer is responsible for protecting against these kinds of exploits, and no amount of kernel or compiler protections can make up for faulty code.


Although avoiding NULL bytes is not always mandatory, there are common cases, such as buffer overflows, where the strcpy() function is used. This function copies a string byte by byte and stops when it encounters a NULL byte. So if the shellcode contains a NULL byte, strcpy() stops at that byte, the shellcode arrives incomplete and, as you can guess, it will not work correctly.
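To make the NULL-byte problem concrete, here is a minimal sketch. The helper names first_null_offset and bytes_copied_by_strcpy are hypothetical, and the bytes used with them below are made-up filler, not real shellcode. The first helper finds the earliest NULL byte in a buffer; the second shows how many bytes strcpy() actually delivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first 0x00 byte in buf,
 * or -1 if the buffer contains none. */
static long first_null_offset(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: copy src with strcpy() into a destination that is
 * large enough, and report how many bytes arrived before the terminator.
 * src must contain a 0x00 byte within its first 63 bytes. */
static size_t bytes_copied_by_strcpy(const unsigned char *src)
{
    char dst[64];

    assert(src != NULL);
    assert(strlen((const char *)src) < sizeof dst);
    strcpy(dst, (const char *)src);   /* stops at the first 0x00 in src */
    return strlen(dst);
}
```

For the seven placeholder bytes 0x41 0x41 0x41 0x00 0x41 0x41 0x00, first_null_offset() returns 3 and strcpy() delivers only the first 3 bytes; everything after the embedded NULL byte is lost.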

Attackers have managed to identify buffer overflows in many products and components. A few examples are the PS2 Independence exploit, the Twilight Hack for the Wii, and an iDevice Lock Activation Bypass on the iPad.

In order to understand how buffer overflows work, we need to understand what happens in memory when a program is run. In this example we're using a C program in Linux. However, note that the issue applies to many different languages and operating systems.

NOTE: For the purpose of this article and its examples, I have disabled protective measures, like Address Space Layout Randomization (ASLR), that may interfere with a clear demonstration of the buffer overflow issue. There are ways to bypass these measures, but that's a (more advanced) topic in itself. In this article, I will focus on the core principle of buffer overflows.

I am creating a training on buffer overflows and stack/heap attacks. I am working on an Ubuntu 12.04 x86_64 machine and want to show some sample buggy programs and the ways you could exploit those vulnerabilities.

The era of 32-bit, old-school stack buffer overflows seems to have ended with the introduction of memory randomization (ASLR), canary variables, and 64-bit addresses (which make it harder to avoid bad bytes in shellcode). Yet if we ever want to work in the field of security and ethical hacking, we need to know some of the attacks that were very common in that bygone era. Buffer overflows are among the most important, and they will help you learn to think the way a black hat hacker would. In this scenario, we will take a peek at 64-bit buffer overflows.

As you can see, we created a simple program that just copies the argument passed in argv[1] into the buffer[] character array using the strcpy() function. At first glance, nothing seems to be wrong. Yet, there is one part that was missed:

The length is never checked before copying. Anything passed as an argument in argv[1], irrespective of length, will be copied into buffer[]. We all know what happens when we try to stuff something into a container that is too small for it: just as a full glass of water overflows if you pour more water into it, the buffer[] array will overflow. The water spills onto the surface that supports the glass; likewise, buffer[] overflows into the areas of memory adjacent to it in the same stack frame (here, the frame of main()). By doing so, we may be able to find areas in memory that we can write to simply by overflowing into them, changing the variables in our program or even its flow of execution.
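The missing check can be sketched as follows. This is a minimal illustration, not the article's program: BUF_SIZE and copy_checked are hypothetical names, and the buffer size of 64 is an assumption. Where the vulnerable pattern calls strcpy(buffer, argv[1]) unconditionally, this version first compares the argument's length with the capacity of the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64   /* assumed size; hypothetical value for illustration */

/* Copy arg into dst (capacity dst_size bytes) only if it fits, including
 * the terminating NULL byte. Returns 0 on success, -1 if arg is too long. */
static int copy_checked(char *dst, size_t dst_size, const char *arg)
{
    assert(dst != NULL && arg != NULL);

    size_t needed = strlen(arg) + 1;   /* +1 for the terminator */
    if (needed > dst_size)
        return -1;                     /* would overflow dst: refuse */

    memcpy(dst, arg, needed);
    return 0;
}
```

A caller would declare char buffer[BUF_SIZE] and treat a return value of -1 as an error instead of letting the copy run past the end of the array.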

With all this in mind, if we overflowed buffer[], we could reach a part of memory that holds the address of the code to execute next. As you see in the previous diagram, all variables are above the return address. The return address points to the memory address where execution continues once the function returns. What if we manipulated it to point to a different memory address than intended, perhaps the address of a variable we control? What if we put executable code into buffer[] and, at the same time, overflowed buffer[] to overwrite the return address so that execution jumps to the contents of buffer[]?

0x90 represents a NOP, or no-operation, instruction: nothing happens and execution passes along to the next instruction. 0xcc is the INT3 breakpoint instruction; when it is executed, our program halts with a SIGTRAP. Although shellcode allows us to do many things, such as spawning a root command shell (the application needs SetUID enabled), we are using 0xcc as our shellcode because it is the simplest one-byte instruction for showing how buffer overflows work. As we know, memory addresses sometimes change. To compensate, we create something called a NOP sled: a run of NOP instructions placed in front of the shellcode, so that a return address landing anywhere inside the run still slides into the shellcode.
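The layout described above (NOP sled, then the 0xcc byte, then padding up to the saved return address, then the new return address) can be sketched like this. build_payload is a hypothetical helper, and sled_len, ret_offset and ret_addr are assumptions that would have to be determined for a specific binary, for example in a debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill out with: [NOP sled][0xCC][padding]...[return address].
 * ret_offset is the assumed distance from the start of the buffer to the
 * saved return address. Returns the number of bytes written, or 0 if the
 * requested layout does not fit. */
static size_t build_payload(unsigned char *out, size_t out_size,
                            size_t sled_len, size_t ret_offset,
                            uint64_t ret_addr)
{
    assert(out != NULL);

    size_t total = ret_offset + sizeof ret_addr;
    if (sled_len + 1 > ret_offset || total > out_size)
        return 0;

    memset(out, 0x90, sled_len);                    /* NOP sled */
    out[sled_len] = 0xCC;                           /* INT3 as stand-in shellcode */
    memset(out + sled_len + 1, 0x41,                /* 'A' padding up to the */
           ret_offset - sled_len - 1);              /* saved return address  */
    memcpy(out + ret_offset, &ret_addr,             /* host byte order, which is */
           sizeof ret_addr);                        /* little-endian on x86-64   */
    return total;
}
```

One design point worth noting: a user-space x86-64 address has zero upper bytes, so when such a payload travels through strcpy() the copy ends at the first of those zero bytes. The sketch only builds the bytes in memory and does not deal with that.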

If you are already familiar with basic stack-based buffer overflows under Windows, your TL;DR of this text is: attacking a WINE application feels a lot like attacking a target running on Windows XP SP2, with DEP but without effective ASLR. Many old buffer overflow tricks work as expected, and you can use Linux tools for the exploit development. Also, Linux shellcodes can be used (which are normally a lot smaller than comparable Windows payloads).

We are trying to generate shellcode for use in an exploit, specifically for buffer overflows. Exploits take advantage of oversights in the original programming logic, and many exploitable buffer overflows in programs are related to the C function strcpy. This function copies whatever the programmer specifies to wherever he wants it copied. However, strcpy only does what it is supposed to do, which is to copy data until it reaches a null byte. It never checks whether the destination is large enough, and this is how some buffer overflows occur. Say we have the following code:
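The listing that belongs here is not included in this copy. As a stand-in, here is a minimal sketch of the loop that strcpy() conceptually performs. naive_strcpy is a hypothetical name and a simplification, not the libc implementation; unlike the real function it returns a byte count, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for strcpy(): copy bytes from src to dst until the
 * terminating NULL byte has been copied. Note that there is no parameter
 * for the size of dst, so the loop cannot know when dst is full.
 * Returns the number of bytes written, terminator included. */
static size_t naive_strcpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;
    while ((dst[i] = src[i]) != '\0')
        i++;
    return i + 1;
}
```

The only stop condition is the null byte in the source. If src is longer than dst, the loop keeps writing past the end of dst, which is exactly the overflow described above.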

Introduction

You will do a sequence of labs in 6.5660. These labs will give you practical experience with common attacks and counter-measures. To make the issues concrete, you will explore the attacks and counter-measures in the context of the zoobar web application in the following ways:

Lab 1: you will explore the zoobar web application, and use buffer overflow attacks to break its security properties.
Lab 2: you will improve the zoobar web application by using privilege separation, so that if one component is compromised, the adversary doesn't get control over the whole web application.
Lab 3: you will build a program analysis tool based on symbolic execution to find bugs in Python code such as the zoobar web application.
Lab 4: you will improve the zoobar application against browser attacks.
Lab 5: you will add HTTPS support and security key (WebAuthn) authentication.

Lab 1 will introduce you to buffer overflow vulnerabilities, in the context of a web server called zookws. The zookws web server runs a simple python web application, zoobar, with which users transfer "zoobars" (credits) between each other. You will find buffer overflows in the zookws web server code, write exploits for the buffer overflows to inject code into the server over the network, and figure out how to bypass non-executable stack protection. Later labs look at other security aspects of the zoobar and zookws infrastructure.

Each lab requires you to learn a new programming language or some other piece of infrastructure. For example, in this lab you must become intimately familiar with certain aspects of the C language, x86 assembly language, gdb, etc. Detailed familiarity with many different pieces of infrastructure is needed to understand attacks and defenses in realistic situations: security weaknesses often show up in corner cases, and so you need to understand the details to craft exploits and design defenses for those corner cases. These two factors (new infrastructure and details) can make the labs time consuming.
You should start early on the labs and work on them daily for some limited time (each lab has several exercises), instead of trying to do all exercises in a single shot just before the deadline. Take the time to understand the relevant details. If you get stuck, post a question on Piazza.

Several labs, including this lab, ask you to design exploits. These exploits are realistic enough that you might be able to use them for a real attack, but you should not do so. The point of designing exploits is to teach you how to defend against them, not how to use them: attacking computer systems is illegal (see MIT network rules) and can get you into serious trouble. Don't do it.

NOTE: Since we re-use the same lab assignments across years, we ask that you please do not make your lab code publicly accessible (e.g., by checking your solutions into a public repository on GitHub). This helps keep the labs fair and interesting for students in future years.

Lab infrastructure

Exploiting buffer overflows requires precise control over the execution environment. A small change in the compiler, environment variables, or the way the program is executed can result in slightly different memory layout and code structure, thus requiring a different exploit. For this reason, this lab uses a virtual machine to run the vulnerable web server code.

To start working on this lab assignment, you'll need software that lets you run a virtual machine. For Linux users, we recommend running the course VM on KVM, which is built into the Linux kernel. KVM should be available through your distribution, and is preinstalled on Athena cluster computers; on Debian or Ubuntu, try apt-get install qemu-kvm. KVM requires hardware virtualization support in your CPU, and you must enable this support in your BIOS (which is often, but not always, the default).
If you have another virtual machine monitor installed on your machine (e.g., VMware), that virtual machine monitor may grab the hardware virtualization support exclusively and prevent KVM from working.

On Windows, or Linux without KVM, use VMware Workstation from IS&T. On a Mac, use VMware Fusion.

If you are using a computer with a non-x86 CPU (such as a Mac with an ARM M1 or M2 processor), running the VM locally on your computer may be prohibitively slow, because your VM will have to emulate x86 instructions instead of running them natively. You can instead run the course VM on Amazon's EC2 cloud computing platform; detailed instructions can be found here.

Once you have virtual machine software installed on your machine, you should download the course VM image, and unpack it on your computer. This virtual machine contains an installation of Ubuntu 22.04 Linux.

To start the course VM using VMware, import 6.5660-standalone-v23.vmdk. Go to File > New, select "create a custom virtual machine", choose Linux > Debian 9.x 64-bit, choose Legacy BIOS, and use an existing virtual disk (and select the 6.5660-standalone-v23.vmdk file, choosing the "Take this disk away" option). Finally, click Finish to complete the setup.

To start the VM with KVM, run ./6.5660-standalone-v23.sh from a terminal (Ctrl+A x to force quit). If you get a permission denied error from this script, try adding yourself to the kvm group with sudo gpasswd -a `whoami` kvm, then log out and log back in.

You will use the student account in the VM for your work. The password for the student account is student. You can also get access to the root account in the VM using sudo; for example, you can install new software packages using sudo apt-get install pkgname.

You can either log into the virtual machine using its console, or use ssh to log into the virtual machine over the (virtual) network. The latter also lets you easily copy files into and out of the virtual machine with scp or rsync.
How you access the virtual machine over the network depends on how you're running it. If you're using VMware, you'll first have to find the virtual machine's IP address. To do so, log in on the console, run ip addr show dev eth0, and note the IP address listed beside inet. With kvm, you can use localhost as the IP address for ssh and HTTP. You can now log in with ssh by running the following command from your host machine:

    ssh -p 2222 student@IPADDRESS

For security, SSH does not allow logging in over the network using a password (and, in this specific case, the password is known to everyone). To log in via SSH, you will need to set up an SSH Key.

You may also find it helpful to create a host alias for your 6.5660 VM in your /.ssh/config file, so that you can simply run, for example, ssh 5660vm or scp file.txt 5660vm:lab/file.txt. To do this, add the following lines to your /.ssh/config file, adjusted as needed:

    Host 5660vm
      User student
      HostName localhost
      Port 2222

Getting started

The files you will need for this and subsequent labs are distributed using the Git version control system. You can also use Git to keep track of any changes you make to the initial source code.
Here's an overview of Git and the Git user's manual, which you may find useful.

The course Git repository is available at -pdos/6.5660-lab-2023. To get the lab code, log into the VM using the student account and clone the source code for lab 1 as follows:

    student@65660-v23:$ git clone -pdos/6.5660-lab-2023 lab
    Cloning into 'lab'...
    student@65660-v23:$ cd lab
    student@65660-v23:/lab$

It's important that you clone the course repository into the lab directory, because the length of pathnames will matter in this lab.

Before you proceed with this lab assignment, make sure you can compile the zookws web server:

    student@65660-v23:/lab$ make
    cc zookd.c -c -o zookd.o -m64 -g -std=c99 -Wall -Wno-format-overflow -D_GNU_SOURCE -static -fno-stack-protector
    cc http.c -c -o http.o -m64 -g -std=c99 -Wall -Wno-format-overflow -D_GNU_SOURCE -static -fno-stack-protector
    cc -m64 zookd.o http.o -o zookd
    cc -m64 zookd.o http.o -o zookd-exstack -z execstack
    cc -m64 zookd.o http.o -o zookd-nxstack
    cc zookd.c -c -o zookd-withssp.o -m64 -g -std=c99 -Wall -Wno-format-overflow -D_GNU_SOURCE -static
    cc http.c -c -o http-withssp.o -m64 -g -std=c99 -Wall -Wno-format-overflow -D_GNU_SOURCE -static
    cc -m64 zookd-withssp.o http-withssp.o -o zookd-withssp
    cc -m64 -c -o shellcode.o shellcode.S
    objcopy -S -O binary -j .text shellcode.o shellcode.bin
    cc run-shellcode.c -c -o run-shellcode.o -m64 -g -std=c99 -Wall -Wno-format-overflow -D_GNU_SOURCE -static -fno-stack-protector
    cc -m64 run-shellcode.o -o run-shellcode
    student@65660-v23:/lab$

The component of zookws that receives HTTP requests is zookd. It is written in C and serves static files and executes dynamic scripts. For this lab you don't have to understand the dynamic scripts; they are written in Python and the exploits in this lab apply only to C code. The HTTP-related code is in http.c. Here is a tutorial about the HTTP protocol.
