[2021] CS 351 Systems Programming - The Attack Lab: Understanding Buffer Overflow Bugs

CS 351: Systems Programming
The Attack Lab: Understanding Buffer Overflow Bugs

1 Introduction

This assignment involves generating a total of five attacks on two programs having different security vulnerabilities. Outcomes you will gain from this lab include:

  • You will learn different ways that attackers can exploit security vulnerabilities when programs do not safeguard themselves well enough against buffer overflows.
  • Through this, you will get a better understanding of how to write programs that are more secure, as well as some of the features provided by compilers and operating systems to make programs less vulnerable.
  • You will gain a deeper understanding of the stack and parameter-passing mechanisms of x86-64 machine code.
  • You will gain a deeper understanding of how x86-64 instructions are encoded.
  • You will gain more experience with debugging tools such as GDB and OBJDUMP.

Note: In this lab, you will gain firsthand experience with methods used to exploit security weaknesses in operating systems and network servers. Our purpose is to help you learn about the runtime operation of programs and to understand the nature of these security weaknesses so that you can avoid them when you write system code. We do not condone the use of any other form of attack to gain unauthorized access to any system resources.

You may want to study Sections 3.10.3 and 3.10.4 of the CS:APP3e book as reference material for this lab.

2 Logistics

As usual, this is an individual assignment. You will generate attacks for target programs that are custom generated for you.

2.1 Getting Files

As with the previous lab, start by claiming your repository on GitHub via the invitation on the course website. After accepting it, clone your repository and in it run the command ./gettarget.sh

This will retrieve and unpack your lab files into a directory named targetk containing the files described below. You should only run ./gettarget.sh once. If for some reason you retrieve multiple targets, just choose one to work on and delete the rest.

The files in targetk include:

README.txt: A file describing the contents of the directory

ctarget: An executable program vulnerable to code-injection attacks

rtarget: An executable program vulnerable to return-oriented-programming attacks

cookie.txt: An 8-digit hex code that you will use as a unique identifier in your attacks.

farm.c: The source code of your target’s “gadget farm,” which you will use in generating return-oriented programming attacks.

hex2raw: A utility to generate attack strings.

2.2 Important Points

Here is a summary of some important rules regarding valid solutions for this lab. These points will not make much sense when you read this document for the first time. They are presented here as a central reference of rules once you get started.

  • Your solutions may not use attacks to circumvent the validation code in the programs. Specifically, any address you incorporate into an attack string for use by a ret instruction should be to one of the following destinations:
      – The addresses for functions touch1, touch2, or touch3.
      – The address of your injected code.
      – The address of one of your gadgets from the gadget farm.
  • You may only construct gadgets from file rtarget with addresses ranging between those for functions start_farm and end_farm.

3 Target Programs

Both CTARGET and RTARGET read strings from standard input. They do so with the function getbuf defined below:

    1 unsigned getbuf()
    2 {
    3     char buf[BUFFER_SIZE];
    4     Gets(buf);
    5     return 1;
    6 }

The function Gets is similar to the standard library function gets: it reads a string from standard input (terminated by ‘\n’ or end-of-file) and stores it (along with a null terminator) at the specified destination. In this code, you can see that the destination is an array buf, declared as having BUFFER_SIZE bytes. At the time your targets were generated, BUFFER_SIZE was a compile-time constant specific to your version of the programs.

Functions Gets() and gets() have no way to determine whether their destination buffers are large enough to store the string they read. They simply copy sequences of bytes, possibly overrunning the bounds of the storage allocated at the destinations.
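For contrast, here is a minimal sketch of a bounded read. It is not part of the lab's target programs: the helper name read_line_bounded is hypothetical, and the sketch relies only on the standard fgets, which stores at most cap-1 characters and then a null terminator.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most cap-1 characters of one line into buf.
   The result is always null-terminated. Returns the number of characters
   stored, or -1 if nothing could be read. */
long read_line_bounded(FILE *in, char *buf, size_t cap)
{
    if (cap == 0 || fgets(buf, (int)cap, in) == NULL)
        return -1;
    size_t n = strcspn(buf, "\n");  /* index of the newline, or the length if none */
    buf[n] = '\0';                  /* drop the newline if one was stored */
    return (long)n;
}
```

Unlike gets, the caller states the capacity of the destination, so an overlong line is cut short rather than written past the end of buf.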

If the string typed by the user and read by getbuf is sufficiently short, it is clear that getbuf will return 1, as shown by the following execution examples:

    unix> ./ctarget
    Cookie: 0x1a7dd803
    Type string: Keep it short!
    No exploit. Getbuf returned 0x1
    Normal return

Typically an error occurs if you type a long string:

    unix> ./ctarget
    Cookie: 0x1a7dd803
    Type string: This is not a very interesting string, but it has the property ...
    Ouch!: You caused a segmentation fault!
    Better luck next time

(Note that the value of the cookie shown will differ from yours.) Program RTARGET will have the same behavior. As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed CTARGET and RTARGET so that they do more interesting things. These are called exploit strings.

Both CTARGET and RTARGET take several different command line arguments:

    -h: Print list of possible command line arguments
    -q: Don’t send results to the grading server
    -i FILE: Supply input from a file, rather than from standard input

    Phase  Program  Level  Method  Function  Points
    1      CTARGET  1      CI      touch1    10
    2      CTARGET  2      CI      touch2    15
    3      CTARGET  3      CI      touch3    15
    4      RTARGET  2      ROP     touch2    25
    5      RTARGET  3      ROP     touch3    5

    CI: Code injection
    ROP: Return-oriented programming

Figure 1: Summary of attack lab phases

Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The program HEX2RAW will enable you to generate these raw strings. See Appendix A for more information on how to use HEX2RAW.

Important points:

  • Your exploit string must not contain byte value 0x0a at any intermediate position, since this is the ASCII code for newline (‘\n’). When Gets encounters this byte, it will assume you intended to terminate the string.
  • HEX2RAW expects two-digit hex values separated by one or more white spaces. So if you want to create a byte with a hex value of 0, you need to write it as 00. To create the word 0xdeadbeef you should pass “ef be ad de” to HEX2RAW (note the reversal required for little-endian byte ordering).
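The two points above can be sketched in C. This is only an illustration under stated assumptions: the helper names to_hex2raw_text and has_intermediate_newline are hypothetical, and the first one merely produces text in the two-digit, space-separated form described above; it does not reproduce HEX2RAW itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the low nbytes bytes of value (1 <= nbytes <= 8)
   as two-digit hex values in little-endian order, separated by single spaces.
   out must hold at least 3 * nbytes characters. */
void to_hex2raw_text(uint64_t value, size_t nbytes, char *out)
{
    for (size_t i = 0; i < nbytes; i++) {
        unsigned byte = (unsigned)((value >> (8 * i)) & 0xff);
        snprintf(out + 3 * i, 3, "%02x", byte);          /* two digits plus terminator */
        out[3 * i + 2] = (i + 1 < nbytes) ? ' ' : '\0';  /* separator, or end of text */
    }
}

/* Hypothetical helper: return 1 if any byte other than the last is 0x0a,
   which would make Gets stop reading early; return 0 otherwise. */
int has_intermediate_newline(const unsigned char *bytes, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++)
        if (bytes[i] == 0x0a)
            return 1;
    return 0;
}
```

Calling to_hex2raw_text(0xdeadbeef, 4, out) leaves "ef be ad de" in out, matching the example in the text. An address used by a ret instruction occupies 8 bytes, so it would be written with nbytes set to 8.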

When you have correctly solved one of the levels, your target program will automatically send a notification to the grading server. For example:

    unix> ./hex2raw < ctarget.l2.txt | ./ctarget
    Cookie: 0x1a7dd803
    Type string:Touch2!: You called touch2(0x1a7dd803)
    Valid solution for level 2 with target ctarget
    PASSED: Sent exploit string to server to be validated.
    NICE JOB!

The server will test your exploit string to make sure it really works, and it will update the Attacklab scoreboard page indicating that your userid (listed by your target number for anonymity) has completed this phase.

Unlike the Bomb Lab, there is no penalty for making mistakes in this lab. Feel free to fire away at CTARGET and RTARGET with any strings you like.

IMPORTANT NOTE: You can work on your solution on any Linux machine, but in order to submit your solution, you will need to be on fourier.cs.iit.edu.

Figure 1 summarizes the five phases of the lab. As can be seen, the first three involve code-injection (CI) attacks on CTARGET, while the last two involve return-oriented-programming (ROP) attacks on RTARGET.

4 Part I: Code Injection Attacks

For the first three phases, your exploit strings will attack CTARGET. This program is set up in a way that the stack positions will be consistent from one run to the next and so that data on the stack can be treated as executable code. These features make the program vulnerable to attacks where the exploit strings contain the byte encodings of executable code.

4.1 Level 1

For Phase 1, you will not inject new code. Instead, your exploit string will redirect the program to execute an existing procedure.

Function getbuf is called within CTARGET by a function test having the following C code:

    1 void test()
    2 {
    3     int val;
    4     val = getbuf();
    5     printf("No exploit. Getbuf returned 0x%x\n", val);
    6 }

When getbuf executes its return statement (line 5 of getbuf), the program ordinarily resumes execution within function test (at line 5 of this function). We want to change this behavior. Within the file ctarget, there is code for a function touch1 having the following C representation:

    void touch1()
    {
        vlevel = 1; /* Part of validation protocol */
        printf("Touch1!: You called touch1()\n");
        validate(1);
        exit(0);
    }

Your task is to get CTARGET to execute the code for touch1 when getbuf executes its return statement, rather than returning to test. Note that your exploit string may also corrupt parts of the stack not directly related to this stage, but this will not cause a problem, since touch1 causes the program to exit directly.

Some Advice:

  • All the information you need to devise your exploit string for this level can be determined by examining a disassembled version of CTARGET. Use objdump -d to get this disassembled version.
  • The idea is to position a byte representation of the starting address for touch1 so that the ret instruction at the end of the code for getbuf will transfer control to touch1.
  • Be careful about byte ordering.
  • You might want to use GDB to step the program through the last few instructions of getbuf to make sure it is doing the right thing.
  • The placement of buf within the stack frame for getbuf depends on the value of compile-time constant BUFFER_SIZE, as well as the allocation strategy used by GCC. You will need to examine the disassembled code to determine its position.

4.2 Level 2

Phase 2 involves injecting a small amount of code as part of your exploit string.

Within the file ctarget there is code for a function touch2 having the following C representation:

    void touch2(unsigned val)
    {
        vlevel = 2; /* Part of validation protocol */
        if (val == cookie) {
            printf("Touch2!: You called touch2(0x%.8x)\n", val);
            validate(2);
        } else {
            printf("Misfire: You called touch2(0x%.8x)\n", val);
            fail(2);
        }
        exit(0);
    }

Your task is to get CTARGET to execute the code for touch2 rather than returning to test. In this case, however, you must make it appear to touch2 as if you have passed your cookie as its argument.

Some Advice:

  • You will want to position a byte representation of the address of your injected code in such a way that the ret instruction at the end of the code for getbuf will transfer control to it.
  • Recall that the first argument to a function is passed in register %rdi.
  • Your injected code should set the register to your cookie, and then use a ret instruction to transfer control to the first instruction in touch2.
  • Do not attempt to use jmp or call instructions in your exploit code. The encodings of destination addresses for these instructions are difficult to formulate. Use ret instructions for all transfers of control, even when you are not returning from a call.
  • See the discussion in Appendix B on how to use tools to generate the byte-level representations of instruction sequences.

4.3 Level 3

Phase 3 also involves a code injection attack, but passing a string as argument.

Within the file ctarget there is code for functions hexmatch and touch3 having the following C representations:

    /* Compare string to hex representation of unsigned value */
    int hexmatch(unsigned val, char *sval)
    {
        char cbuf[110];
        /* Make position of check string unpredictable */
        char *s = cbuf + random() % 100;
        sprintf(s, "%.8x", val);
        return strncmp(sval, s, 9) == 0;
    }

    void touch3(char *sval)
    {
        vlevel = 3; /* Part of validation protocol */
        if (hexmatch(cookie, sval)) {
            printf("Touch3!: You called touch3(\"%s\")\n", sval);
            validate(3);
        } else {
            printf("Misfire: You called touch3(\"%s\")\n", sval);
            fail(3);
        }
        exit(0);
    }

Your task is to get CTARGET to execute the code for touch3 rather than returning to test. You must make it appear to touch3 as if you have passed a string representation of your cookie as its argument.

Some Advice:

  • You will need to include a string representation of your cookie in your exploit string. The string should consist of the eight hexadecimal digits (ordered from most to least significant) without a leading “0x.”
  • Recall that a string is represented in C as a sequence of bytes followed by a byte with value 0. Type “man ascii” on any Linux machine to see the byte representations of the characters you need.
  • Your injected code should set register %rdi to the address of this string.
  • When functions hexmatch and strncmp are called, they push data onto the stack, overwriting portions of memory that held the buffer used by getbuf. As a result, you will need to be careful where you place the string representation of your cookie.
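The string representation described in the first two points can be sketched as follows. The helper names cookie_to_string and string_to_hex_text are hypothetical; the "%.8x" format is the one hexmatch itself uses, and the sketch shows only what the bytes of the string look like, not where to place them.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a cookie the way hexmatch does, as eight
   lowercase hex digits, most significant first, with no "0x" prefix.
   out needs 9 bytes: 8 digits plus the terminating 0 byte. */
void cookie_to_string(unsigned cookie, char out[9])
{
    snprintf(out, 9, "%.8x", cookie);
}

/* Hypothetical helper: write every byte of s, including its terminating
   0 byte, as two-digit hex values separated by single spaces.
   out must hold at least 3 * (strlen(s) + 1) characters. */
void string_to_hex_text(const char *s, char *out)
{
    size_t n = strlen(s) + 1;  /* count the terminating 0 byte as well */
    for (size_t i = 0; i < n; i++) {
        snprintf(out + 3 * i, 3, "%02x", (unsigned)(unsigned char)s[i]);
        out[3 * i + 2] = (i + 1 < n) ? ' ' : '\0';
    }
}
```

For the sample cookie 0x1a7dd803 shown in the transcripts earlier, the string is "1a7dd803" and its bytes are 31 61 37 64 64 38 30 33 00. The trailing 00 matters because hexmatch compares 9 bytes with strncmp, so the terminator takes part in the comparison.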

Figure 2: Setting up sequence of gadgets for execution. Byte value 0xc3 encodes the ret instruction.

5 Part II: Return-Oriented Programming

Performing code-injection attacks on program RTARGET is much more difficult than it is for CTARGET, because it uses two techniques to thwart such attacks:

  • It uses randomization so that the stack positions differ from one run to another. This makes it impossible to determine where your injected code will be located.
  • It marks the section of memory holding the stack as nonexecutable, so even if you could set the program counter to the start of your injected code, the program would fail with a segmentation fault.

Fortunately, clever people have devised strategies for getting useful things done in a program by executing existing code, rather than injecting new code. The most general form of this is referred to as return-oriented programming (ROP) [1, 2]. The strategy with ROP is to identify byte sequences within an existing program that consist of one or more instructions followed by the instruction ret. Such a segment is referred to as a gadget. Figure 2 illustrates how the stack can be set up to execute a sequence of n gadgets. In this figure, the stack contains a sequence of gadget addresses. Each gadget consists of a series of instruction bytes, with the final one being 0xc3, encoding the ret instruction. When the program executes a ret instruction starting with this configuration, it will initiate a chain of gadget executions, with the ret instruction at the end of each gadget causing the program to jump to the beginning of the next.

A gadget can make use of code corresponding to assembly-language statements generated by the compiler, especially ones at the ends of functions. In practice, there may be some useful gadgets of this form, but not enough to implement many important operations. For example, it is highly unlikely that a compiled function would have popq %rdi as its last instruction before ret. Fortunately, with a byte-oriented instruction set, such as x86-64, a gadget can often be found by extracting patterns from other parts of the instruction byte sequence.

For example, one version of rtarget contains code generated for the following C function:

    void setval_210(unsigned *p)
    {
        *p = 3347663060U;
    }

The chances of this function being useful for attacking a system seem pretty slim. But, the disassembled machine code for this function shows an interesting byte sequence:

    0000000000400f15 <setval_210>:
      400f15: c7 07 d4 48 89 c7    movl $0xc78948d4,(%rdi)
      400f1b: c3                   retq

The byte sequence 48 89 c7 encodes the instruction movq %rax, %rdi. (See Figure 3A for the encodings of useful movq instructions.) This sequence is followed by byte value c3, which encodes the ret instruction. The function starts at address 0x400f15, and the sequence starts on the fourth byte of the function. Thus, this code contains a gadget, having a starting address of 0x400f18, that will copy the 64-bit value in register %rax to register %rdi.
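The address arithmetic in this paragraph can be checked with a short sketch. The function name find_gadget is hypothetical, and the sketch searches a byte array supplied by the caller rather than reading an executable; it only illustrates how a match at byte offset i corresponds to address base + i.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: search code[0..len) for the byte pattern
   pat[0..pat_len). Returns the address of the first match, taking base
   as the address of code[0], or 0 if the pattern does not occur. */
uint64_t find_gadget(uint64_t base, const unsigned char *code, size_t len,
                     const unsigned char *pat, size_t pat_len)
{
    if (pat_len == 0 || pat_len > len)
        return 0;
    for (size_t i = 0; i + pat_len <= len; i++)
        if (memcmp(code + i, pat, pat_len) == 0)
            return base + i;
    return 0;
}
```

Given the seven bytes of setval_210 (c7 07 d4 48 89 c7 c3) with base 0x400f15 and the pattern 48 89 c7 c3, the match is at offset 3, so the function returns 0x400f18, the gadget address stated above.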

Your code for RTARGET contains a number of functions similar to the setval_210 function shown above in a region we refer to as the gadget farm. Your job will be to identify useful gadgets in the gadget farm and use these to perform attacks similar to those you did in Phases 2 and 3.

Important: The gadget farm is demarcated by functions start_farm and end_farm in your copy of rtarget. Do not attempt to construct gadgets from other portions of the program code.

5.1 Level 2

For Phase 4, you will repeat the attack of Phase 2, but do so on program RTARGET using gadgets from your gadget farm. You can construct your solution using gadgets consisting of the following instruction types, and using only the first eight x86-64 registers (%rax through %rdi).

  • movq: The codes for these are shown in Figure 3A.
  • popq: The codes for these are shown in Figure 3B.
  • ret: This instruction is encoded by the single byte 0xc3.
  • nop: This instruction (pronounced “no op,” which is short for “no operation”) is encoded by the single byte 0x90. Its only effect is to cause the program counter to be incremented by 1.

Some Advice:

  • All the gadgets you need can be found in the region of the code for rtarget demarcated by the functions start_farm and mid_farm.
  • You can do this attack with just two gadgets.
  • When a gadget uses a popq instruction, it will pop data from the stack. As a result, your exploit string will contain a combination of gadget addresses and data.
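The last point, that popq consumes a word of data from the stack, can be modeled with a small sketch. Everything here is a hypothetical model for illustration: struct sim_stack, sim_popq and sim_ret are invented names, and the words on the simulated stack are placeholder values, not real addresses or data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a stack: slots holds 8-byte words and rsp is the
   index of the word currently at the top of the stack. */
struct sim_stack {
    const uint64_t *slots;
    size_t rsp;
};

/* popq: read the word at the top of the stack, then move past it. */
uint64_t sim_popq(struct sim_stack *st)
{
    return st->slots[st->rsp++];
}

/* ret: pop the next word and treat it as the address to continue at. */
uint64_t sim_ret(struct sim_stack *st)
{
    return sim_popq(st);
}
```

If a gadget of the form "popq R; ret" starts with the placeholder words 0x1111 and 0x2222 on top of the stack, sim_popq yields 0x1111 as the data loaded into R and sim_ret yields 0x2222 as the next address. The gadget therefore uses up two consecutive 8-byte words, one of data and one address.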

A. Encodings of movq instructions

    movq S, D

    Source                                 Destination D
    S       %rax      %rcx      %rdx      %rbx      %rsp      %rbp      %rsi      %rdi
    %rax    48 89 c0  48 89 c1  48 89 c2  48 89 c3  48 89 c4  48 89 c5  48 89 c6  48 89 c7
    %rcx    48 89 c8  48 89 c9  48 89 ca  48 89 cb  48 89 cc  48 89 cd  48 89 ce  48 89 cf
    %rdx    48 89 d0  48 89 d1  48 89 d2  48 89 d3  48 89 d4  48 89 d5  48 89 d6  48 89 d7
    %rbx    48 89 d8  48 89 d9  48 89 da  48 89 db  48 89 dc  48 89 dd  48 89 de  48 89 df
    %rsp    48 89 e0  48 89 e1  48 89 e2  48 89 e3  48 89 e4  48 89 e5  48 89 e6  48 89 e7
    %rbp    48 89 e8  48 89 e9  48 89 ea  48 89 eb  48 89 ec  48 89 ed  48 89 ee  48 89 ef
    %rsi    48 89 f0  48 89 f1  48 89 f2  48 89 f3  48 89 f4  48 89 f5  48 89 f6  48 89 f7
    %rdi    48 89 f8  48 89 f9  48 89 fa  48 89 fb  48 89 fc  48 89 fd  48 89 fe  48 89 ff

B. Encodings of popq instructions

    Operation                     Register R
              %rax  %rcx  %rdx  %rbx  %rsp  %rbp  %rsi  %rdi
    popq R    58    59    5a    5b    5c    5d    5e    5f

C. Encodings of movl instructions

    movl S, D

    Source                     Destination D
    S       %eax   %ecx   %edx   %ebx   %esp   %ebp   %esi   %edi
    %eax    89 c0  89 c1  89 c2  89 c3  89 c4  89 c5  89 c6  89 c7
    %ecx    89 c8  89 c9  89 ca  89 cb  89 cc  89 cd  89 ce  89 cf
    %edx    89 d0  89 d1  89 d2  89 d3  89 d4  89 d5  89 d6  89 d7
    %ebx    89 d8  89 d9  89 da  89 db  89 dc  89 dd  89 de  89 df
    %esp    89 e0  89 e1  89 e2  89 e3  89 e4  89 e5  89 e6  89 e7
    %ebp    89 e8  89 e9  89 ea  89 eb  89 ec  89 ed  89 ee  89 ef
    %esi    89 f0  89 f1  89 f2  89 f3  89 f4  89 f5  89 f6  89 f7
    %edi    89 f8  89 f9  89 fa  89 fb  89 fc  89 fd  89 fe  89 ff

D. Encodings of 2-byte functional nop instructions

    Operation          Register R
                  %al    %cl    %dl    %bl
    andb R, R     20 c0  20 c9  20 d2  20 db
    orb R, R      08 c0  08 c9  08 d2  08 db
    cmpb R, R     38 c0  38 c9  38 d2  38 db
    testb R, R    84 c0  84 c9  84 d2  84 db

Figure 3: Byte encodings of instructions. All values are shown in hexadecimal.

5.2 Level 3

Before you take on Phase 5, pause to consider what you have accomplished so far. In Phases 2 and 3, you caused a program to execute machine code of your own design. If CTARGET had been a network server, you could have injected your own code into a distant machine. In Phase 4, you circumvented two of the main devices modern systems use to thwart buffer overflow attacks. Although you did not inject your own code, you were able to inject a type of program that operates by stitching together sequences of existing code. You have also gotten 65/70 points for the lab. That’s a good score. If you have other pressing obligations, consider stopping right now.

Phase 5 requires you to do an ROP attack on RTARGET to invoke function touch3 with a pointer to a string representation of your cookie. That may not seem significantly more difficult than using an ROP attack to invoke touch2, except that we have made it so. Moreover, Phase 5 counts for only 5 points, which is not a true measure of the effort it will require. Think of it as more an extra credit problem for those who want to go beyond the normal expectations for the course.

To solve Phase 5, you can use gadgets in the region of the code in rtarget demarcated by functions start_farm and end_farm. In addition to the gadgets used in Phase 4, this expanded farm includes the encodings of different movl instructions, as shown in Figure 3C. The byte sequences in this part of the farm also contain 2-byte instructions that serve as functional nops, i.e., they do not change any register or memory values. These include instructions, shown in Figure 3D, such as andb %al,%al, that operate on the low-order bytes of some of the registers but do not change their values.
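One property of the movl instructions in Figure 3C is worth keeping in mind: on x86-64, writing a 32-bit register sets the upper 4 bytes of the full 64-bit register to zero. The sketch below models that rule next to movq for comparison; the function names movl_result and movq_result are hypothetical, and the registers are represented as plain integers.

```c
#include <stdint.h>

/* Hypothetical model of movl src32, dst: the low 32 bits of the destination
   take the source value and the upper 32 bits become zero, whatever the
   register held before. */
uint64_t movl_result(uint64_t old_dst, uint32_t src32)
{
    (void)old_dst;  /* the previous contents do not survive */
    return (uint64_t)src32;
}

/* Hypothetical model of movq src64, dst: all 64 bits are copied. */
uint64_t movq_result(uint64_t old_dst, uint64_t src64)
{
    (void)old_dst;
    return src64;
}
```

For example, movl_result(0xffffffffffffffff, 0x12345678) gives 0x0000000012345678. A value that needs more than 32 bits loses its upper half when it passes through a movl, so such values have to travel through movq.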

Some Advice:

  • You’ll want to review the effect a movl instruction has on the upper 4 bytes of a register, as is described on page 183 of the text.
  • The official solution requires eight gadgets (not all of which are unique).

Good luck and have fun!
