Project 4
How to Succeed on This Project
-
Read the entirety of this spec in detail. It contains a lot of information that will make the project go more smoothly if you take the time to understand it before starting your project. This document contains a number of helpful pointers that will save you pain further down the line.
-
Know the purpose of each file in the starter code. You will only need to modify one file to complete the project, but you should read this specification to understand why the other files are present, and you should read over those other files as needed. In particular, understand which files you are not allowed to modify to avoid autograder issues.
-
Know how to run your code independently of the Makefiles we provide. As with the last project, the test results alone are not enough to tell you what is going wrong with your code. You will need to be able to test your code and inspect it with gdb to figure out precisely where it may be producing undesirable results.
-
Expect to get stuck, and exercise patience and persistence. We are happy to help in office hours but will not give out direct solutions.
-
Start early. Give yourself time to get stuck, discover insights and solutions to something that initially seems challenging, and overcome obstacles. It can be hard to think methodically when faced with the pressure of an imminent deadline.
-
If you have questions, your best option is to ask in a public Piazza post. This gets your question in front of as many people as quickly as possible and also lets others benefit by seeing the answer to your question.
-
Familiarize yourself with the late submission policy detailed in the syllabus so that you are not caught off-guard. No submissions are accepted more than 48 hours after the deadline. Project extensions are not granted except in the case of a documented and acute emergency.
Introduction
In this project, you will implement a simple memory allocator built around doubly-linked lists. This exercise is related to some of the final topics covered in lecture and will also give you greater insight into how the C library functions malloc and free work as well as why frequent allocation is generally slow and can degrade the performance of your code.
"EL Malloc" implements a simple, explicit list memory allocator. This manages heap memory in doubly-linked lists of Available and Used memory blocks to provide the functions el_malloc() and
to its users. It could be extended with some work to be a drop-in replacement for the original and free() functions.
Grading Criteria
Credit for this assignment will be awarded based on the following categories:
Project Quiz (10%): A quiz on this assignment specification and the starter code has been posted
to Canvas. Complete the quiz by the due date. The quiz is to be completed individually but
multiple attempts are allowed.
Automated Testing of EL Malloc Implementation (55%): We have provided several tests along
with instructions on how to run these tests from the command line. To pass the tests, you will
need to ensure that your code properly compiles and runs according to the descriptions given. In
particular, the output from your program must exactly match the expected output, so make sure to
verify that your code works as expected.
Manual Inspection (35%): Graders will be manually reviewing your code to check for certain features -- mainly that you properly implemented the expected solution and that your code adheres to a reasonable structure and style. You should add comments to your code where you think it is appropriate.
Automated Tests
As in Project 1 and our Lab assignments, we have provided a framework to run the automated tests for Project 4 using make . This will be used to run several test cases against your EL Malloc implementation and compare the output produced by your program to the output expected for each test case.
The command will run all tests against your code
You can always use the argument, e.g., make test testnum=5 to individually run a
specific test case.
As always, know how to run your code independently of these tests and the Makefile we
provide. This is necessary for you to gain a full understanding of what your code is really doing
and to use tools like gdb in your work.
Manual Grading Criteria
el_get_header() and el_block_below()
el_free()
malloc()
make test
testnum
2pts: Use of provided macros for pointer arithmetic
2pts: Correct use of to account for data sizes
1 pt: checks for beginning of heap and returns NULL if necessary
el_add_block_front() and el_remove_block()
2 pts: Sensible use of pointers prev and next to link/unlink nodes efficiently. No looping need.
2 pts: Correct update of list length and total bytes
1 pt: Accounts for EL_BLOCK_OVERHEAD when updating bytes
el_merge_block_with_above()
1 pt: Checks for NULL argument, which results in no changes
2 pts: Checks if both blocks are in AVAILABLE state
3 pts: Use of el_block_above() to find block above
2 pts: Clear updates to size of lower block header and higher block footer
2 pts: Moves both blocks out of available list, adds merged block to front of available list
Starter Code
Download the zip file linked at the top of this page to obtain the starter code for this project. After unzipping, you should see the following files:
sizeof
el_block_below()
el_find_first_avail()
File Purpose
Provided Provided
Testing
Notes
Build file to compile all programs Memory Allocator Header File
Makefile
el_malloc.h
el_demo.c
Provided Memory Allocator Demo
main()
Contains Memory Allocator Functions For You to Complete
el_malloc.c
Edit
test_el_malloc.c
Testing program for EL Malloc
File Purpose
Testing Testing
EL Malloc
A memory allocator is a small system which manages heap memory, sometimes referred to as the "data" segment of a program. This portion of program memory is a linear set of addresses that form a large block which can expand at runtime by making requests to the operating system. Solving the allocation problem forms the backbone of what malloc()/free() do by keeping track of the space used and released by a user program. Allocators also see use in garbage-collected languages like Java where there are no explicit free() calls, but the allocator must still find available space for new objects.
One simple way to implement an allocator is to overlay linked lists on the heap which track at a minimum the available chunks of memory and possibly also the used chunks. This comes with a cost: some of the bytes of memory in the heap are no longer available for the user program but are instead used for the allocator's bookkeeping.
In this project, an Explicit List allocator is developed, thus the name of the system: el_malloc . It uses two lists to track memory in the heap:
The Available List keeps track of blocks of memory that can be used to answer calls to el_malloc()
The Used List keeps track of blocks that have been returned by el_malloc() and should not be given out to the user again until they are el_free() 'd.
Most operations boil down to manipulating these two lists in some form.
Performing an allocation with ptr = el_malloc(size); searches the Available List for a block with sufficient size. That block is split into two blocks. One block represents the answer to the request and is given size bytes; it is then moved to the Used List. The second block comprises the unused remainder of the original chunk of space and remains on the Available List.
Performing a deallocation with el_free(ptr); moves the block referenced by ptr from the Used List to the Available List. To prevent fragmentation of memory, the newly available block is merged with adjacent available blocks if possible.
EL Malloc Data Structures
Several data structures defined in el_malloc.h should be studied so that one is acquainted with their design and intended use. The following sections outline many of these and show diagrams to indicate the transformations that the required functions should carry out.
Block Header/Footer
Notes
Each block of memory tracked by EL Malloc is preceded and succeeded by some bytes of memory for bookkeeping. These are referred to as the block "header" and "footer" and are represented by the
el_blockhead_t and el_blockfoot_t structs.
// type which is a "header" for a block of memory; contains info on
// size, whether the block is available or in use, and links to the
// next/prev blocks in a doubly linked list. This data structure
// appears immediately before a block of memory that is tracked by the
// allocator.
typedef struct block {
size_t size;
char state;
struct block *next;
struct block *prev;
} el_blockhead_t;
// number of bytes of memory in this block
// either EL_AVAILABLE or EL_USED
// pointer to next block in same list
// pointer to previous block in same list
// Type for the "footer" of a block; indicates size of the preceding
// block so that its header el_blockhead_t can be found with pointer
// arithmetic. This data appears immediately after an area of memory
// that may be used by a user or is free. Immediately after it is
// either another header (el_blockhead_t) or the end of the heap.
typedef struct {
size_t size;
} el_blockfoot_t;
As indicated, the blocks use doubly linked nodes in the header which will allow easy re- arrangement of the list.
A picture of an individual block with its header, footer, and user data area is shown below.
61 u 61
retnioP
resU
retooF redaeH
ezis
ezis
etats
verp
txen
desu setyb 61
Footers and Blocks Above/Below
One might wonder why a footer appears at the end of each block. In tracking blocks, there will arise the need to inspect a block that is located immediately before a given block in memory (not the previous element in the linked list). The footer enables this by tracking the size of the user block of memory immediately preceding that footer.
This is illustrated in the diagram below.
ezis
ezis
ezis
etats
verp
retooF
redaeH retooF
redaeH
a 61
desu setyb 8 elbaliava fo setyb 61
This operation is implemented in the function el_block_below(block) . The similar operation el_block_above(block) finds the next header immediately following a block in memory.
The following functions use pointer arithmetic to determine block locations from a provided pointer.
These functions benefit from macros defined in el_malloc.h that are useful for doing pointer operations involving bytes.
el_blockfoot_t *el_get_footer(el_blockhead_t *block);
el_blockhead_t *el_get_header(el_blockfoot_t *foot);
el_blockhead_t *el_block_above(el_blockhead_t *block);
el_blockhead_t *el_block_below(el_blockhead_t *block);
// macro to add a byte offset to a pointer, arguments are a pointer
// and a number of bytes (usually size_t)
#define PTR_PLUS_BYTES(ptr, off) ((void *) (((size_t) (ptr)) + ((size_t) (off))))
// macro to subtract a byte offset from a pointer, arguments are a pointer
// and a number of bytes (usually size_t)
#define PTR_MINUS_BYTES(ptr, off) ((void *) (((size_t) (ptr)) - ((size_t) (off))))
// macro to subtract one pointer from another to computer their "distance"
// apart in bytes
#define PTR_MINUS_PTR(ptr,ptq) ((long) (((size_t) (ptr)) - ((size_t) (ptq))))
Block Lists and Global Control
The main purpose of the memory allocator is to track the available and used blocks in explicit linked lists. This allows chunks of used and available memory to be distributed throughout the heap. Below are the data structures that track these lists and the global control data structure which houses information for the entire heap.
// Type for a list of blocks; doubly linked with a fixed
// "dummy" node at the beginning and end which do not contain any
// data. List tracks its length and number of bytes in use.
typedef struct {
el_blockhead_t beg_actual;
el_blockhead_t end_actual;
el_blockhead_t *beg;
// fixed node at beginning of list; state is EL_BEGIN_BLOCK
// fixed node at end of list; state is EL_END_BLOCK
void *heap_start;
void *heap_end;
bounds
// pointer to where the heap starts
// pointer to where the heap ends; this memory address is out of
// number of bytes currently in the heap
size_t heap_bytes;
el_blocklist_t avail_actual; // space for the available list data
el_blocklist_t used_actual;
el_blocklist_t *avail;
el_blocklist_t *used;
} el_ctl_t;
// space for the used list data
// pointer to avail_actual
// pointer to used_actual
The following diagram shows some of the structure induced by use of a doubly linked lists overlaid onto the heap. The global control structure el_ctl has two lists for available and used space, respectively.
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
ecapS desU
ecapS elbaliavA
t_toofkcolb_le
t_daehkcolb_le
4
3
001 a 06
u 02
u 61
a 83
u 22
a 652 u
sretnioP resU
dne_paeh.ltc_le
trats_paeh.ltc_le
lortnoc paeh labolg :ltc_le
The following three functions respectively initialize the global control structures, print stats on the heap, and clean up at the end of execution.
Pointers and "Actual" Space
int el_init(int max_bytes);
void el_print_stats();
void el_cleanup();
In several structures, there are pointers named xxx and structs named xxx_actual . For example, in el_blocklist_t :
The intent here is that there will always be a node at the beginning of the doubly linked list, just to make the programming easier, so it makes sense to have an actual struct beg_actual present. However, when working with the list, the address of the beginning node is often referenced, making
beg useful. In any case, beg will be initialized to &beg_actual as appears in el_init_blocklist() in el_malloc.c .
Similarly, since there will always be an Available List, el_ctl_t has both an avail pointer to the list and avail_actual which is the struct for the list.
Doubly-Linked List Operations
A large number of operations in EL Malloc boil down to doubly linked list operations. This includes
Unlinking nodes from the middle of list during el_free()
Adding nodes to the beginning of a headed list (allocation and free)
Traversing the list to print and search for available blocks
Recall that unlinking a node from a doubly linked list involves modifying the previous and next nodes as in the following.
while adding a new node to the front is typically accomplished via
You may wish to review doubly linked list operations and do some reading on lists with "dummy" nodes at the beginning and ending if these concepts are rusty.
typedef struct {
...
el_blockhead_t beg_actual;
el_blockhead_t *beg;
...
} el_blocklist_t;
// fixed node at beginning of list; state is EL_BEGIN_BLOCK
// pointer to beg_actual
void el_init_blocklist(el_blocklist_t *list){
list->beg = &(list->beg_actual);
list->beg->state = EL_BEGIN_BLOCK;
list->beg->size = EL_UNINITIALIZED;
... }
node->prev->next = node->next;
node->next->prev = node->prev;
node->prev = list->beg;
node->next = list->beg->next;
node->prev->next = node;
node->next->prev = node;
The following functions pertain to block list operations.
Allocation via Block Splitting
The basic operation of giving out memory on a call to el_malloc(size) involves finding an Available Block with enough bytes for both the requested amount of memory and a new header/footer combination. The current requirement is that a block always gets split on an allocation, though a straightforward optimization would be to refrain from splitting in the case of a close or exact size match.
This process is demonstrated in the diagram below, in which a request for some number of bytes has been made. Before carrying out the steps shown in the diagram, your code must search the Available List for a block with enough space. Once found, the block is split into the portion that will be used and the remaining portion (which remains available and can be given out on future calls to el_malloc() ). A pointer to the user area immediately after the el_blockhead_t is returned.
void el_init_blocklist(el_blocklist_t *list);
void el_print_blocklist(el_blocklist_t *list);
void el_add_block_front(el_blocklist_t *list, el_blockhead_t *block);
void el_remove_block(el_blocklist_t *list, el_blockhead_t *block);
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
ecapS desU
ecapS elbaliavA
t_toofkcolb_le
t_daehkcolb_le
... ...
a ...
8
5
... 003
setyb 003
setyb daehrevo 04+06
htiw kcolb elbaliava tsrfi
tilpSotkcolBdnuof,setyb06roftseuqeR
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
... 002
setyb 002 a 06
etyb 06
... ...
u ...
8
5
ecaps detseuqer htiw
resu ot denruter retniop
setyb daehrevo 04+06
htiw kcolb elbaliava tsrfi
stsil htob fo tnorf ta dedda kcolB tilpS :gniniameR dna desU otni kcolB tilpS
The following functions pertain to the location and splitting of blocks in the available list to fulfill allocation requests.
Freeing Blocks and Merging
A call to el_free() passes in a pointer to a region of user memory that was previously given out on a call to . Immediately preceding this chunk of memory should be an el_blockhead_t instance, which can be found through pointer arithmetic.
In order to prevent memory from becoming continually divided into smaller blocks, on freeing memory
the system checks to see if adjacent blocks can be merged together to form a single, larger block of
el_blockhead_t *el_find_first_avail(size_t size);
el_blockhead_t *el_split_block(el_blockhead_t *block, size_t new_size);
void *el_malloc(size_t nbytes);
el_malloc()
available memory. Keep in mind that the blocks that can be merged are adjacent in memory, not
next/previous in some linked list. Adjacent blocks can be located using el_block_above() and
el_block_below() .
To merge, the adjacent blocks must both be Available (not Used). A free can then have several cases.
1. The freed block cannot be merged with any other blocks
2. The freed block can be merged with only the block above it
3. The freed block can be merged with only the block below it
4. The freed block can be merged with both adjacent blocks
The diagrams below show two of these cases.5 htgnel> liava ltc le
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
...
002 setyb 051
u 06 setyb 06 u 09
setyb 09 u ...
...
d'eerf eb ot kcolb
8
5
)EROFEB( elbissop gnigrem oN :1 esaC eerF
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
...
002 setyb 051
u 06 setyb 06 a 09
setyb 09 u ...
...
d'eerf eb ot kcolb
8
5
)RETFA( elbissop gnigrem oN :1 esaC eerF
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
... 002 setyb 051 u 06
ecapS desU
ecapS elbaliavA
t_toofkcolb_le
t_daehkcolb_le
setyb 06 u
...
09
setyb 09
a ...
...
d'eerf eb ot kcolb
)EROFEB(wolebkcolbhtiwegreM:2esaCeerF
8
5
tsiLelbaliavAfotnorfotdeddaedondegreM.)RETFA(wolebkcolbhtiwegreM:2esaCeerF
With careful use of the functions listed below and correct handling of NULL arguments, all four cases can be handled with very little code. Keep in mind that el_block_above() and el_block_below() should return NULL if there is no block above or below, as those would now be areas of memory outside the bounds of the heap.
EL Malloc Library Structure
Study the files el_malloc.h and el_malloc.c especially carefully, as they define initial placeholder versions of the functions you must implement. all functions marked as TODO in the corresponding code comments must be completed for full credit on this project. The comments for each function also summarize what that function should do -- read them carefully so you understand the work expected of you.
EL Malloc Demo
The file el_demo.c implements a simple demonstration program that uses the EL Malloc library. You can use this program for testing purposes, but note that it does not serve as a complete substitute for the automated tests we will be running as the basis for much of your project grade.
dne>-desu.ltc_le
geb>-desu.ltc_le
htgnel>-desu.ltc_le
dilav regnol on daeh :d'eerf eb ot kcolb
... 002 setyb 051 u 091 setyb 091 = 04+06+09 a ...
... ...
8
5
dne>-liava.ltc_le
geb>-liava.ltc_le
htgnel>-liava.ltc_le
AVAILABLE LIST: {length: 1 bytes: 4096}
[ 0] head @ 0x600000000000 {state: a size: 4056}
foot @ 0x600000000ff8 {size: 4056}
USED LIST: {length: 0 bytes: 0}
MALLOC 3
HEAP STATS (overhead per node: 40)
heap_start: 0x600000000000
heap_end: 0x600000001000
total_bytes: 4096
AVAILABLE LIST: {length: 1 bytes: 3644}
[ 0] head @ 0x6000000001c4 {state: a size: 3604}
foot @ 0x600000000ff8 {size: 3604}
USED LIST: {length: 3 bytes: 452}
[ 0] head @ 0x600000000100 {state: u size: 156}
foot @ 0x6000000001bc {size: 156}
[1] head @ 0x6000000000a8 {state: u size: 48}
foot @ 0x6000000000f8 {size: 48}
[ 2] head @ 0x600000000000 {state: u size: 128}
foot @ 0x6000000000a0 {size: 128}
heap_end: 0x600000001000
total_bytes: 4096
AVAILABLE LIST: {length: 1 bytes: 3478}
[ 0] head @ 0x60000000026a {state: a size: 3438}
foot @ 0x600000000ff8 {size: 3438}
USED LIST: {length: 5 bytes: 618}
[ 0] head @ 0x600000000202 {state: u size: 64}
foot @ 0x600000000262 {size: 64}
[1] head @ 0x6000000001c4 {state: u size: 22}
foot @ 0x6000000001fa {size: 22}
[ 2] head @ 0x600000000100 {state: u size: 156}
foot @ 0x6000000001bc {size: 156}
[3] head @ 0x6000000000a8 {state: u size: 48}
foot @ 0x6000000000f8 {size: 48}
[ 4] head @ 0x600000000000 {state: u size: 128}
foot @ 0x6000000000a0 {size: 128}
POINTERS
p5: 0x600000000222
p4: 0x6000000001e4
p3: 0x600000000120
p2: 0x6000000000c8
p1: 0x600000000020
FREE 1
HEAP STATS (overhead per node: 40)
heap_start: 0x600000000000
heap_end: 0x600000001000
total_bytes: 4096
AVAILABLE LIST: {length: 2 bytes: 3646}
[ 0] head @ 0x600000000000 {state: a size: 128}
foot @ 0x6000000000a0 {size: 128}
[ 1] head @ 0x60000000026a {state: a size: 3438}
foot @ 0x600000000ff8 {size: 3438}
USED LIST: {length: 4 bytes: 450}
[ 0] head @ 0x600000000202 {state: u size: 64}
foot @ 0x600000000262 {size: 64}
[ 1] head @ 0x6000000001c4 {state: u size: 22}
foot @ 0x6000000001fa {size: 22}
[ 2] head @ 0x600000000100 {state: u size: 156}
foot @ 0x6000000001bc {size: 156}
[3] head @ 0x6000000000a8 {state: u size: 48} foot @ 0x6000000000f8 {size: 48}
FREE 3
HEAP STATS (overhead per node: 40)
heap_start: 0x600000000000
heap_end: 0x600000001000
total_bytes: 4096
AVAILABLE LIST: {length: 3 bytes: 3842}
[ 0] head @ 0x600000000100 {state: a size: 156}
foot @ 0x6000000001bc {size: 156}
[ 1] head @ 0x600000000000 {state: a size: 128}
foot @ 0x6000000000a0 {size: 128}
[ 2] head @ 0x60000000026a {state: a size: 3438}
foot @ 0x600000000ff8 {size: 3438}
USED LIST: {length: 3 bytes: 254}
[ 0] head @ 0x600000000202 {state: u size: 64}
foot @ 0x600000000262 {size: 64}
-
[1] head @ 0x6000000001c4 {state: u size: 22}
foot @ 0x6000000001fa {size: 22}
-
[2] head @ 0x6000000000a8 {state: u size: 48}
foot @ 0x6000000000f8 {size: 48}
RE-ALLOC 3,1
HEAP STATS (overhead per node: 40)
heap_start: 0x600000000000
heap_end: 0x600000001000
total_bytes: 4096
AVAILABLE LIST: {length: 3 bytes: 3530}
[ 0] head @ 0x60000000035a {state: a size: 3198}
foot @ 0x600000000ff8 {size: 3198}
[1] head @ 0x600000000148 {state: a size: 84} foot @ 0x6000000001bc {size: 84}
[ 2] head @ 0x600000000000 {state: a size: 128}
foot @ 0x6000000000a0 {size: 128}
USED LIST: {length: 5 bytes: 566}
[ 0] head @ 0x60000000026a {state: u size: 200}
foot @ 0x600000000352 {size: 200}
-
[1] head @ 0x600000000100 {state: u size: 32}
foot @ 0x600000000140 {size: 32}
-
[2] head @ 0x600000000202 {state: u size: 64}
foot @ 0x600000000262 {size: 64}
-
[3] head @ 0x6000000001c4 {state: u size: 22}
foot @ 0x6000000001fa {size: 22}
-
[4] head @ 0x6000000000a8 {state: u size: 48}
foot @ 0x6000000000f8 {size: 48}