CS520
Fall 2010
Programming Assignment 3
Due Sunday October 10


The goal of this assignment is to build a memory allocator that has an associated conservative garbage collector. The memory allocator will be designed and implemented for the Intel IA-32 architecture.

The public interface to the allocator consists of two functions, memInitialize and memAllocate.

memInitialize should be called exactly once at program startup and will initialize the allocator to hold the number of 32-bit words given by its single argument. The argument must be greater than zero.

If memInitialize fails, it returns 0; else it returns 1. (Note: Since memInitialize can only be called once, there is no way to recover from the function failing; all subsequent calls will fail because an earlier call was made.)
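The call-once rule can be sketched as follows. This is only an illustration: the name sketch_initialize, the parameter type, and the three static variables are assumptions, and the real memInitialize will also need to set up the initial free block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static uint32_t *heap_start = NULL;  /* hypothetical global: base of the heap */
static size_t    heap_words = 0;     /* hypothetical global: heap size in words */
static int       init_called = 0;    /* set on the first call, success or not */

/* Stand-in for memInitialize; the real prototype may differ. */
static int sketch_initialize(unsigned long words)
{
    if (init_called)          /* any earlier call, even a failed one, blocks this one */
        return 0;
    init_called = 1;

    if (words == 0)           /* the argument must be greater than zero */
        return 0;

    heap_start = malloc(words * sizeof(uint32_t));   /* the only place malloc is used */
    if (heap_start == NULL)
        return 0;

    heap_words = words;
    return 1;
}
```

A first call such as sketch_initialize(16) returns 1 if malloc succeeds; every later call returns 0.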

memAllocate allocates a contiguous block of 32-bit words of the length given by its first argument. (Note: The length could be zero.) If the allocation fails, it returns 0; else it returns the address of the base of the allocated block. The allocation will consume two more words than the user requests, in order to store control information with the allocated block.
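The two-word overhead means that a request for n words needs n + 2 words of free space. A minimal sketch of that arithmetic, with hypothetical helper names that are not part of the required interface:

```c
#include <assert.h>
#include <stddef.h>

/* The assignment fixes the per-block control information at two words. */
#define CONTROL_WORDS 2

/* Words actually consumed from the heap by a request for `requested` words. */
static size_t words_consumed(size_t requested)
{
    return requested + CONTROL_WORDS;
}

/* Can a free region of `free_words` words (control words included)
 * satisfy a request for `requested` user words? */
static int request_fits(size_t free_words, size_t requested)
{
    return free_words >= words_consumed(requested);
}
```

Note that a zero-length request still consumes two words.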

The second argument of memAllocate is either NULL or a pointer to a function that should be called when the block is eventually freed, if it is. (If NULL is passed as the second argument, then nothing special needs to be done if/when the block is freed.) The argument to be passed to the "finalize" function is the base address of the block being freed. A "finalize" function should not call memAllocate. If this happens, memAllocate should print an appropriate error message to stderr and then abort by calling exit(-1).
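One possible way to detect a finalizer that calls memAllocate is a flag that is set while a finalizer runs. The sketch below assumes the finalizer takes a void pointer and returns nothing; the typedef, the flag, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*Finalizer)(void *);   /* assumed shape of the second argument */

static int in_finalizer = 0;         /* hypothetical flag: nonzero while a finalizer runs */

/* Would be called at the top of memAllocate in this sketch. */
static void guard_allocate_entry(void)
{
    if (in_finalizer) {
        fprintf(stderr, "memAllocate: called from a finalize function\n");
        exit(-1);
    }
}

/* Invoke the finalizer, if any, on the base address of the block being freed. */
static void run_finalizer(Finalizer fn, void *base)
{
    if (fn == NULL)
        return;                      /* nothing special to do for this block */
    in_finalizer = 1;
    fn(base);
    in_finalizer = 0;
}

/* Example finalizer, used only to exercise the sketch. */
static int   finalize_calls = 0;
static void *last_finalized = NULL;

static void record_finalize(void *base)
{
    finalize_calls++;
    last_finalized = base;
}
```

If record_finalize called the allocator, guard_allocate_entry would see the flag set, print the message, and exit.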

If a call to memAllocate is made and there is not enough unallocated memory left to fulfill the current request, then a conservative garbage collector will be invoked to try to identify allocated blocks that are no longer in use. The collector will mark all allocated blocks that appear to be in use and then will free the unmarked blocks. The marking phase will start by examining the global data memory and the stack frames for active function invocations. Any allocated block that is pointed to from one of those areas will be marked and then examined to see whether it, in turn, points to any allocated blocks. Any block that is pointed to from a marked block will be marked and examined in the same way. This process repeats until all reachable blocks are marked.
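The marking traversal can be illustrated on a toy heap. Everything here is an assumption made for the sketch: "addresses" are indices into a small array, the first control word holds the length plus two flag bits, and a word counts as a pointer only if it indexes a user word of an allocated block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_WORDS 32
#define HDR        2              /* control words per block */
#define F_ALLOC    0x40000000u    /* hypothetical "allocated" flag */
#define F_MARK     0x80000000u    /* hypothetical "marked" flag */
#define LEN(w)     ((w) & 0x3fffffffu)

static uint32_t heap[HEAP_WORDS];

/* Write a block header at index h; return the index of the next header. */
static uint32_t put_block(uint32_t h, uint32_t len, int allocated)
{
    assert(h + HDR + len <= HEAP_WORDS);
    heap[h] = len | (allocated ? F_ALLOC : 0u);
    heap[h + 1] = 0;              /* second control word, unused in this sketch */
    return h + HDR + len;
}

/* Header index of the allocated block whose user words include index v, or -1. */
static int block_containing(uint32_t v)
{
    uint32_t h = 0;
    while (h < HEAP_WORDS) {
        uint32_t base = h + HDR, len = LEN(heap[h]);
        if ((heap[h] & F_ALLOC) && v >= base && v < base + len)
            return (int)h;
        h = base + len;
    }
    return -1;
}

/* Mark the block that v appears to point at, then follow its contents. */
static void mark_from(uint32_t v)
{
    int h = block_containing(v);
    if (h < 0 || (heap[h] & F_MARK))
        return;                   /* not a pointer, or already visited */
    heap[h] |= F_MARK;
    for (uint32_t i = 0; i < LEN(heap[h]); i++)
        mark_from(heap[h + HDR + i]);
}

/* Start the traversal from a set of root words (globals, stack, registers). */
static void mark_roots(const uint32_t *roots, size_t n)
{
    for (size_t i = 0; i < n; i++)
        mark_from(roots[i]);
}
```

The check for an already marked block is what stops the recursion on cyclic structures. A real collector may prefer an explicit worklist to avoid deep recursion.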

The garbage collector will assume that global memory is bounded by the address of the symbol "__data_start" and the address of the symbol "_end".
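Scanning a region for words that might be heap pointers can be sketched as below. The function name and the half-open heap range are assumptions. The region bounds are passed in as parameters so that the sketch runs anywhere; on the target they would presumably come from the two linker symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the words in [start, end) whose value falls inside the heap's
 * address range [heap_lo, heap_hi).
 *
 * On the target, the bounds might be obtained with something like
 *     extern char __data_start, _end;
 * and then cast to word pointers, assuming the linker provides both symbols. */
static size_t count_candidates(const uint32_t *start, const uint32_t *end,
                               uint32_t heap_lo, uint32_t heap_hi)
{
    size_t hits = 0;
    assert(start <= end);
    for (const uint32_t *p = start; p < end; p++) {
        if (*p >= heap_lo && *p < heap_hi)
            hits++;               /* looks like a pointer into the heap */
    }
    return hits;
}
```

A range check like this is only a first filter; each candidate still has to be matched against an actual allocated block.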

The stack area to be examined is delimited by the top of stack at the time that memAllocate is called and by the topmost frame (the earliest frame created) on the stack. Assume that the topmost frame will have 0 in the "saved EBP" slot.
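Finding the topmost frame amounts to following the chain of saved EBP values until a 0 is found. The sketch below simulates this on an array, with indices standing in for addresses; the function name is hypothetical, and on real hardware the starting EBP would have to be read with a small piece of assembly, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Follow the saved-EBP chain through a simulated stack.
 * mem[] stands in for memory and mem[ebp] is the "saved EBP" slot of the
 * frame whose frame pointer is ebp. Returns the index of the topmost
 * frame's saved-EBP slot, which is the one that holds 0. */
static uint32_t topmost_frame(const uint32_t *mem, size_t n, uint32_t ebp)
{
    assert(ebp < n);
    while (mem[ebp] != 0) {
        /* the stack grows downward, so callers sit at higher addresses */
        assert(mem[ebp] > ebp && mem[ebp] < n);
        ebp = mem[ebp];
    }
    return ebp;
}
```

With frames whose saved-EBP slots are at indices 3, 6 and 9, and slot 9 holding 0, the walk from 3 ends at 9.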

After the marking phase is done, adjacent blocks that are freed should be coalesced into one block. After freeing and coalescing all garbage blocks, if there is still not a free block big enough to satisfy the current request, then the call to memAllocate fails and it should return 0.
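Coalescing can be sketched as a single pass over the heap in address order. The layout is an assumption made for illustration: the first control word of each block holds its length in user words plus a hypothetical "allocated" flag, and the blocks tile the heap exactly.

```c
#include <assert.h>
#include <stdint.h>

#define HDR     2                 /* control words per block */
#define F_ALLOC 0x40000000u       /* hypothetical "allocated" flag */
#define LEN(w)  ((w) & 0x3fffffffu)

/* Merge every run of adjacent free blocks in heap[0..n) into one block. */
static void coalesce(uint32_t *heap, uint32_t n)
{
    uint32_t h = 0;
    while (h < n) {
        uint32_t next = h + HDR + LEN(heap[h]);
        if (!(heap[h] & F_ALLOC)) {
            while (next < n && !(heap[next] & F_ALLOC)) {
                /* absorb the neighbour; its control words become usable space */
                heap[h] = LEN(heap[h]) + HDR + LEN(heap[next]);
                next = h + HDR + LEN(heap[h]);
            }
        }
        h = next;
    }
}

/* Length in user words of the largest free block, or 0 if there is none. */
static uint32_t largest_free(const uint32_t *heap, uint32_t n)
{
    uint32_t best = 0;
    for (uint32_t h = 0; h < n; h += HDR + LEN(heap[h])) {
        if (!(heap[h] & F_ALLOC) && LEN(heap[h]) > best)
            best = LEN(heap[h]);
    }
    return best;
}
```

If largest_free is still smaller than the request after coalescing, the allocation fails and returns 0.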

The garbage collector is conservative in that it will assume anything that appears to point to an allocated block actually does point to the block. That is, an integer might coincidentally contain a bit pattern that could be interpreted as a pointer to a block. The collector will go ahead and mark the block. So, the collector is conservative in the sense that any block that is actually pointed to will be marked, but some additional blocks that may not actually be pointed to will also be marked.

The collector will consider memory to be an array of 32-bit words. When scanning global memory or the stack, the collector will begin on a clean 4-byte boundary and will proceed one 32-bit word at a time.
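If a region bound is not already word aligned, one option is to round the start up to the next 4-byte boundary. This is a sketch under that assumption; the helper names are hypothetical, and the assignment does not say whether rounding up or down is intended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a byte address up to the next multiple of 4 (no change if aligned). */
static uintptr_t align_up4(uintptr_t addr)
{
    return (addr + 3u) & ~(uintptr_t)3u;
}

/* Number of whole 32-bit words in [start, end) once start is aligned. */
static size_t words_between(uintptr_t start, uintptr_t end)
{
    start = align_up4(start);
    return end > start ? (size_t)((end - start) / 4) : 0;
}
```

For example, a region from 0x1001 to 0x1010 is scanned starting at 0x1004 and contains three whole words.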

A block should be considered to be pointed to if the base of the block is pointed to, if an internal word of the block is pointed to, or if the word immediately off either end of the block is pointed to.
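This rule can be written as a single range test. The sketch assumes byte addresses, a block of len_words words starting at base, and that any address inside the block or inside the one word on either side counts; whether unaligned values should count is a choice the assignment leaves open.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if value points at the block, at one of its internal words, or at
 * the word immediately before its base or immediately after its last word. */
static int points_at_block(uint32_t value, uint32_t base, uint32_t len_words)
{
    assert(base >= 4);                         /* room for the word before the base */
    uint64_t lo = (uint64_t)base - 4;          /* word just before the block */
    uint64_t hi = (uint64_t)base + 4ull * len_words + 4;   /* exclusive bound */
    return value >= lo && value < hi;
}
```

For a two-word block at 0x1000, the values 0x0ffc through 0x100b are treated as pointing to it, while 0x0ff8 and 0x100c are not.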

You should only use malloc in memInitialize. In particular, you cannot use it in memAllocate. Any control information for an allocated block must be stored in headers or footers attached to the block and, as stated above, the control information for a block is limited to two words. You may have a few static global pointer variables that keep track of the location of the heap, but there should be no information in global memory that is specific to a particular allocated (or free) block.

Testing your implementation of the garbage collector will be a challenge. Here are some issues to consider:

This list is most likely not comprehensive, so be sure to do your own thinking and add to this list if necessary.

To support testing and debugging, provide a function called memDump.

This function should print information about the current state of the memory allocator to stderr. The following items should be printed in the following order:

  1. Global Memory: First, print a line that says "Global Memory: ", "start = ", the start address of global memory, " end = ", the end address of global memory, " length = ", and the length of global memory. Print the addresses with eight hex digits, even if the leading digits are zeroes. The length is in units of words and should be printed in decimal without leading zeroes. Then print a blank line. Then print a line for each word in global memory that might point at an allocated block. This line should first show the address of the word, then a space, then the contents of the word. Both the address and the contents should be printed with eight hex digits, even if the leading digits are zeroes. These lines should be printed so that the addresses are increasing from one line to the next. If at least one word was printed, then print a blank line at the end.

  2. Stack Memory: Print this just like Global Memory, except that the header line should say "Stack Memory".

  3. Registers: First, print a line that says "Registers" followed by a blank line. Then, on a single line, print the contents of the ebx, esi and edi registers, in that order. For each register, print the register name and a space, then the contents using eight hex digits, then either two spaces or an asterisk followed by a single space. Print the asterisk if the register contents might point at an allocated block. After printing the register contents, print a blank line.

  4. Heap: First, print a line that simply says "Heap" followed by a blank line. Then print information about each block of the heap. Print the blocks in increasing address order. For each block, print a line that says "Block", followed by a space, followed by the length of the block printed in decimal with no leading zeroes, followed by a space, followed by "Free" if the block is not allocated or "Allocated" if the block is allocated, followed by a space, followed by either "Marked" if the garbage collector is active and the block has been marked by the collector or "Unmarked" if the garbage collector is inactive or the garbage collector is active and the block has not been marked, followed by a space. If the block is allocated then, on the same line, print the block's finalizer function address using eight hex digits, even if the leading digits are zeroes. If the block is allocated then also print, starting on a new line, the contents of the block using the following format: After all blocks have been printed, print a blank line.
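The line formats described in the list above can be sketched with printf-style format strings. The helper names are hypothetical, lowercase hex digits are an assumption, and the helpers append to a string buffer only so that the result is easy to check; the real function prints to stderr. The format for the contents of an allocated block is not covered here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* "Global Memory: start = ... end = ... length = ..." plus the blank line. */
static void append_region_header(char *buf, size_t cap, const char *name,
                                 uint32_t start, uint32_t end)
{
    size_t used = strlen(buf);
    assert(used < cap && end >= start);
    snprintf(buf + used, cap - used,
             "%s: start = %08x end = %08x length = %u\n\n",
             name, (unsigned)start, (unsigned)end,
             (unsigned)((end - start) / 4));     /* length in words */
}

/* One candidate word: address, a space, contents. */
static void append_word_line(char *buf, size_t cap,
                             uint32_t addr, uint32_t contents)
{
    size_t used = strlen(buf);
    assert(used < cap);
    snprintf(buf + used, cap - used, "%08x %08x\n",
             (unsigned)addr, (unsigned)contents);
}

/* One register entry: name, value, then "* " or two spaces. */
static void append_register(char *buf, size_t cap, const char *name,
                            uint32_t value, int is_candidate)
{
    size_t used = strlen(buf);
    assert(used < cap);
    snprintf(buf + used, cap - used, "%s %08x%s",
             name, (unsigned)value, is_candidate ? "* " : "  ");
}

/* One heap block line; the finalizer address appears only if allocated. */
static void append_block_line(char *buf, size_t cap, uint32_t len,
                              int allocated, int marked, uint32_t finalizer)
{
    size_t used = strlen(buf);
    assert(used < cap);
    used += (size_t)snprintf(buf + used, cap - used, "Block %u %s %s ",
                             (unsigned)len,
                             allocated ? "Allocated" : "Free",
                             marked ? "Marked" : "Unmarked");
    assert(used < cap);
    if (allocated)
        snprintf(buf + used, cap - used, "%08x\n", (unsigned)finalizer);
    else
        snprintf(buf + used, cap - used, "\n");
}
```

The same header helper serves for the stack by passing "Stack Memory" as the name.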

You may organize your source code into files in any way you see fit, but the file structure should be logical and aid the reader of the code. Supply a Makefile called "Makefile" that will allow the garbage collector to be built by connecting it to a main function in a file called testGC.c. The primary Makefile goal should be "testGC" and should build an executable called "testGC". Also provide a Makefile goal called "displaySource" which will cat out all your source files to stdout. The Makefile should pass the "-Wall" flag to gcc when compiling your C source code.

To avoid namespace pollution, prefix the names of all helper functions that need to be visible to the linker with "GC_". Try to minimize the number of such functions and be sure to use the "static" keyword to hide all other functions and global variables.

The memDump function should be visible to the linker. That is, do not make it a "static" function.

Your program will be graded primarily by testing it for correct functionality. Be sure the output of the debugging tool is formatted exactly as specified above! In addition, however, you may lose points if your program is not properly structured and documented. Decompose sub-problems appropriately into functions and do incremental testing. Leave your debugging output in your code, but disabled, when you do your final assignment submission.

Having a storage allocator without a garbage collector will be worth 50% of the points for the assignment. The debugging tool (memDump) is worth 20 points. The garbage collector itself is the remaining 30 points.

By the end of the lab on Friday October 1, you should have completed the memory allocator and the debugging tool (memDump). This will require that you do the memory allocator and attempt the debugging tool prior to coming to lab. In lab we will work on any issues that you are having trouble with.

By the end of the lab on Friday October 8, you should have completed the garbage collector. I will release some of my tests during the lab to help you shake out any remaining bugs.

Your programs will be graded using agate.cs.unh.edu, so be sure to test in that environment.

You should submit your C and assembly source code plus the Makefile for compiling your code. Be sure you submit any header files that you have created too. And be sure you disable any debugging output before you submit your assignment.

Your programs should be submitted for grading from agate.cs.unh.edu. To turn in this assignment, type:
~cs520/bin/submit prog3 Makefile list-of-source-files

Please submit only your C and assembly source files and the Makefile. Do not turn in any other files!

Also, please do not submit a testGC.c file. I will supply my own versions of this for grading purposes.

Submissions can be checked by typing:
~cs520/bin/scheck prog3

To receive full credit for the assignment, you must turn in your files prior to 8am on Monday October 11. Programming assignments may be handed in late at a penalty of 2 points for one day late, 5 points for two days late, 10 points for three days late, 20 points for four days late, and 40 points for five days late. No program may be turned in more than 5 days late.

Remember: as always you are expected to do your own work on this assignment.


Last modified on October 3, 2010.

Comments and questions should be directed to hatcher@unh.edu