CS520
Spring 2018
Programming Assignment 6
Due Wednesday May 2


The goal of this assignment is to build a memory system simulator.

A memory system includes both a main memory and one or more memory caches. The memory consists of a sequence of 32-bit words, with each word having a 32-bit address. The memory words are addressed starting with zero.

The length of memory and the number of caches are specified when the memory system is initialized.

Each cache is an array of sets. Each set contains an array of lines. Each line contains a block plus control information. Each block is an array of words.

The caches for a memory system are all identical and are configured when the memory system is initialized. The configuration includes specifying the number of cache sets, the number of cache lines per set, and the number of words per cache block.
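
One possible way to lay out the set/line/block hierarchy in C is sketched below. The type names (Line, Set, Cache), the field names, and the helpers checkedCalloc, makeCache, and freeCache are illustrative assumptions, not part of the required interface. The sketch treats a failed allocation as fatal, as the assignment requires for initialization.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* One cache line: a block of words plus control information.
   All names here are hypothetical. */
typedef struct {
    bool valid;              /* does this line currently hold a block? */
    bool dirty;              /* has the block been modified since it was loaded? */
    unsigned int tag;        /* identifies which memory block is held */
    unsigned long timestamp; /* request count at the last access, for LRU */
    unsigned int *words;     /* the block itself: wordsPerBlock words */
} Line;

typedef struct {
    Line *lines;             /* linesPerSet lines */
} Set;

typedef struct {
    unsigned int numberOfSets;
    unsigned int linesPerSet;
    unsigned int wordsPerBlock;
    Set *sets;               /* numberOfSets sets */
} Cache;

/* calloc wrapper that treats failure as fatal.
   Assumes count and size are both nonzero. */
static void *checkedCalloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p == NULL) {
        fprintf(stderr, "memory allocation failed\n");
        exit(-1);
    }
    return p;
}

/* Build an empty cache: calloc zeroes every line, so valid and dirty
   start out false and every word starts out 0. */
static Cache makeCache(unsigned int numberOfSets, unsigned int linesPerSet,
                       unsigned int wordsPerBlock)
{
    Cache c;
    c.numberOfSets = numberOfSets;
    c.linesPerSet = linesPerSet;
    c.wordsPerBlock = wordsPerBlock;
    c.sets = checkedCalloc(numberOfSets, sizeof(Set));
    for (unsigned int s = 0; s < numberOfSets; s++) {
        c.sets[s].lines = checkedCalloc(linesPerSet, sizeof(Line));
        for (unsigned int l = 0; l < linesPerSet; l++) {
            c.sets[s].lines[l].words =
                checkedCalloc(wordsPerBlock, sizeof(unsigned int));
        }
    }
    return c;
}

/* Release everything makeCache allocated. */
static void freeCache(Cache *c)
{
    for (unsigned int s = 0; s < c->numberOfSets; s++) {
        for (unsigned int l = 0; l < c->linesPerSet; l++) {
            free(c->sets[s].lines[l].words);
        }
        free(c->sets[s].lines);
    }
    free(c->sets);
    c->sets = NULL;
}
```

A direct-mapped cache is simply the case where linesPerSet is 1, so the same layout can serve both the direct-mapped and the set-associative parts of the assignment.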

Each cache is numbered, beginning with zero. A cache number must be passed to the read/write functions to identify which cache is to be used to access memory.

All caches are write-back caches and utilize LRU replacement.

The configuration of the memory system has the following constraints:

The public interface to the simulator is the following nine functions:

void *initializeMemorySystem(unsigned int memoryLength, unsigned int numberOfCaches, unsigned int wordsPerBlock, unsigned int numberOfSets, unsigned int linesPerSet);
int readInt(void *handle, unsigned int cache, unsigned int address);
float readFloat(void *handle, unsigned int cache, unsigned int address);
void writeInt(void *handle, unsigned int cache, unsigned int address, int value);
void writeFloat(void *handle, unsigned int cache, unsigned int address, float value);
void lock(void *handle, unsigned int cache);
void unlock(void *handle, unsigned int cache);
void printStatistics(void *h);
void deleteMemorySystem(void *h);

initializeMemorySystem initializes a memory system, configured according to its five parameters. All caches are initialized to be empty. It returns a "handle" for reading or writing to memory utilizing the caches. If the memory system initialization fails, then NULL is returned. (However, if malloc fails during the initialization, treat this as a fatal error: print an error message to stderr and call exit with -1.) Multiple memory systems can be in operation at the same time.

readInt and readFloat read the word at the specified address from the memory system denoted by the specified handle, utilizing the specified cache. The cache will first be checked to see if the word is present in the cache. If so, the word is read from the cache. If the word is not in the cache, then the block containing the word will be first loaded into the cache, and then the desired word will be accessed from the cache.
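
To check whether a word is present, the address first has to be mapped to a set and a position within a block. The sketch below shows one conventional mapping. It assumes that addresses are word addresses (address N names the Nth word of memory) and that consecutive blocks are assigned to consecutive sets. The AddressParts type and the splitAddress function are hypothetical names.

```c
/* The three pieces of an address that a cache lookup needs.
   Hypothetical helper type, not part of the required interface. */
typedef struct {
    unsigned int tag;      /* distinguishes blocks that map to the same set */
    unsigned int setIndex; /* which set the block belongs in */
    unsigned int offset;   /* which word within the block */
} AddressParts;

/* Split a word address, assuming block b maps to set (b mod numberOfSets). */
static AddressParts splitAddress(unsigned int address,
                                 unsigned int wordsPerBlock,
                                 unsigned int numberOfSets)
{
    unsigned int blockNumber = address / wordsPerBlock;
    AddressParts parts;
    parts.offset = address % wordsPerBlock;
    parts.setIndex = blockNumber % numberOfSets;
    parts.tag = blockNumber / numberOfSets;
    return parts;
}
```

For example, with 8 words per block and 4 sets, address 77 lies in block 9, so it has offset 5, set index 1, and tag 2. A read hits if some valid line in that set holds the same tag.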

writeInt and writeFloat write the specified value into the word at the specified address of the memory system denoted by the specified handle, utilizing the specified cache. The cache will first be checked to see if the word to be updated is present in the cache. If so, the value is written to the word in the cache. If the word is not in the cache, then the block containing the word will be first loaded into the cache, and then the value will be written to the desired word.
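
Since memory holds 32-bit words, a float has to be stored as its bit pattern rather than converted numerically. One portable way to do this is to copy the bytes with memcpy, as sketched below. The helper names floatToWord and wordToFloat are hypothetical, and the sketch assumes that float and unsigned int are both 32 bits wide, as they are on typical Linux systems.

```c
#include <string.h>

/* Reinterpret the bits of a float as a memory word.
   Assumes sizeof(float) == sizeof(unsigned int). */
static unsigned int floatToWord(float value)
{
    unsigned int word;
    memcpy(&word, &value, sizeof word);
    return word;
}

/* Reinterpret the bits of a memory word as a float. */
static float wordToFloat(unsigned int word)
{
    float value;
    memcpy(&value, &word, sizeof value);
    return value;
}
```

A plain cast such as (unsigned int) value would convert 3.5 to the integer 3 and lose the fraction, which is why the bytes are copied instead.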

If the set where a block is to be loaded is full, then the least-recently-used block in the set is chosen to be replaced. This is done by recording a timestamp in a cache line when a block is read into the cache line, and also each time the cache line is subsequently accessed. This timestamp is simply a running count of the number of read/write requests that have been made to a particular cache. The cache line with the lowest timestamp is chosen for replacement.
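
The replacement choice described above might be sketched as follows. The LineInfo type and the chooseVictim function are hypothetical, and only the fields needed for the decision are shown. The sketch prefers an empty line when the set is not yet full and otherwise picks the line with the lowest timestamp.

```c
#include <stdbool.h>

/* Only the control information needed to pick a line.
   Hypothetical trimmed-down type. */
typedef struct {
    bool valid;              /* does the line hold a block? */
    unsigned long timestamp; /* request count at the last access */
} LineInfo;

/* Return the index of the line to load a new block into.
   Assumes linesPerSet is at least 1. */
static unsigned int chooseVictim(const LineInfo *lines, unsigned int linesPerSet)
{
    unsigned int victim = 0;
    for (unsigned int i = 0; i < linesPerSet; i++) {
        if (!lines[i].valid) {
            return i; /* the set is not full: use the empty line */
        }
        if (lines[i].timestamp < lines[victim].timestamp) {
            victim = i; /* least recently used so far */
        }
    }
    return victim;
}
```

Because every access to a cache gets a distinct request count, two valid lines in the same cache should never share a timestamp, so no tie-breaking rule is needed.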

The write-back strategy must track whether a block has been modified since it was loaded into the cache. When the block is replaced, if the block has been modified, then the modified words in the block should be written back to the memory.
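
A minimal sketch of the write-back step follows. It assumes a per-block dirty flag plus a per-word modified flag, so that only the modified words are copied to memory. The EvictLine type, its fields, and the writeBack function are hypothetical. Storing the address of the block's first word is also an assumption made for brevity; it could instead be recomputed from the tag and set index.

```c
#include <stdbool.h>

/* Hypothetical trimmed-down cache line for illustrating write-back. */
typedef struct {
    bool valid;                /* does the line hold a block? */
    bool dirty;                /* has any word been modified since the load? */
    unsigned int firstAddress; /* memory address of word 0 of the block */
    unsigned int *words;       /* the cached block */
    bool *wordModified;        /* one flag per word in the block */
} EvictLine;

/* Copy the modified words of a dirty block to memory and mark the block
   clean. Returns the number of words written. */
static unsigned int writeBack(EvictLine *line, unsigned int wordsPerBlock,
                              unsigned int *memory)
{
    unsigned int written = 0;
    if (line->valid && line->dirty) {
        for (unsigned int i = 0; i < wordsPerBlock; i++) {
            if (line->wordModified[i]) {
                memory[line->firstAddress + i] = line->words[i];
                line->wordModified[i] = false;
                written++;
            }
        }
        line->dirty = false;
    }
    return written;
}
```

This sketch leaves out the mutex that must protect writes to main memory when several threads are active.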

lock and unlock allow a CPU to lock or unlock a global lock shared by all CPUs. A call to lock will block the execution of the calling CPU until the lock becomes available. A call to unlock will give ownership of the lock to a waiting CPU, if there is one. lock and unlock have implications for memory consistency, which will be described below.

deleteMemorySystem should free all memory that was allocated to implement a memory system. Be sure to use valgrind to ensure that all memory was freed.

If a bad address or a bad cache number is passed to a read or a write function, or to the lock or unlock functions, print an error message to stderr and call exit with -1. You do not need to do any validation on the handle, however.
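
The validation might look like the sketch below. It assumes that memoryLength is a count of words, so the valid addresses are 0 through memoryLength - 1, and that valid cache numbers are 0 through numberOfCaches - 1. The function names and the wording of the error message are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* True if both the cache number and the address are in range.
   Assumes memoryLength is measured in words. */
static bool requestIsValid(unsigned int memoryLength, unsigned int numberOfCaches,
                           unsigned int cache, unsigned int address)
{
    return cache < numberOfCaches && address < memoryLength;
}

/* Terminate the program on a bad cache number or address. */
static void checkRequest(unsigned int memoryLength, unsigned int numberOfCaches,
                         unsigned int cache, unsigned int address)
{
    if (!requestIsValid(memoryLength, numberOfCaches, cache, address)) {
        fprintf(stderr, "bad cache number %u or address %u\n", cache, address);
        exit(-1);
    }
}
```

lock and unlock take no address, so only the cache number needs to be checked there.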

The printStatistics function prints to stdout, for each cache in a memory system, the total number of reads issued via the cache, the number of reads that hit in the cache, the total number of writes issued via the cache, and the number of writes that hit in that cache.

Here is an example of how the printStatistics output should look:

Cache 0:
  total number of reads: 256
  reads that hit in cache: 192
  total number of writes: 256
  writes that hit in cache: 192
Cache 1:
  total number of reads: 256
  reads that hit in cache: 0
  total number of writes: 0
  writes that hit in cache: 0
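
One way to produce this layout is sketched below. The CacheStats type and the formatCacheStats function are hypothetical. The sketch formats the report for one cache into a buffer so that the text can be checked before it is printed.

```c
#include <stdio.h>

/* Per-cache counters. Hypothetical type. */
typedef struct {
    unsigned int reads;     /* total reads issued via the cache */
    unsigned int readHits;  /* reads that hit in the cache */
    unsigned int writes;    /* total writes issued via the cache */
    unsigned int writeHits; /* writes that hit in the cache */
} CacheStats;

/* Format the report for one cache into buffer, using the layout shown in
   the example output. Returns what snprintf returns. */
static int formatCacheStats(char *buffer, size_t size, unsigned int cacheNumber,
                            const CacheStats *stats)
{
    return snprintf(buffer, size,
                    "Cache %u:\n"
                    "  total number of reads: %u\n"
                    "  reads that hit in cache: %u\n"
                    "  total number of writes: %u\n"
                    "  writes that hit in cache: %u\n",
                    cacheNumber, stats->reads, stats->readHits,
                    stats->writes, stats->writeHits);
}
```

printStatistics could then loop over the caches, call this helper with a buffer of a few hundred bytes, and pass each result to fputs with stdout.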

Multiple caches are supported in order to simulate a memory system interacting with multiple CPUs, each with its own cache. In the simulation, the multiple CPUs will be represented by multiple POSIX threads making concurrent requests on the memory system. Use a POSIX mutex to protect writes to the main memory. When a word in a cache needs to be written to main memory, lock this mutex, write the word to main memory and unlock the mutex. This will serialize writes to main memory when there are multiple CPUs.
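
The lock, write, unlock sequence for main memory can be sketched as follows. The MainMemory type and the writeWordToMemory function are hypothetical names. The mutex is assumed to have been initialized with pthread_mutex_init when the memory system was created.

```c
#include <pthread.h>

/* Main memory plus the mutex that serializes writes to it.
   Hypothetical type. */
typedef struct {
    unsigned int *memory;        /* memoryLength words */
    unsigned int memoryLength;
    pthread_mutex_t memoryMutex; /* must be initialized before use */
} MainMemory;

/* Write one word to main memory while holding the memory mutex. */
static void writeWordToMemory(MainMemory *m, unsigned int address,
                              unsigned int value)
{
    pthread_mutex_lock(&m->memoryMutex);
    m->memory[address] = value;
    pthread_mutex_unlock(&m->memoryMutex);
}
```

The mutex should be destroyed with pthread_mutex_destroy when the memory system is deleted.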

Use a second POSIX mutex to implement lock and unlock. lock should lock this mutex and empty the specified cache, first writing any modified words to main memory. unlock should write any modified words in the specified cache to main memory, reset the cache so that all blocks are now considered to be unmodified, and then unlock the mutex. This provides a "relaxed" memory consistency model for the memory system.
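
The ordering of steps in lock and unlock might be sketched as below. All type and function names (FlushLine, FlushSystem, flushLines, lockSketch, unlockSketch) are hypothetical, and the cache is flattened to a simple array of lines. For brevity the sketch writes every word of a dirty block to memory; tracking modifications per word would narrow this to the modified words only.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical trimmed-down cache line. */
typedef struct {
    bool valid;                /* does the line hold a block? */
    bool dirty;                /* modified since it was loaded? */
    unsigned int firstAddress; /* memory address of word 0 of the block */
    unsigned int *words;       /* the cached block */
} FlushLine;

/* Hypothetical memory system state shared by all caches. */
typedef struct {
    unsigned int *memory;
    unsigned int wordsPerBlock;
    pthread_mutex_t memoryMutex; /* serializes writes to main memory */
    pthread_mutex_t globalLock;  /* the lock shared by all CPUs */
} FlushSystem;

/* Write each dirty block of one cache to memory and mark it unmodified.
   If invalidate is true, also empty the cache. */
static void flushLines(FlushSystem *sys, FlushLine *lines,
                       unsigned int lineCount, bool invalidate)
{
    for (unsigned int i = 0; i < lineCount; i++) {
        if (lines[i].valid && lines[i].dirty) {
            for (unsigned int w = 0; w < sys->wordsPerBlock; w++) {
                pthread_mutex_lock(&sys->memoryMutex);
                sys->memory[lines[i].firstAddress + w] = lines[i].words[w];
                pthread_mutex_unlock(&sys->memoryMutex);
            }
            lines[i].dirty = false;
        }
        if (invalidate) {
            lines[i].valid = false;
        }
    }
}

/* lock: acquire the global lock, then write back and empty the cache. */
static void lockSketch(FlushSystem *sys, FlushLine *lines, unsigned int lineCount)
{
    pthread_mutex_lock(&sys->globalLock);
    flushLines(sys, lines, lineCount, true);
}

/* unlock: write back modified blocks, keep them cached, release the lock. */
static void unlockSketch(FlushSystem *sys, FlushLine *lines, unsigned int lineCount)
{
    flushLines(sys, lines, lineCount, false);
    pthread_mutex_unlock(&sys->globalLock);
}
```

Emptying the cache in lock means later reads by that CPU fetch from main memory, so they see values that other CPUs wrote back before releasing the lock.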

Some simple main programs that you can use to do your initial testing are available on agate in ~cs520/public/prog6.

Your program will be graded primarily by testing it for correct functionality. In addition, however, you may lose points if your program is not properly structured and documented. Decompose sub-problems appropriately into functions and do incremental testing. Leave your debugging output in your code, but disabled, when you do your final assignment submission.

Supporting direct-mapped caches will be worth 60 points. Supporting set-associative caches will be worth 20 points. Supporting multiple caches utilized by separate threads (one thread per cache) will be worth 20 points.

Your programs will be graded using agate.cs.unh.edu so be sure to test in that environment.

Be sure you disable any debugging output before you submit your assignment.

You should submit all the source code for your assignment in one file called memSystem.c.

Your programs should be submitted for grading from agate.cs.unh.edu. To turn in this assignment, type:
~cs520/bin/submit prog6 memSystem.c

Submissions can be checked by typing:
~cs520/bin/scheck prog6

This assignment is due Wednesday May 2. The standard policy on late submissions will be in effect. See the course overview webpage.

Remember: as always you are expected to do your own work on this assignment. Copying code from another student or from sites on the internet is explicitly forbidden!


Last modified on February 22, 2018.

Comments and questions should be directed to pjh@cs.unh.edu