CS611
Fall 2003
Programming Assignment 7
Due Sunday December 14


Write a set of C functions to implement a simple file manager.

A file is to appear to the user as a symbolically-named, variable-length, linearly-addressed sequence of bytes. A read or write operation moves a given number of bytes between a file and the user's address space. A file pointer is internally maintained and indicates the place in the file at which all data transfers begin.

Before a read or write operation may be performed, a logical connection must be established between the process and the file. This is known as "opening" the file; the breaking of the connection is called "closing" the file.

The user must be insulated from the details of the implementation of files. Files are stored in not necessarily contiguous sectors of a disk. Accessing a disk is orders of magnitude slower than accessing main memory. Therefore, it is necessary to transfer some minimum amount of data (such as one sector) between main memory and disk for each disk access. By storing this minimum amount of data in a buffer, many read and write operations will not require an access to disk.

All disks contain 128 sectors and all sectors are 256 bytes.

The file manager supports a single file system per disk and only one file system at a time can be active. The file manager should maintain a single directory for each file system, which will contain information about the files stored in the system. The file manager should maintain the following information for every file in a file system:

The symbolic name is a null-terminated string of ASCII characters. The length of the string (not counting the null terminator) is from 0 to 31 characters. Any ASCII character except the null character can be in the string.

The file size should be the size of the file in bytes. Since disks have 128 sectors of 256 bytes, no file can be larger than 32768 bytes (and in practice less, since sectors 0 and 1 never hold user-file data). Therefore the file size will fit in a 16-bit integer. On the disk this 16-bit integer should be stored in Big-Endian format.

The physical location of the file is the ordered list of sectors that contain the file. Since disks have 128 sectors, the sector numbers can be stored in an 8-bit integer.
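The Big-Endian storage of the 16-bit file size can be sketched as follows. The helper names put_be16 and get_be16 are made up for this illustration and are not part of the required interface.

```c
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value in Big-Endian order,
   most significant byte first. */
static void put_be16(uint8_t *dst, uint16_t value)
{
    dst[0] = (uint8_t)(value >> 8);
    dst[1] = (uint8_t)(value & 0xFF);
}

/* Hypothetical helper: read back a 16-bit Big-Endian value. */
static uint16_t get_be16(const uint8_t *src)
{
    return (uint16_t)(((uint16_t)src[0] << 8) | src[1]);
}
```

Shifting and masking, rather than copying an integer's bytes directly, keeps the on-disk format the same regardless of the byte order of the machine running the file manager.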

Sector 0 of the disk is reserved to hold control information for the file system. In particular, it should hold a vector of 128 bytes, one per sector. This vector is known as the sector map. If a byte contains 0, then the corresponding sector is free (not being used to store either user-file data or file-system control data). If a byte contains 1, then the corresponding sector is in use to store either user-file data or file-system control data.

The sector map should be stored in the first 128 bytes of sector 0. Each byte in the second 128 bytes should always contain the value 1.
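As an illustration of this layout, here is a minimal sketch of building the sector 0 image for a newly created file system. The name init_sector0 and the two constants are assumptions made for the example. It marks sectors 0 and 1 as in use, since they always hold control information and the start of the directory.

```c
#include <string.h>

#define SECTOR_SIZE 256   /* bytes per sector */
#define NUM_SECTORS 128   /* sectors per disk */

/* Hypothetical helper: fill in an in-memory image of sector 0
   for a file system that contains no files yet. */
static void init_sector0(unsigned char sector0[SECTOR_SIZE])
{
    /* First 128 bytes: the sector map, everything free to start. */
    memset(sector0, 0, NUM_SECTORS);
    sector0[0] = 1;   /* sector 0 holds the sector map itself */
    sector0[1] = 1;   /* sector 1 holds the start of the directory */

    /* Second 128 bytes: always the value 1. */
    memset(sector0 + NUM_SECTORS, 1, SECTOR_SIZE - NUM_SECTORS);
}
```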

The beginning of the directory should be stored in sector 1.

A directory is a sequence of directory entries. An entry is of arbitrary length and contains the following fields in the following order:

  1. The length in bytes of the directory entry itself. This is stored in a two-byte, Big Endian integer. The length includes the two-byte length field.

  2. The filename (including the null terminator).

  3. The file size stored as a two-byte, Big Endian integer.

  4. A sequence of sector numbers that represent the sectors that contain the data of the file. The sector numbers appear in the sequence in the order that the corresponding data is stored in the file. Each sector number is stored in a single byte.

The first directory entry should describe the directory itself. The filename field is unused but should contain an empty string. The file size field should contain the directory size. The first sector number should be 1. Therefore the directory entry for the directory itself should always be at least six bytes long.
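The entry layout described above can be sketched as a small serializer. The function encode_dir_entry, its parameters and MAX_NAME_LEN are hypothetical names chosen for this example. It writes the four fields in order and returns the entry length, or 0 if the name is too long or the output area is too small.

```c
#include <stdint.h>
#include <string.h>

#define MAX_NAME_LEN 31   /* longest name, not counting the terminator */

/* Hypothetical helper: serialize one directory entry into out.
   Returns the entry length in bytes, or 0 on failure. */
static size_t encode_dir_entry(uint8_t *out, size_t cap, const char *name,
                               uint16_t file_size,
                               const uint8_t *sectors, size_t nsectors)
{
    size_t name_bytes = strlen(name) + 1;          /* include terminator */
    size_t total = 2 + name_bytes + 2 + nsectors;  /* includes length field */
    uint8_t *p = out;

    if (name_bytes > MAX_NAME_LEN + 1 || total > 0xFFFF || total > cap)
        return 0;

    *p++ = (uint8_t)(total >> 8);        /* field 1: entry length, Big-Endian */
    *p++ = (uint8_t)(total & 0xFF);
    memcpy(p, name, name_bytes);         /* field 2: name and terminator */
    p += name_bytes;
    *p++ = (uint8_t)(file_size >> 8);    /* field 3: file size, Big-Endian */
    *p++ = (uint8_t)(file_size & 0xFF);
    if (nsectors > 0)
        memcpy(p, sectors, nsectors);    /* field 4: one byte per sector */
    return total;
}
```

Called with an empty name, a size of 6 and the single sector number 1, this produces the six bytes 00 06 00 00 06 01, which is what the directory's own entry might look like if the directory holds nothing but that entry.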

Directory entries for newly created files are always added at the end of the directory. When a file is deleted, the directory is compressed: the entries following the deleted entry are moved up toward the front of the directory and the directory length is decreased. (Note: this might free a sector, causing the sector map to be modified.) When a file adds a data sector, and the file's directory entry must be extended, the following directory entries are moved down one byte and the directory length is increased by 1. (Note: this might require a sector to be allocated and added to the directory's sector sequence, thus causing both the sector map and the directory's directory entry to be modified.)
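The compression step on deletion can be illustrated on a flat byte array. This is only a sketch of the byte movement: it assumes the whole directory is available as one contiguous array, which a real implementation may not do, since only one directory sector may be buffered at a time. The name remove_dir_entry is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: remove the entry starting at entry_off from a
   directory image of dir_len bytes. Returns the new directory length,
   or dir_len unchanged if the entry is malformed. */
static size_t remove_dir_entry(uint8_t *dir, size_t dir_len, size_t entry_off)
{
    size_t entry_len;

    if (entry_off + 2 > dir_len)
        return dir_len;

    /* The first two bytes of an entry hold its own length, Big-Endian. */
    entry_len = ((size_t)dir[entry_off] << 8) | dir[entry_off + 1];

    /* Smallest legal entry: 2 (length) + 1 (terminator) + 2 (size). */
    if (entry_len < 5 || entry_off + entry_len > dir_len)
        return dir_len;

    /* Move the following entries up toward the front. */
    memmove(dir + entry_off, dir + entry_off + entry_len,
            dir_len - entry_off - entry_len);
    return dir_len - entry_len;
}
```

The caller would still have to store the new length in the size field of the directory's own entry and release any directory sector that is no longer needed.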

Only data for a single file will be stored in an individual sector. That is, a single sector will never contain data from two files (or from the directory and a file). This allows the simple sector map approach described above, but has the disadvantage that there will often be wasted space in the last sector of a file (or the last sector of the directory).

When searching for free sectors (to use either for user files or for the directory), the file manager should always choose the smallest numbered free sector. That is, search the sector map starting with sector 0. (Well, sector 2 actually, since sectors 0 and 1 will always be in use.)
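A minimal sketch of this lowest-numbered-first search over an in-memory copy of the sector map follows. The name alloc_sector is an assumption made for the example.

```c
#define NUM_SECTORS 128   /* sectors per disk */

/* Hypothetical helper: claim the smallest numbered free sector.
   Returns the sector number, or -1 if every sector is in use. */
static int alloc_sector(unsigned char map[NUM_SECTORS])
{
    int s;

    for (s = 0; s < NUM_SECTORS; s++) {
        if (map[s] == 0) {
            map[s] = 1;   /* mark it in use */
            return s;
        }
    }
    return -1;
}
```

Starting the scan at 0 is harmless because sectors 0 and 1 are always marked in use. The in-memory copy of sector 0 is modified by a successful call and must eventually be written back to disk.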

When a new file is created, it is initially empty, which means its length is 0 and its sector sequence is empty. (That is, no data sector is initially allocated for a newly created file.)

When writing to a file, no new data sector is allocated until data is written "off the end" of the last data sector. The allocation of a new data sector requires that the file's directory entry be extended, which may also require a sector to be allocated for the directory. The sector for the directory should be allocated after the data sector is allocated.
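The "off the end" rule comes down to simple arithmetic, sketched here. Both function names are hypothetical, and the example assumes the file pointer and byte count have already been validated.

```c
#define SECTOR_SIZE 256   /* bytes per sector */

/* Hypothetical helper: number of data sectors needed to hold size bytes. */
static int sectors_for_size(int size)
{
    return (size + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Hypothetical helper: how many new data sectors a write of count bytes
   at the given file position requires, when the file already has
   allocated data sectors. */
static int new_sectors_for_write(int allocated, int position, int count)
{
    int needed = sectors_for_size(position + count);

    return needed > allocated ? needed - allocated : 0;
}
```

For example, a file with one data sector needs no new sector to write bytes 0 through 255, but writing a single byte at position 256 requires one more.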

If a sector needs to be allocated (either because the directory is being extended or a file is being extended) but all sectors are in use, then the primitive (either createFile or writeFile) that is trying to allocate the sector should return the failure code (-1). The exact state of the file system after such an error is implementation defined.

The file manager will also have to maintain in memory a table of open files. Allow a maximum of 10 simultaneously open files. This table will contain the location of the buffer for the file, what mode the file was opened for, the current file position (also known as the file pointer), and possibly other information. The file descriptor used by many of the file-system primitives is an index into this table.
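One possible shape for the open-file table is sketched below. The structure layout, the field names and the helpers alloc_fd and release_fd are all assumptions made for this example. In particular, the buffer is embedded in the table entry here, although the table could just as well hold a pointer to it.

```c
#define MAX_OPEN_FILES 10
#define SECTOR_SIZE    256

/* Hypothetical table entry; the fields beyond buffer, mode and
   position are examples of "other information". */
typedef struct {
    int in_use;                          /* nonzero if this slot is taken */
    int mode;                            /* mode the file was opened for */
    int position;                        /* current file pointer */
    int buffered_sector;                 /* sector in buffer, -1 if none */
    int dirty;                           /* buffer modified since loaded */
    unsigned char buffer[SECTOR_SIZE];   /* the one buffered data sector */
} OpenFile;

static OpenFile open_files[MAX_OPEN_FILES];

/* Hypothetical helper: claim the lowest free slot; the slot index is
   the file descriptor. Returns -1 if 10 files are already open. */
static int alloc_fd(int mode)
{
    int fd;

    for (fd = 0; fd < MAX_OPEN_FILES; fd++) {
        if (!open_files[fd].in_use) {
            open_files[fd].in_use = 1;
            open_files[fd].mode = mode;
            open_files[fd].position = 0;   /* start of the file */
            open_files[fd].buffered_sector = -1;
            open_files[fd].dirty = 0;
            return fd;
        }
    }
    return -1;
}

/* Hypothetical helper: free a slot. Returns fd, or -1 if fd is not
   a currently open descriptor. */
static int release_fd(int fd)
{
    if (fd < 0 || fd >= MAX_OPEN_FILES || !open_files[fd].in_use)
        return -1;
    open_files[fd].in_use = 0;
    return fd;
}
```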

Only one data sector for each open file, one sector of the directory and sector 0 may be buffered in memory. In addition you may keep the directory entry for an open file in memory. Nothing else from the disk should be buffered in memory.

You should try to minimize the number of disk operations. A disk sector that is buffered in memory should only be written back to disk if the sector has been modified since it was loaded into memory.
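The write-back rule is commonly implemented with a "dirty" flag, as in this sketch. The SectorBuffer type and the helper names are assumptions. The function fake_write_sector is only a stand-in that counts calls; a real implementation would call the provided disk device driver there instead.

```c
#include <string.h>

#define SECTOR_SIZE 256

/* Hypothetical in-memory buffer for one disk sector. */
typedef struct {
    int sector;                        /* sector held, -1 if none */
    int dirty;                         /* modified since it was loaded */
    unsigned char data[SECTOR_SIZE];
} SectorBuffer;

/* Stand-in for the real driver call: only counts the writes issued. */
static int writes_issued = 0;

static void fake_write_sector(int sector, const unsigned char *data)
{
    (void)sector;
    (void)data;
    writes_issued++;
}

/* Copy bytes into the buffer and remember that it now differs from disk. */
static void buffer_store(SectorBuffer *b, int offset,
                         const unsigned char *src, int count)
{
    memcpy(b->data + offset, src, (size_t)count);
    b->dirty = 1;
}

/* Write the buffer back only if it has been modified. */
static void buffer_flush(SectorBuffer *b)
{
    if (b->sector >= 0 && b->dirty) {
        fake_write_sector(b->sector, b->data);
        b->dirty = 0;
    }
}
```

With this arrangement, flushing a buffer that was only read from costs no disk operation, and flushing twice in a row writes the sector at most once.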

The following functions will provide the entry points to the file manager:

  1. int createFS(DISK disk): Creates a new file system on the given disk. Initially the file system contains no files. Returns 1 if successful and 0 if not.

  2. int bootFS(DISK disk): Opens an existing file system on the given disk. Returns 1 if successful and 0 if not. Only one file system at a time can be "booted". That is, it is an error if a second file system is booted without first shutting down the file system that was booted earlier.

  3. int shutdownFS(void): Closes an active file system. Any persistent data structures must be written to the disk so that the system can be booted at a later time. Returns 1 if successful and 0 if not. It is an error if the user attempts to shut down a file system with open files.

  4. int openFile(char* name, int mode): Opens the named file for the given mode and assigns a file descriptor (fd) to it. The mode must be either READ_MODE, WRITE_MODE or UPDATE_MODE. (UPDATE_MODE means that both reading and writing are allowed.) The file pointer for the file is set to the beginning of the file. If successful, it returns the fd; else it returns -1. To be successful, the file must exist and must not already be open.

  5. int createFile(char* name, int mode): Same as openFile except that, if the file exists, it is scratched (its existing contents are discarded), and, if the file does not already exist, it is created. It is an error if the file is already open.

  6. int closeFile(int fd): Closes the file with the given fd. If successful, it returns the fd; else it returns -1.

  7. int readFile(int fd, char* buf, int size): Reads up to size characters from the file specified by the fd into the given buffer. The file pointer is advanced by the amount of data read. If EOF is detected, the function returns the number of bytes read (perhaps zero) before encountering the EOF. For the read operation to be successful, the file must have been opened with either READ_MODE or UPDATE_MODE. If an error occurs, a -1 is returned.

  8. int writeFile(int fd, char* buf, int size): Writes size characters starting at buf to the file specified by the fd. The file pointer is advanced by the amount of data written. (Data may be written beyond the end of file: the file length is extended.) If an error occurs, -1 is returned. Otherwise, size is returned. For the write operation to be successful, the file must have been opened with either WRITE_MODE or UPDATE_MODE.

  9. int unlinkFile(char* name): Delete the file with the given name. Returns 1 if successful and 0 if not. It is an error to call unlinkFile on an open file.

  10. int lseekFile(int fd, int offset, int sense): Use the given offset to modify the file pointer for the given fd, under control of sense. If sense is equal to 0, the pointer is set to offset, which must not be negative. If sense is equal to 1, the offset is added to the current pointer. If successful, it returns the fd; else it returns -1. (An lseek beyond the end of the file is not allowed. An lseek off the front of the file is also not allowed.)
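The pointer arithmetic behind lseekFile can be sketched as a pure function. The name seek_position is hypothetical. The sketch assumes that positioning the pointer exactly at the end of the file is allowed (so that data can be appended) and that any sense value other than 0 or 1 is an error; neither point is stated explicitly above.

```c
/* Hypothetical helper: compute the new file pointer for a seek.
   Returns the new position, or -1 if the seek is not allowed. */
static int seek_position(int current, int file_size, int offset, int sense)
{
    long long target;

    if (sense == 0)
        target = offset;                        /* absolute */
    else if (sense == 1)
        target = (long long)current + offset;   /* relative to pointer */
    else
        return -1;                              /* assumed to be an error */

    /* Off the front, or beyond the end, of the file. */
    if (target < 0 || target > file_size)
        return -1;
    return (int)target;
}
```

lseekFile itself would then store the result in the open-file table and return the fd rather than the new position.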

The file manager will interface to a disk by means of a disk device driver function:

The operation must be either READ_SECTOR or WRITE_SECTOR and the given sector is either read into the buffer or the given sector is written from the given buffer. This function does not return anything as any error is fatal, and terminates the program with an error message. This function will be provided to you.

Each file system entry function will be worth 10 points.

The device driver function and some simple tests will be available in ~cs611/public/prog7. Additional hidden tests may be used as well.

Your implementation should be performed using C. Put all your C code in the fs.c file.

Your program will be graded primarily by testing it for correct functionality. However, you may lose points if your program is not properly structured or adequately documented.

Your assignment should be submitted for grading from a CIS Linux machine (e.g. turing.unh.edu). Submit a single file, fs.c. To turn in this assignment, type:
~cs611/bin/submit prog7 fs.c

Do not turn in any other files!

Submissions can be checked by typing:
~cs611/bin/scheck prog7

To receive full credit for the assignment, you must turn in your files prior to 8am on Monday December 15. Late submissions will be accepted at the penalty of 5 points per day up to 8am on Friday December 19. No submissions will be accepted after 8am on Friday December 19!

Your programs will be graded using a CIS Linux machine (e.g. turing.unh.edu), so be sure to test in that environment.

Remember: as always you are expected to do your own work on this assignment.


Last modified on December 3, 2003.

Comments and questions should be directed to hatcher@unh.edu