CS611
Programming Assignment 6
Fall 2004
Due Wednesday December 1


Write a set of C functions to implement a simple file manager.

A file is to appear to the user as a symbolically-named, variable-length, linearly-addressed sequence of bytes. A read or write operation moves a given number of bytes between a file and the user's address space. A file pointer is internally maintained and indicates the place in the file at which all data transfers begin.

Before a read or write operation may be performed, a logical connection must be established between the process and the file. This is known as "opening" the file; the breaking of the connection is called "closing" the file.

The user must be insulated from the details of the implementation of files. Files are stored in not-necessarily contiguous sectors of a disk. Accessing a disk is orders of magnitude slower than accessing main memory. Therefore, it is necessary to transfer some minimum amount of data (such as one sector) between main memory and disk for each disk access. By storing this minimum amount of data in a buffer, many read and write operations will not require an access to disk.

All disks contain 128 sectors and all sectors are 256 bytes.

The file manager supports a single file system per disk and only one file system at a time can be active. The file manager should maintain a single directory for each file system, which will contain information about the files stored in the system. The file manager should maintain the following information for every file in a file system:

The symbolic name is 1-8 characters, followed by a period (.), followed by 1-3 characters (the file type, or "extension"). The valid characters in file names are A through Z, a through z, and 0 through 9.
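The naming rule above can be checked with a short routine. The following is only a sketch: the helper names isNameChar and isValidFileName are hypothetical and are not part of the required interface.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if c is one of A-Z, a-z, 0-9, else 0. */
static int isNameChar(char c)
{
    return (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/* Hypothetical helper: 1 if name is 1-8 valid characters, a period,
   and 1-3 valid characters, with nothing after them; 0 otherwise. */
int isValidFileName(const char *name)
{
    size_t baseLen = 0;
    size_t extLen = 0;
    const char *ext;

    if (name == NULL) return 0;

    /* Count the characters in front of the period. */
    while (isNameChar(name[baseLen])) baseLen++;
    if (baseLen < 1 || baseLen > 8 || name[baseLen] != '.') return 0;

    /* Count the characters of the extension. */
    ext = name + baseLen + 1;
    while (isNameChar(ext[extLen])) extLen++;
    if (extLen < 1 || extLen > 3) return 0;

    /* The name must end right after the extension. */
    return ext[extLen] == '\0';
}
```

Note that the longest valid name is 12 characters, so with a terminating null byte it occupies 13 bytes.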

The file size should be the size of the file in bytes. Since disks have 128 sectors of 256 bytes, no file can be larger than 32768 bytes (in fact at most 127 x 256 = 32512 bytes, because sector 0 is reserved). Therefore the file size will fit in a 16-bit integer. On the disk this 16-bit integer should be stored in little-endian format, that is, with the low-order byte first.
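As an illustration of the little-endian storage rule, here is a minimal sketch that stores and retrieves a size one byte at a time, so it does not depend on the byte order of the machine it runs on. The names putSize and getSize are hypothetical.

```c
/* Hypothetical helper: store a 16-bit size at dest, low-order byte first. */
void putSize(unsigned char *dest, unsigned int size)
{
    dest[0] = (unsigned char)(size & 0xFF);         /* low-order byte  */
    dest[1] = (unsigned char)((size >> 8) & 0xFF);  /* high-order byte */
}

/* Hypothetical helper: read back a size stored by putSize. */
unsigned int getSize(const unsigned char *src)
{
    return (unsigned int)src[0] | ((unsigned int)src[1] << 8);
}
```

For example, a size of 300 (0x012C) would be stored as the byte 0x2C followed by the byte 0x01.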

The physical location of the file is the ordered list of sectors that contain the file. Since disks have 128 sectors, the sector numbers can be stored in an 8-bit integer.

Sector 0 of the disk is reserved to hold control information for the file system. In particular, it should hold a vector of 128 bytes, one per sector. This vector is known as the sector map. If a byte contains -1, then the corresponding sector is free (not being used to store either user-file data or file-system control data). If a byte contains 0, then the corresponding sector is the last sector in a user file (or the last sector in a file-system entity such as the directory). If a byte contains K (not -1 or 0), then the next sector in the user file (or file-system entity) is sector K.
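To show how the sector map links the sectors of a file, here is a sketch that follows a chain through an in-memory copy of the map. It assumes the map is held as unsigned bytes, so that -1 appears as 0xFF; the function name nthSector and the constant names are hypothetical.

```c
#define NUM_SECTORS 128
#define FREE_SECTOR 0xFF  /* assumption: -1 stored in an unsigned byte */
#define LAST_SECTOR 0

/* Hypothetical helper: starting from sector 'first', follow n links of
   the chain and return the sector reached (n == 0 gives 'first').
   Returns -1 if 'first' is out of range or free, if the chain ends
   before n links have been followed, or if a link is not a valid
   sector number. */
int nthSector(const unsigned char *map, int first, int n)
{
    int sector = first;
    int i;

    if (first < 0 || first >= NUM_SECTORS) return -1;
    if (n < 0 || n >= NUM_SECTORS) return -1;
    if (map[first] == FREE_SECTOR) return -1;

    for (i = 0; i < n; i++) {
        unsigned char next = map[sector];
        /* 0 ends the chain; 128..255 (including 0xFF) is not a link. */
        if (next == LAST_SECTOR || next >= NUM_SECTORS) return -1;
        sector = next;
    }
    return sector;
}
```

With map[3] = 5, map[5] = 9 and map[9] = 0, a file whose first sector is 3 occupies sectors 3, 5 and 9, in that order.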

The beginning of the directory should also be stored in sector 0. A directory is a sequence of directory entries. Each entry is 16 bytes: 13 bytes for the file name, 2 bytes for the file size, and 1 byte for the sector number of the first sector. (The file name should be stored left-justified and null-filled in the directory entry.) The first directory entry should describe the directory itself. In this case the file name should simply be a period (.), the first-sector field should contain a zero, and the size field should contain the directory size (which is always at least 16).

The sector map should be stored first in sector 0, followed by the directory.
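The layout of sector 0 can be sketched as follows. With a 128-byte sector map and 16-byte entries, sector 0 has room for (256 - 128) / 16 = 8 directory entries. The names packEntry and initSector0 are hypothetical, and zero-filling the unused part of sector 0 is an assumption made here so that unused slots look like free entries; writing the sector to disk is not shown.

```c
#include <string.h>

#define SECTOR_SIZE 256
#define MAP_SIZE    128
#define ENTRY_SIZE  16
#define NAME_FIELD  13
#define ENTRIES_IN_SECTOR0 ((SECTOR_SIZE - MAP_SIZE) / ENTRY_SIZE)

/* Hypothetical helper: build one 16-byte directory entry at 'entry'.
   Bytes 0-12 hold the name (left-justified, null-filled), bytes 13-14
   the size (low-order byte first), byte 15 the first sector. */
void packEntry(unsigned char *entry, const char *name,
               unsigned int size, unsigned char firstSector)
{
    size_t len = strlen(name);

    if (len > NAME_FIELD - 1) len = NAME_FIELD - 1;
    memset(entry, 0, ENTRY_SIZE);
    memcpy(entry, name, len);
    entry[13] = (unsigned char)(size & 0xFF);
    entry[14] = (unsigned char)((size >> 8) & 0xFF);
    entry[15] = firstSector;
}

/* Hypothetical helper: fill a 256-byte buffer with the contents of
   sector 0 for an empty file system. */
void initSector0(unsigned char *sector0)
{
    memset(sector0, 0xFF, MAP_SIZE);  /* mark every sector free (-1)   */
    sector0[0] = 0;                   /* sector 0: last directory sector */
    memset(sector0 + MAP_SIZE, 0, SECTOR_SIZE - MAP_SIZE);
    packEntry(sector0 + MAP_SIZE, ".", ENTRY_SIZE, 0);
}
```

In this sketch the directory of an empty file system has size 16, since it contains only the entry that describes the directory itself.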

Only data for a single file will be stored in an individual sector. That is, a single sector will never contain data from two files (or from the directory and a file). This allows the simple sector map approach described above, but has the disadvantage that there will often be wasted space in the last sector of a file.

When files are deleted, the corresponding directory entries can be freed. A directory entry is marked as being free by placing a null byte in the first byte of the file-name field. When searching for a directory entry to use for a newly created file, the file manager should start at the front of the directory looking for a free entry. If the end of the directory is reached, then a new entry should be created by extending the directory. Note that if the directory is extended beyond the end of sector 0, then the sector map should be examined to find a free sector, and that sector's number should be placed in slot 0 of the sector map. That is, a directory grows exactly like a user file grows. However, once a directory grows into a new sector, it will never shrink back out of that sector. So, if a directory grows into another sector, but later all directory entries in that sector are marked as free, then the file manager should maintain the directory size and not try to reclaim the sector.
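The search for a free directory entry might look like the sketch below, which scans a run of entries that is already in memory. The name findFreeEntry is hypothetical. Directory sectors other than sector 0 would have to be read and scanned in turn; that part is not shown.

```c
#define ENTRY_SIZE 16

/* Hypothetical helper: scan 'count' consecutive 16-byte directory
   entries starting at 'entries' and return the index of the first
   free one (null byte in the first byte of the name field).
   Returns -1 if none of them is free. */
int findFreeEntry(const unsigned char *entries, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        if (entries[i * ENTRY_SIZE] == '\0') return i;
    }
    return -1;
}
```

A result of -1 for every directory sector means the end of the directory has been reached and the directory must be extended by one entry.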

When searching for free sectors (to use either for user files or for the directory), the file manager should always choose the smallest numbered free sector. That is, search the sector map starting with sector 0. (Well, sector 1 actually, since sector 0 will always be in use.)
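The smallest-numbered-free-sector rule can be sketched as a linear scan of the in-memory sector map. As before, the map is assumed to be held as unsigned bytes with -1 appearing as 0xFF, and the names findFreeSector and appendSector are hypothetical.

```c
#define NUM_SECTORS 128
#define FREE_SECTOR 0xFF  /* assumption: -1 stored in an unsigned byte */
#define LAST_SECTOR 0

/* Hypothetical helper: return the smallest-numbered free sector,
   or -1 if the disk is full. Sector 0 is always in use, so the
   scan starts at sector 1. */
int findFreeSector(const unsigned char *map)
{
    int s;

    for (s = 1; s < NUM_SECTORS; s++) {
        if (map[s] == FREE_SECTOR) return s;
    }
    return -1;
}

/* Hypothetical helper: allocate a sector and link it after sector
   'last' (pass -1 when the file has no sectors yet). Returns the new
   sector number, or -1 if no sector is free. */
int appendSector(unsigned char *map, int last)
{
    int s = findFreeSector(map);

    if (s < 0) return -1;
    map[s] = LAST_SECTOR;                       /* new end of chain */
    if (last >= 0) map[last] = (unsigned char)s; /* link it in       */
    return s;
}
```

Since this modifies the sector map, the caller would then have to write sector 0 back to the disk. Extending the directory out of sector 0 is the case appendSector(map, 0).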

The file manager will also have to maintain in memory a table of open files. Allow a maximum of 20 simultaneously open files. This table will contain the location of the buffer for the file, what mode the file was opened for, the current file position (also known as the file pointer), and possibly other information. The file descriptor used by many of the file-system primitives is an index into this table.
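One possible shape for the open-file table is sketched below. Every field and function name here is hypothetical, and the choice of fields is only a suggestion; the mode value is whatever constant the caller was given.

```c
#define MAX_OPEN_FILES 20
#define SECTOR_SIZE    256
#define ENTRY_SIZE     16

/* Hypothetical layout of one slot in the table of open files. */
typedef struct {
    int inUse;             /* 1 if this slot describes an open file     */
    int mode;              /* mode the file was opened for              */
    int position;          /* current file position (file pointer)      */
    int bufferedSector;    /* sector held in buffer, -1 if buffer empty */
    int bufferDirty;       /* 1 if buffer differs from the disk copy    */
    unsigned char buffer[SECTOR_SIZE];   /* one data sector             */
    unsigned char dirEntry[ENTRY_SIZE];  /* filled in by the caller     */
} OpenFile;

static OpenFile openFiles[MAX_OPEN_FILES];

/* Hypothetical helper: claim the lowest-numbered free slot and return
   its index (the fd), or -1 if 20 files are already open. */
int allocFd(int mode)
{
    int fd;

    for (fd = 0; fd < MAX_OPEN_FILES; fd++) {
        if (!openFiles[fd].inUse) {
            openFiles[fd].inUse = 1;
            openFiles[fd].mode = mode;
            openFiles[fd].position = 0;         /* start of file       */
            openFiles[fd].bufferedSector = -1;  /* buffer starts empty */
            openFiles[fd].bufferDirty = 0;
            return fd;
        }
    }
    return -1;
}

/* Hypothetical helper: 1 if fd names an open file, else 0. */
int isOpenFd(int fd)
{
    return fd >= 0 && fd < MAX_OPEN_FILES && openFiles[fd].inUse;
}

/* Hypothetical helper: give the slot back when the file is closed. */
void releaseFd(int fd)
{
    if (isOpenFd(fd)) openFiles[fd].inUse = 0;
}
```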

One data sector and the directory entry for each open file may be buffered in memory. Sector 0 may also be buffered in main memory. Nothing else from the disk should be buffered in memory. You should update sector 0 on disk whenever the in-memory copy of the sector map is modified. Also when a file is closed, if its directory entry has been changed, the directory entry should be updated on the disk.

The data sector buffer should only be filled or re-filled by a read or write operation. When a file is opened, the buffer is left "empty". Only when a subsequent read or write is performed will a sector be read into the buffer. Likewise, if a read or write operation ends on the last byte in the buffer, the buffer won't be re-filled until the next read or write operation is performed. Note that an lseek should never cause a buffer to be filled or re-filled.

Do not allocate sectors for a file until you are forced to when a write is performed which goes beyond the end of file. So, when a new file is created, no data sector is allocated until a write is performed to the file. However, allocate a sector before you write any data into its buffer. That is, every buffer should be backed up by an allocated sector.
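The allocate-only-when-forced rule comes down to a little arithmetic on the file size and the file position. The sketch below assumes that a file of a given size owns exactly as many sectors as are needed to hold it; the function names are hypothetical.

```c
#define SECTOR_SIZE 256

/* Hypothetical helper: number of sectors a file of this size owns
   (0 for an empty file, rounding up otherwise). */
int sectorsFor(int fileSize)
{
    return (fileSize + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Hypothetical helper: 1 if writing a byte at 'position' falls in a
   sector the file does not own yet, so that a sector must be
   allocated first; 0 otherwise. */
int writeNeedsNewSector(int position, int fileSize)
{
    return position / SECTOR_SIZE >= sectorsFor(fileSize);
}
```

For example, a newly created file (size 0) needs a sector for its first write, and a file of exactly 256 bytes needs a second sector only when byte 256 is written.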

Your implementation of the file system will be partially evaluated on the number of disk operations performed. You should minimize the number of disk operations, while respecting the rules given in the preceding paragraphs.

The following functions will provide the entry points to the file manager:

  1. int createFS(DISK disk): Creates a new file system on the given disk. Initially the file system contains no files. Returns 1 if successful and 0 if not.

  2. int bootFS(DISK disk): Opens an existing file system on the given disk. Returns 1 if successful and 0 if not. Only one file system at a time can be "booted". That is, it is an error if a second file system is booted without first shutting down the file system that was booted earlier.

  3. int shutdownFS(void): Closes an active file system. Any persistent data structures must be written to the disk so that the system can be booted at a later time. Returns 1 if successful and 0 if not. It is an error if the user attempts to shut down a file system with open files.

  4. int openFile(char* name, int mode): Opens the named file for the given mode and assigns a file descriptor (fd) to it. The mode must be either READ_MODE, WRITE_MODE or UPDATE_MODE. (UPDATE_MODE means that both reading and writing are allowed.) The file pointer for the file is set to the beginning of the file. If successful, it returns the fd; else it returns -1. To be successful, the file must exist and must not already be open.

  5. int createFile(char* name, int mode): Same as openFile, except that if the file exists, it is scratched (its existing contents are discarded), and if the file doesn't already exist, it is created. It is an error if the file is already open.

  6. int closeFile(int fd): Closes the file with the given fd. If successful, it returns the fd; else it returns -1.

  7. int readFile(int fd, char* buf, int size): Reads up to size characters from the file specified by the fd into the given buffer. The file pointer is advanced by the amount of data read. If an error occurs, a -1 is returned. Otherwise, the actual number of bytes read is returned. Note that the function may return a number less than size if EOF is encountered. If EOF is encountered before any data is read, then the function returns zero. For the read operation to be successful, the file must have been opened with either READ_MODE or UPDATE_MODE.

  8. int writeFile(int fd, char* buf, int size): Writes size characters starting at buf to the file specified by the fd. The file pointer is advanced by the amount of data written. (Data may be written beyond the end of file: the file length is extended.) If an error occurs, -1 is returned. Otherwise, the actual number of bytes written is returned. A number less than size might be returned if a data sector needs to be allocated and there are no free sectors. For the write operation to be successful, the file must have been opened with either WRITE_MODE or UPDATE_MODE.

  9. int unlinkFile(char* name): Delete the file with the given name. Returns 1 if successful and 0 if not. It is an error to call unlinkFile on an open file.

  10. int lseekFile(int fd, int offset, int sense): Use the given offset to modify the file pointer for the given fd, under control of sense. If sense is equal to 0, the pointer is set to offset, which should be positive. If sense is equal to 1, the offset is added to the current pointer. If successful, it returns the fd; else it returns -1. (An lseek beyond the end of the file is not allowed.)
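The position arithmetic behind readFile and lseekFile can be sketched separately from any disk access. The helper names below are hypothetical. The sketch makes two assumptions that the text above leaves open: a seek to offset 0 is accepted, and a seek to exactly the end of the file is accepted (it is not "beyond" the end, and it allows appending).

```c
/* Hypothetical helper: compute the file position an lseek would
   produce, or -1 if the request is invalid. sense 0 sets the pointer
   to offset; sense 1 adds offset to the current pointer. */
int seekPosition(int current, int fileSize, int offset, int sense)
{
    int target;

    if (sense == 0) {
        target = offset;
    } else if (sense == 1) {
        target = current + offset;
    } else {
        return -1;  /* unknown sense value */
    }

    /* Assumption: positions 0..fileSize inclusive are allowed. */
    if (target < 0 || target > fileSize) return -1;
    return target;
}

/* Hypothetical helper: number of bytes a read of 'size' bytes can
   deliver from 'position' before EOF; 0 at EOF, -1 if size is
   negative. */
int readableBytes(int position, int fileSize, int size)
{
    int remaining = fileSize - position;

    if (size < 0) return -1;
    return size < remaining ? size : remaining;
}
```

Neither helper touches a buffer, which is consistent with the rule that an lseek never causes a buffer to be filled or re-filled.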

If you need to allocate a sector and there are no available sectors, then the executing file system function should return a failure status. You should not print an error message.

The file manager will interface to a disk by means of a disk device driver function:

The operation must be either READ_SECTOR or WRITE_SECTOR and the given sector is either read into the buffer or the given sector is written from the given buffer. This function does not return anything as any error is fatal, and terminates the program with an error message. This function will be provided to you.

Each file system entry function will be worth 10 points.

The device driver function and some simple tests will be available in ~cs611/public/prog6. Additional hidden tests may be used as well.

Your implementation should be performed using C. Put all your C code in the fs.c file.

Your program will be graded primarily by testing it for correct functionality. However, you may lose points if your program is not properly structured or adequately documented. See the mandatory guidelines given in the course overview webpage.

Your assignment should be submitted for grading from a CIS Linux machine (e.g. turing.unh.edu). Submit a single file, fs.c. To turn in this assignment, type:
~cs611/bin/submit prog6 fs.c

Do not turn in any other files!

Submissions can be checked by typing:
~cs611/bin/scheck prog6

To receive full credit for the assignment, you must turn in your files prior to 8am on Thursday December 2. Programming assignments may be handed in late at a penalty of 2 points for one day late, 5 points for two days late, 10 points for three days late, 20 points for four days late, and 40 points for five days late. No program may be turned in more than 5 days late.

Your programs will be graded using a CIS Linux machine (e.g. turing.unh.edu), so be sure to test in that environment.

Remember: as always you are expected to do your own work on this assignment.


Last modified on November 16, 2004.

Comments and questions should be directed to hatcher@unh.edu