Write a set of C functions to implement a simple file manager.
A file is to appear to the user as a symbolically-named, variable-length, linearly-addressed sequence of bytes. A read or write operation moves a given number of bytes between a file and the user's address space. A file pointer is internally maintained and indicates the place in the file at which all data transfers begin.
Before a read or write operation may be performed, a logical connection must be established between the process and the file. This is known as "opening" the file; the breaking of the connection is called "closing" the file.
The user must be insulated from the details of the implementation of files. Files are stored in not-necessarily contiguous sectors of a disk. Accessing a disk is orders of magnitude slower than accessing main memory. Therefore, it is necessary to transfer some minimum amount of data (such as one sector) between main memory and disk for each disk access. By storing this minimum amount of data in a buffer, many read and write operations will not require an access to disk.
All disks contain 128 sectors and all sectors are 256 bytes.
The file manager supports a single file system per disk and only one file system at a time can be active. The file manager should maintain a single directory for each file system, which will contain information about the files stored in the system. The file manager should maintain the following information for every file in a file system: its symbolic name, its size, and its physical location on the disk.
The symbolic name is 1-8 characters, followed by a period (.), followed by 1-3 characters (the file type, or "extension"). The valid characters in file names are A through Z, a through z, and 0 through 9.
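The naming rule above can be checked with a small helper. The following is only a sketch: the function names are hypothetical and not part of the required interface.

```c
#include <stddef.h>

/* Returns 1 if c is one of the characters allowed in a file name. */
static int is_name_char(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/* Hypothetical helper: returns 1 if name is 1-8 valid characters, a
 * period, then 1-3 valid characters, and nothing else; 0 otherwise. */
int fs_name_is_valid(const char *name)
{
    int base = 0, ext = 0;

    if (name == NULL)
        return 0;
    while (is_name_char(*name)) {   /* count the characters before the period */
        name++;
        base++;
    }
    if (base < 1 || base > 8 || *name != '.')
        return 0;
    name++;                         /* skip the period */
    while (is_name_char(*name)) {   /* count the extension characters */
        name++;
        ext++;
    }
    return ext >= 1 && ext <= 3 && *name == '\0';
}
```

Note that the directory's own name, a lone period, is deliberately rejected by this check, since it is not a legal user file name.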
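The file size should be the size of the file in bytes. Since disks have 128 sectors of 256 bytes, no file can be larger than 32768 bytes (and because sector 0 is reserved, a user file can occupy at most 127 sectors, or 32512 bytes). Therefore the file size will fit in an unsigned 16-bit integer. On the disk this 16-bit integer should be stored in little-endian format (least significant byte first).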
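One portable way to store and load the size field, independent of the byte order of the host machine, is to move the two bytes explicitly. This is a sketch; the helper names are hypothetical.

```c
#include <stdint.h>

/* Store a 16-bit value at dst, least significant byte first. */
void put_le16(uint8_t *dst, uint16_t value)
{
    dst[0] = (uint8_t)(value & 0xFF);
    dst[1] = (uint8_t)(value >> 8);
}

/* Load a 16-bit value stored least significant byte first. */
uint16_t get_le16(const uint8_t *src)
{
    return (uint16_t)(src[0] | (src[1] << 8));
}
```

For example, the size 0x1234 is stored as the byte 0x34 followed by the byte 0x12.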
The physical location of the file is the ordered list of sectors that contain the file. Since disks have 128 sectors, the sector numbers can be stored in an 8-bit integer.
Sector 0 of the disk is reserved to hold control information for the file system. In particular, it should hold a vector of 128 bytes, one per sector. This vector is known as the sector map. If a byte contains -1, then the corresponding sector is free (not being used to store either user-file data or file-system control data). If a byte contains 0, then the corresponding sector is the last sector in a user file (or the last sector in a file-system entity such as the directory). If a byte contains K (not -1 or 0), then the next sector in the user file (or file-system entity) is sector K.
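To illustrate how the sector map encodes a chain, the sketch below walks the chain that starts at a given sector and counts its sectors. The names are hypothetical, and the map is assumed to be held in memory as an array of signed 8-bit values.

```c
#include <stdint.h>

#define NUM_SECTORS 128
#define SECTOR_FREE (-1)   /* map value: sector is unused */
#define SECTOR_LAST 0      /* map value: sector ends its chain */

/* Hypothetical helper: returns the number of sectors in the chain that
 * starts at sector `first`, or -1 if the chain is corrupt (it reaches a
 * free sector, an out-of-range sector number, or loops forever). */
int chain_length(const int8_t map[NUM_SECTORS], int first)
{
    int count = 0;
    int sector = first;

    for (;;) {
        if (sector < 0 || sector >= NUM_SECTORS || map[sector] == SECTOR_FREE)
            return -1;
        if (++count > NUM_SECTORS)      /* more links than sectors: a cycle */
            return -1;
        if (map[sector] == SECTOR_LAST)
            return count;
        sector = map[sector];           /* follow the link to sector K */
    }
}
```

For a file stored in sectors 5, 9 and 2 (in that order), the map would hold 9 in slot 5, 2 in slot 9, and 0 in slot 2, and the function would return 3 when started at sector 5.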
The beginning of the directory should also be stored in sector 0. A directory is a sequence of directory entries. Each entry is 16 bytes: 13 bytes for the file name, 2 bytes for the file size, and 1 byte for the sector number of the first sector. (The file name should be stored left-justified and null-filled in the directory entry.) The first directory entry should describe the directory itself. In this case the file name should simply be a period (.), the first-sector field should contain a zero, and the size field should contain the directory size (which is always at least 16).
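The 16-byte entry layout described above (bytes 0-12 for the name, bytes 13-14 for the little-endian size, byte 15 for the first sector) can be sketched as a pair of conversion routines. The in-memory structure and the function names are assumptions made for illustration only.

```c
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE 16
#define NAME_FIELD 13

/* Hypothetical in-memory form of a directory entry. */
struct dir_entry {
    char     name[NAME_FIELD];   /* null-terminated file name */
    uint16_t size;               /* file size in bytes */
    uint8_t  first_sector;       /* number of the first sector */
};

/* Build the 16-byte on-disk form of an entry. */
void entry_pack(uint8_t out[ENTRY_SIZE], const struct dir_entry *e)
{
    int i;

    memset(out, 0, ENTRY_SIZE);                  /* null-fill the name field */
    for (i = 0; i < NAME_FIELD - 1 && e->name[i] != '\0'; i++)
        out[i] = (uint8_t)e->name[i];            /* left-justified name */
    out[13] = (uint8_t)(e->size & 0xFF);         /* size, low byte first */
    out[14] = (uint8_t)(e->size >> 8);
    out[15] = e->first_sector;
}

/* Recover the in-memory form from the 16-byte on-disk form. */
void entry_unpack(struct dir_entry *e, const uint8_t in[ENTRY_SIZE])
{
    memcpy(e->name, in, NAME_FIELD);
    e->name[NAME_FIELD - 1] = '\0';              /* guarantee termination */
    e->size = (uint16_t)(in[13] | (in[14] << 8));
    e->first_sector = in[15];
}
```

Packing the entry for a freshly created directory (name ".", size 16, first sector 0) yields a period followed by twelve null bytes, then the bytes 16, 0 and 0.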
The sector map should be stored first in sector 0, followed by the directory.
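Because the sector map occupies the first 128 bytes of sector 0, only 8 directory entries fit in that sector. The sketch below computes where the entry with a given index lives. It assumes, as the handout does not state explicitly, that continuation sectors of the directory are filled with entries starting at byte 0, so that each holds 16 entries. All names are hypothetical.

```c
#define SECTOR_SIZE 256
#define MAP_SIZE    128
#define ENTRY_SIZE  16
#define ENTRIES_IN_SECTOR0 ((SECTOR_SIZE - MAP_SIZE) / ENTRY_SIZE)   /* 8 */
#define ENTRIES_PER_SECTOR (SECTOR_SIZE / ENTRY_SIZE)                /* 16 */

/* Where a directory entry lives: which sector of the directory's chain
 * (0 means sector 0 itself) and the byte offset within that sector. */
struct entry_location {
    int chain_index;
    int offset;
};

/* Hypothetical helper: locate directory entry number `index`. */
struct entry_location locate_entry(int index)
{
    struct entry_location loc;

    if (index < ENTRIES_IN_SECTOR0) {
        loc.chain_index = 0;
        loc.offset = MAP_SIZE + index * ENTRY_SIZE;   /* skip the sector map */
    } else {
        int rest = index - ENTRIES_IN_SECTOR0;
        loc.chain_index = 1 + rest / ENTRIES_PER_SECTOR;
        loc.offset = (rest % ENTRIES_PER_SECTOR) * ENTRY_SIZE;
    }
    return loc;
}
```

Under these assumptions, entry 0 (the directory's own entry) sits at offset 128 of sector 0, and entry 8 is the first entry of the directory's second sector.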
Only data for a single file will be stored in an individual sector. That is, a single sector will never contain data from two files (or from the directory and a file). This allows the simple sector map approach described above, but has the disadvantage that there will often be wasted space in the last sector of a file.
When files are deleted, the corresponding directory entries can be freed. A directory entry is marked as being free by placing a null byte in the first byte of the file-name field. When searching for a directory entry to use for a newly created file, the file manager should start at the front of the directory looking for a free entry. If the end of the directory is reached, then a new entry should be created by extending the directory. Note that if the directory is extended beyond the end of sector 0, then the sector map should be examined to find a free sector, and that sector's number should be placed in slot 0 of the sector map. That is, a directory grows exactly like a user file grows. However, once a directory grows into a new sector, it will never shrink back out of that sector. So, if a directory grows into another sector, but later all directory entries in that sector are marked as free, then the file manager should maintain the directory size and not try to reclaim the sector.
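The front-to-back search for a free entry can be sketched as a scan over one run of consecutive entries at a time, for example the directory part of sector 0 and then each continuation sector in turn. The function name is hypothetical, and how the caller obtains each run of entries is left open here.

```c
#include <stdint.h>

#define ENTRY_SIZE 16

/* Hypothetical helper: scans `count` consecutive 16-byte directory
 * entries starting at `entries`. Returns the index (within this run) of
 * the first entry whose name field begins with a null byte, or -1 if
 * every entry in the run is in use. */
int first_free_entry(const uint8_t *entries, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        if (entries[i * ENTRY_SIZE] == 0)   /* null first byte marks a free entry */
            return i;
    }
    return -1;
}
```

If every run returns -1, the end of the directory has been reached and the directory must be extended by one entry.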
When searching for free sectors (to use either for user files or for the directory), the file manager should always choose the smallest numbered free sector. That is, search the sector map starting with sector 0. (Well, sector 1 actually, since sector 0 will always be in use.)
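A lowest-numbered-first allocator over the in-memory sector map might look like the sketch below. The function name is hypothetical. The sketch marks the new sector as the end of a chain and leaves it to the caller to link the previous last sector to it and to write sector 0 back to disk.

```c
#include <stdint.h>

#define NUM_SECTORS 128
#define SECTOR_FREE (-1)   /* map value: sector is unused */
#define SECTOR_LAST 0      /* map value: sector ends its chain */

/* Hypothetical helper: claims the smallest-numbered free sector, marks
 * it as the last sector of a chain, and returns its number. Returns -1
 * if the disk is full. The search starts at sector 1 because sector 0
 * is always in use. */
int alloc_sector(int8_t map[NUM_SECTORS])
{
    int s;

    for (s = 1; s < NUM_SECTORS; s++) {
        if (map[s] == SECTOR_FREE) {
            map[s] = SECTOR_LAST;
            return s;
        }
    }
    return -1;
}
```

A return value of -1 is the case in which the calling file-system function should report failure.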
The file manager will also have to maintain in memory a table of open files. Allow a maximum of 20 simultaneously open files. This table will contain the location of the buffer for the file, what mode the file was opened for, the current file position (also known as the file pointer), and possibly other information. The file descriptor used by many of the file-system primitives is an index into this table.
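One possible shape for the open-file table is sketched below. Every field and function name is an assumption; in particular the mode is kept as a plain integer because the handout does not fix its encoding here, and the buffer is embedded in the table entry rather than referenced by a pointer.

```c
#include <stdint.h>
#include <string.h>

#define MAX_OPEN    20
#define SECTOR_SIZE 256
#define NO_SECTOR   (-1)   /* marks an empty data buffer */

/* Hypothetical table entry for one open file. */
struct open_file {
    int     in_use;               /* nonzero while the slot is taken */
    int     mode;                 /* mode the file was opened for */
    int     position;             /* current file pointer, in bytes */
    int     buffered_sector;      /* sector held in buffer, or NO_SECTOR */
    int     entry_dirty;          /* directory entry changed since open */
    uint8_t dir_entry[16];        /* buffered copy of the directory entry */
    uint8_t buffer[SECTOR_SIZE];  /* the one buffered data sector */
};

static struct open_file open_files[MAX_OPEN];

/* Claims the lowest free slot and returns its index (the file
 * descriptor), or -1 if 20 files are already open. */
int fd_alloc(int mode)
{
    int fd;

    for (fd = 0; fd < MAX_OPEN; fd++) {
        if (!open_files[fd].in_use) {
            memset(&open_files[fd], 0, sizeof open_files[fd]);
            open_files[fd].in_use = 1;
            open_files[fd].mode = mode;
            open_files[fd].buffered_sector = NO_SECTOR;  /* buffer starts empty */
            return fd;
        }
    }
    return -1;
}

/* Marks a slot as free again; out-of-range descriptors are ignored. */
void fd_release(int fd)
{
    if (fd >= 0 && fd < MAX_OPEN)
        open_files[fd].in_use = 0;
}
```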
One data sector and the directory entry for each open file may be buffered in memory. Sector 0 may also be buffered in main memory. Nothing else from the disk should be buffered in memory. You should update sector 0 on disk whenever the in-memory copy of the sector map is modified. Also when a file is closed, if its directory entry has been changed, the directory entry should be updated on the disk.
The data sector buffer should only be filled or re-filled by a read or write operation. When a file is opened, the buffer is left "empty". Only when a subsequent read or write is performed will a sector be read into the buffer. Likewise, if a read or write operation ends on the last byte in the buffer, the buffer won't be re-filled until the next read or write operation is performed. Note that an lseek should never cause a buffer to be filled or re-filled.
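The lazy-fill rule can be sketched by tracking which sector of the file (counted along its chain) is currently buffered and comparing it with the file position only when a read or write begins. The structure and function names below are hypothetical.

```c
#define SECTOR_SIZE 256
#define NO_SECTOR   (-1)

/* Hypothetical per-file state: the file pointer and the index, along
 * the file's sector chain, of the sector currently in the buffer. */
struct file_cursor {
    int position;
    int buffered_index;   /* NO_SECTOR while the buffer is empty */
};

/* An lseek only moves the file pointer; it never touches the buffer. */
void cursor_seek(struct file_cursor *c, int position)
{
    c->position = position;
}

/* Called at the start of a read or write: returns 1 if the buffer does
 * not hold the sector containing the current position. */
int cursor_needs_fill(const struct file_cursor *c)
{
    return c->buffered_index != c->position / SECTOR_SIZE;
}

/* Number of bytes that can be moved through the current buffer before
 * the transfer would cross into the next sector. */
int cursor_chunk(const struct file_cursor *c, int remaining)
{
    int room = SECTOR_SIZE - c->position % SECTOR_SIZE;

    return remaining < room ? remaining : room;
}
```

After a transfer that ends on byte 255 of the first sector, the position is 256 and the buffer still holds the first sector; nothing is read from disk until the next read or write notices the mismatch.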
Do not allocate sectors for a file until you are forced to when a write is performed which goes beyond the end of file. So, when a new file is created, no data sector is allocated until a write is performed to the file. However, allocate a sector before you write any data into its buffer. That is, every buffer should be backed up by an allocated sector.
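The arithmetic behind this rule is sketched below, assuming that a file of a given size always owns exactly as many sectors as its bytes require (which follows from allocating only when forced to). The function names are hypothetical.

```c
#define SECTOR_SIZE 256

/* Number of sectors needed to hold `bytes` bytes (0 for an empty file). */
int sectors_for(int bytes)
{
    return (bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Hypothetical helper: number of new sectors that a write of `count`
 * bytes at `position` forces onto a file of `size` bytes. */
int new_sectors_for_write(int size, int position, int count)
{
    int end = position + count;   /* first byte offset past the write */

    if (end <= size)
        return 0;                 /* write stays inside the existing file */
    return sectors_for(end) - sectors_for(size);
}
```

A one-byte write to a newly created file therefore forces exactly one allocation, while a write that extends a 100-byte file to 150 bytes forces none.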
Your implementation of the file system will be partially evaluated on the number of disk operations performed. You should minimize the number of disk operations, while respecting the rules given in the preceding paragraphs.
The following functions will provide the entry points to the file manager:
If you need to allocate a sector and there are no available sectors, then the executing file system function should return a failure status. You should not print an error message.
The file manager will interface to a disk by means of a disk device driver function:
Each file system entry function will be worth 10 points.
The device driver function and some simple tests will be available in ~cs611/public/prog6. Additional hidden tests may be used as well.
Your implementation should be written in C. Put all your C code in the fs.c file.
Your program will be graded primarily by testing it for correct functionality. However, you may lose points if your program is not properly structured or adequately documented. See the mandatory guidelines given in the course overview webpage.
Your assignment should be submitted for grading from a CIS Linux machine (e.g. turing.unh.edu).
Submit a single file, fs.c.
To turn in this assignment, type:
~cs611/bin/submit prog6 fs.c
Do not turn in any other files!
Submissions can be checked by typing:
~cs611/bin/scheck prog6
To receive full credit for the assignment, you must turn in your files prior to 8am on Thursday December 2. Programming assignments may be handed in late at a penalty of 2 points for one day late, 5 points for two days late, 10 points for three days late, 20 points for four days late, and 40 points for five days late. No program may be turned in more than 5 days late.
Your programs will be graded using a CIS Linux machine (e.g. turing.unh.edu), so be sure to test in that environment.
Remember: as always you are expected to do your own work on this assignment.
Comments and questions should be directed to hatcher@unh.edu