CS611
Fall 2003
Programming Assignment 3
Due Sunday October 12


Replace the assemble module of the vm611 assembler.

The source code for the vm611 assembler is in ~cs611/public/as611. There are four entry points for the assemble module. Stubs for these four functions are in assemble.c.

The initializeAssemble function is called once at program start-up to allow the assemble module to initialize any internal data structures.

The assemble function is called once for each non-blank line of the input assembly language source program. The two parameters to this function are a string for a label and an INSTRUCTION object for an instruction. The function should encode its input in vm611 machine format in an internal data structure for later dumping to an output object file.

The INSTRUCTION object is described in defs.h. It contains a string for the opcode and an OPERAND object (also described in defs.h) for the operand if the instruction includes an operand. (If there is no operand, the OPERAND object will contain NULLoperand for the OPERAND_TYPE.) An operand is either a symbol (represented with a string) or an integer constant (represented as an "int").

For a given line, either the label or the instruction may be missing. If the label is not present, its string pointer will be NULL. If the instruction is not present, its opcode string pointer will be NULL.
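The NULL conventions above suggest classifying each line before doing any encoding. The following is a minimal sketch of that idea, not the layout from defs.h: SketchInstruction, LineKind and classifyLine are hypothetical names introduced here for illustration, and the real INSTRUCTION type also carries an OPERAND.

```c
#include <stddef.h>

/* Hypothetical stand-in for the INSTRUCTION type described in defs.h.
   Only the opcode string matters for this sketch. */
typedef struct {
    const char *opcode;   /* NULL when the line has no instruction */
} SketchInstruction;

/* Hypothetical classification of one input line. */
typedef enum {
    LINE_EMPTY,                  /* neither part present (defensive case) */
    LINE_LABEL_ONLY,
    LINE_INSTRUCTION_ONLY,
    LINE_LABEL_AND_INSTRUCTION
} LineKind;

/* Decide which parts of a line are present, using the NULL conventions
   stated in the assignment text. */
LineKind classifyLine(const char *label, const SketchInstruction *instr)
{
    int hasLabel = (label != NULL);
    int hasInstr = (instr != NULL && instr->opcode != NULL);

    if (hasLabel && hasInstr) return LINE_LABEL_AND_INSTRUCTION;
    if (hasLabel)             return LINE_LABEL_ONLY;
    if (hasInstr)             return LINE_INSTRUCTION_ONLY;
    return LINE_EMPTY;
}
```

Since assemble is only called for non-blank lines, the LINE_EMPTY case should not normally occur; it is kept here only so that the sketch handles every combination.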

The writeObjectFile function produces an object file from the internal data structure constructed by the series of calls to the assemble function. The structure of the object file is described in the vm611 webpage.

The dumpSymbolInfo function iterates through the symbol table and displays undefined symbols to stdout.
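One possible shape for this walk is sketched below. The SketchSymbol type, the array-based table, the dumpUndefined name and the message text are all assumptions made for illustration; the real symbol table is your own design, and the output format should be taken from the "official" as611.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical symbol table entry. */
typedef struct {
    const char *name;
    int defined;   /* nonzero once the symbol has appeared as a label */
} SketchSymbol;

/* Print every symbol that was referenced but never defined.
   Returns the number of undefined symbols found.
   The message text is a placeholder, not the official format. */
size_t dumpUndefined(const SketchSymbol *table, size_t count, FILE *out)
{
    size_t i;
    size_t missing = 0;

    for (i = 0; i < count; i++) {
        if (!table[i].defined) {
            fprintf(out, "undefined symbol: %s\n", table[i].name);
            missing++;
        }
    }
    return missing;
}
```

Passing the output stream as a parameter keeps the sketch easy to test; the assignment itself asks for stdout.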

Both writeObjectFile and dumpSymbolInfo are called once, when EOF is reached on the input file.

The strings that are passed into assemble for labels and opcodes will already have been verified to be legal "identifiers" (they start with a letter and consist only of letters and digits). Likewise, for operands that are symbols, the strings will already have been verified to be legal identifiers. For integer operands, the values have been verified to fit in 32 bits. For immediate constants, however, you will need to check whether the value will fit in 10 bits.
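The 10-bit check can be written as a small range test. The sketch below shows both a signed (two's complement) and an unsigned interpretation, because this handout does not say which one applies to immediate constants; the function names are hypothetical, and the vm611 webpage and the "official" as611 are the authority on the correct range.

```c
#include <assert.h>

/* Returns 1 if value fits in a two's complement field of `bits` bits,
   i.e. lies in the range -2^(bits-1) .. 2^(bits-1) - 1. */
int fitsSigned(int value, int bits)
{
    long long lo, hi;

    assert(bits > 0 && bits < 32);
    lo = -(1LL << (bits - 1));
    hi = (1LL << (bits - 1)) - 1;
    return value >= lo && value <= hi;
}

/* Returns 1 if value fits in an unsigned field of `bits` bits,
   i.e. lies in the range 0 .. 2^bits - 1. */
int fitsUnsigned(int value, int bits)
{
    assert(bits > 0 && bits < 32);
    return value >= 0 && (long long)value <= (1LL << bits) - 1;
}
```

Under the signed interpretation a 10-bit field holds -512 through 511; under the unsigned interpretation it holds 0 through 1023.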

Comments are discarded prior to the call to the assemble function.

As well as the vm611 machine's opcodes, the assembler should support the vm611 assembler's three directives described in the vm611 assembler webpage.

If a line of the input is in error, call the error routine to format an error message. See the message.c file for details. You are only responsible for detecting one error per line, but you must be capable of detecting multiple errors in a file. You may, if you wish, simply ignore the rest of a line once an error is detected on that line. If an input file has errors, then the user of the assembler understands that the output object file is not to be trusted.

Sample vm611 assembly language programs are available in ~cs611/public/prog3.

Your goal should be to match exactly the behavior of the implementation available in ~cs611/bin/as611. Any ambiguities in this specification can be resolved by running test cases through this "official" implementation. (If you detect bugs in the "official" as611, please let me know.)

You can examine the output of your assembler by using the "octal dump" tool available in /usr/bin/od. If you have a working disassembler, you can use that also.

Your implementation must be written in C.

Be sure to include the proper error checking: e.g. illegal opcode, constant out of range, program too big to fit in memory (either instruction or data), opcode-operand mismatch, etc.
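Two of these checks, illegal opcode and opcode-operand mismatch, can be driven from a single lookup table. The sketch below is illustrative only: the type and function names are hypothetical, the table lists just the mnemonics named in this handout, the takesOperand values are assumptions, and the comparison is assumed to be case-sensitive. The real opcode set and operand rules are on the vm611 webpage.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    CHECK_OK,
    CHECK_ILLEGAL_OPCODE,
    CHECK_OPERAND_MISMATCH
} CheckResult;

/* Hypothetical table entry: a mnemonic and whether it expects an operand. */
typedef struct {
    const char *mnemonic;
    int takesOperand;
} SketchOpcodeInfo;

/* Placeholder entries; the operand requirements are assumed, not taken
   from the vm611 specification. */
static const SketchOpcodeInfo sketchOpcodes[] = {
    { "add", 0 }, { "sub", 0 },
    { "push", 1 }, { "pop", 1 },
    { "b", 1 }, { "bt", 1 }, { "call", 1 }
};

/* Look up the opcode and compare its operand requirement with what
   the input line actually supplied. */
CheckResult checkOpcode(const char *opcode, int hasOperand)
{
    size_t i;
    size_t n = sizeof sketchOpcodes / sizeof sketchOpcodes[0];

    for (i = 0; i < n; i++) {
        if (strcmp(opcode, sketchOpcodes[i].mnemonic) == 0) {
            if (sketchOpcodes[i].takesOperand == (hasOperand != 0))
                return CHECK_OK;
            return CHECK_OPERAND_MISMATCH;
        }
    }
    return CHECK_ILLEGAL_OPCODE;
}
```

A table like this keeps the error checks in one place, so a result other than CHECK_OK can be passed straight to the error routine before any encoding is attempted.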

The "single nibble" instructions (add, sub, etc) will be worth 15 points. The push and pop instructions will be worth 40 points. The b, bt and call instructions will be worth 15 points. The DATA, WORD and ALLOC directives will collectively be worth 15 points. Error handling (including the dump of the undefined symbols) will be worth 15 points.

Your program will be graded primarily by testing it for correct functionality. However, you may lose points if your program is not properly structured or adequately documented. See the mandatory guidelines given in the course overview webpage.

Your code should be submitted for grading from a CIS Linux machine (e.g. turing.unh.edu). To turn in this assignment, type:
~cs611/bin/submit prog3 assemble.c

Do not turn in any other files!

Submissions can be checked by typing:
~cs611/bin/scheck prog3

To receive full credit for the assignment, you must turn in your files prior to 8am on Monday October 13. Late submissions will be accepted at a penalty of 5 points per day, up to one week late.

Your programs will be graded using a CIS Linux machine (e.g. turing.unh.edu) so be sure to test in that environment.

Remember: as always you are expected to do your own work on this assignment.


Last modified on September 25, 2003.

Comments and questions should be directed to hatcher@unh.edu