Implement an assembler for the subset of the Alpha instructions given here. To accomplish this task you must implement the four functions that are stubbed out in the file assemble.c. This code will interact with the rest of the code given to you in ~cs611/public/prog3 to implement a simple assemble-and-execute environment for Alpha assembly programs.
The key function to be implemented is assemble, which takes a label and/or an instruction to be assembled. The label parameter is a string, which will be NULL if no label is present. The instruction parameter is an INSTRUCTION, which is a struct containing an opcode and up to four operands. The opcode is a string. Each operand is represented as an OPERAND, which is a struct with a field indicating the type of the operand and a union that contains either a string (register or label name) or an int (integer constant). See defs.h for the details.
You should assemble the instructions into a buffer in memory. The size of this buffer is given by the constant MAX_ADDRESS in defs.h. The instructions should be put consecutively into memory in the order that they are encountered. The function returnPointerToObjectCode returns the base address of the instruction buffer. The function returnLengthOfObjectCode returns the current amount (in bytes) of code in the buffer.
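One possible layout for the instruction buffer is sketched below. This is only a sketch under stated assumptions: the real prototypes are the ones stubbed out in assemble.c and may differ from those shown, MAX_ADDRESS is given a stand-in value so the fragment compiles on its own, and emitWord is a hypothetical helper that is not part of the distributed code.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: defs.h supplies MAX_ADDRESS. A stand-in value is used here
   only so that the sketch compiles without defs.h. */
#ifndef MAX_ADDRESS
#define MAX_ADDRESS 4096
#endif

static unsigned char objectCode[MAX_ADDRESS]; /* the instruction buffer */
static int codeLength = 0;                    /* bytes of code emitted so far */

/* Hypothetical helper: append one 32-bit instruction word to the buffer.
   Returns 0 on success, or -1 if the buffer has no room for the word. */
static int emitWord(uint32_t word)
{
    if (codeLength + 4 > MAX_ADDRESS)
        return -1;
    /* Copies in host byte order, which is what an Alpha host will execute. */
    memcpy(objectCode + codeLength, &word, sizeof word);
    codeLength += 4;
    return 0;
}

/* Assumed signatures; check them against the stubs in assemble.c. */
void *returnPointerToObjectCode(void)
{
    return objectCode;
}

int returnLengthOfObjectCode(void)
{
    return codeLength;
}
```

Because every instruction is four bytes long, the current value of codeLength is also the byte offset at which the next instruction (or label) will be placed.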
The function initializeAssemble will be called once at program start-up and gives you an opportunity to initialize your internal data structures. Depending on how you do things, this function might in fact do nothing.
All letters in strings passed to assemble will be lowercase. Registers will be given only in their numeric forms ($26, not $ra). ($sp is actually supported, but by the time assemble is called it will be represented as a register operand with number 30.)
The legal formats for the instructions are as specified in the "Alpha Subset" document, except that the lda, beq, bne and br instructions can take a label in place of the offset/register pair. You should assume that register $27 will contain at run time the base address of the code segment. So a label used with lda simply needs to be assembled as its byte offset from the code base, combined with $27. For the branch instructions, a label will be assembled as the appropriate instruction offset.
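The two label encodings described above might be sketched as follows. The helper names are hypothetical. The bit layouts and opcode values are the standard Alpha ones (Memory format: opcode in bits 31-26, Ra in 25-21, Rb in 20-16, 16-bit displacement; Branch format: opcode, Ra, 21-bit displacement counted in instructions from the updated PC), but you should confirm them against the "Alpha Subset" document.

```c
#include <stdint.h>

/* Memory format: opcode[31:26] Ra[25:21] Rb[20:16] disp[15:0]. */
static uint32_t encodeMemory(uint32_t opcode, uint32_t ra, uint32_t rb,
                             int32_t disp)
{
    return (opcode << 26) | (ra << 21) | (rb << 16)
         | ((uint32_t)disp & 0xFFFFu);
}

/* Branch format: opcode[31:26] Ra[25:21] disp[20:0]. */
static uint32_t encodeBranch(uint32_t opcode, uint32_t ra, int32_t disp)
{
    return (opcode << 26) | (ra << 21) | ((uint32_t)disp & 0x1FFFFFu);
}

/* "lda $ra, label" is assembled as "lda $ra, labelOffset($27)", where
   labelOffset is the label's byte offset from the code base. */
static uint32_t encodeLdaLabel(uint32_t ra, int32_t labelOffset)
{
    return encodeMemory(0x08, ra, 27, labelOffset);
}

/* Branch displacement: number of instructions from the updated PC
   (the address of the branch plus 4) to the label. Both offsets are
   byte offsets from the code base and are multiples of 4. */
static int32_t branchDisplacement(int32_t branchOffset, int32_t labelOffset)
{
    return (labelOffset - (branchOffset + 4)) / 4;
}
```

For example, a beq at byte offset 8 that targets a label at byte offset 0 gets a displacement of -3, and a branch to the instruction immediately following it gets a displacement of 0.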
For Operate-format instructions, the operands will appear in operand slots 1, 2 and 3. For Memory-format instructions, the leftmost register will be in the first operand slot, the second operand slot will be empty, the third operand slot will contain the offset constant, and the fourth operand slot will contain the base register. (For the jsr and ret instructions, the third operand slot will also be empty.) For Branch-format instructions, the two operands will be in operand slots 1 and 2.
The assemble function will be called only once for each instruction in the input file. This means you must complete the assembly process with a single pass over the instructions. When you encounter a reference to a label that has not yet been defined, store in that label's symbol table entry the negation of the memory byte offset of the instruction making the reference. If multiple instructions reference the not-yet-defined label, use the address-field "holes" of those instructions to chain them together. When the label is defined, traverse the chain of referencing instructions and fill in the holes.
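The chaining scheme can be sketched roughly as follows. Everything here is an illustrative assumption rather than required design: the Symbol type, the function names, the END_OF_CHAIN sentinel, and the choice to store each link as a word index in the low 16 bits of the hole. Note also that this sketch keeps an explicit "defined" flag instead of negating the offset, since a reference from the instruction at offset 0 has no distinct negation; you will need to decide how your own symbol table handles that case.

```c
#include <stdint.h>

#define CODE_WORDS   64        /* stand-in for MAX_ADDRESS / 4 */
#define END_OF_CHAIN 0xFFFFu   /* hypothetical sentinel stored in a hole */

static uint32_t code[CODE_WORDS];  /* one 32-bit word per instruction */

/* Hypothetical symbol-table entry. */
typedef struct {
    int defined;    /* nonzero once the label's byte offset is known */
    int32_t value;  /* defined: byte offset of the label;
                       undefined: byte offset of the newest referencing
                       instruction, or -1 if there is none yet */
} Symbol;

/* Displacement bits for the instruction `word` located at byte offset `at`. */
static uint32_t displacementBits(uint32_t word, int32_t at, int32_t labelOffset)
{
    if ((word >> 26) == 0x08)  /* lda: byte offset from the code base */
        return (uint32_t)labelOffset & 0xFFFFu;
    /* branch: instruction count relative to the updated PC (at + 4) */
    return (uint32_t)((labelOffset - (at + 4)) / 4) & 0x1FFFFFu;
}

/* Record a label reference made by the instruction at byte offset `at`.
   `word` is the instruction with its displacement field still zero. */
static void referenceLabel(Symbol *s, int32_t at, uint32_t word)
{
    if (s->defined) {
        code[at / 4] = word | displacementBits(word, at, s->value);
    } else {
        /* Put the link to the previous reference in the hole. */
        uint32_t link = (s->value < 0) ? END_OF_CHAIN
                                       : (uint32_t)(s->value / 4);
        code[at / 4] = word | link;
        s->value = at;  /* this instruction is now the head of the chain */
    }
}

/* Define the label and fill every hole on its chain.
   Returns 0 on success, or -1 if the label was already defined. */
static int defineLabel(Symbol *s, int32_t labelOffset)
{
    int32_t at;
    if (s->defined)
        return -1;
    at = s->value;
    while (at >= 0) {
        uint32_t word = code[at / 4];
        uint32_t link = word & 0xFFFFu;
        word &= ~0xFFFFu;  /* clear the link out of the hole */
        code[at / 4] = word | displacementBits(word, at, labelOffset);
        at = (link == END_OF_CHAIN) ? -1 : (int32_t)link * 4;
    }
    s->defined = 1;
    s->value = labelOffset;
    return 0;
}
```

A new Symbol would start out as { 0, -1 }. Storing the links as word indexes in 16 bits limits this particular sketch to fewer than 65535 instructions, which is one reason to treat it as a starting point only.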
You should be sure to do the appropriate error checking. For example, the first operand of the lda instruction must be a register (and not an integer constant).
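As an illustration, a check for the offset/register form of lda might look like the sketch below. The Sketch* types are hypothetical stand-ins for OPERAND and INSTRUCTION; the real field names and type codes are in defs.h and will differ. The slot positions follow the Memory-format layout described above, and the range check assumes the standard signed 16-bit displacement field.

```c
#include <string.h>

/* Hypothetical stand-ins for the types in defs.h. */
typedef enum { SK_NONE, SK_REGISTER, SK_INTEGER, SK_LABEL } SketchOperandType;

typedef struct {
    SketchOperandType type;
    union {
        const char *string;  /* register or label name */
        int integer;         /* integer constant */
    } u;
} SketchOperand;

typedef struct {
    const char *opcode;
    SketchOperand operand[4];  /* operand slots 1..4 are indexes 0..3 */
} SketchInstruction;

/* Check "lda $ra, offset($rb)". Returns NULL if the operands are legal,
   otherwise a message describing the first problem found. */
static const char *checkLdaOffsetForm(const SketchInstruction *in)
{
    if (strcmp(in->opcode, "lda") != 0)
        return "not an lda instruction";
    if (in->operand[0].type != SK_REGISTER)
        return "lda: first operand must be a register";
    if (in->operand[1].type != SK_NONE)
        return "lda: second operand slot must be empty";
    if (in->operand[2].type != SK_INTEGER)
        return "lda: offset must be an integer constant";
    if (in->operand[2].u.integer < -32768 || in->operand[2].u.integer > 32767)
        return "lda: offset does not fit in 16 bits";
    if (in->operand[3].type != SK_REGISTER)
        return "lda: base must be a register";
    return NULL;
}
```

The label form of lda would need its own check. Which operand slot holds the label is not stated here, so the sketch does not guess at it.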
Your program will be graded primarily by testing it for correct functionality.
However, remember, you may lose points if your program is not properly structured or adequately documented.
Only the distributed test cases will be used for grading this assignment.
Your programs will be graded on an Alpha machine, so be sure to test in that environment. In fact, you can run the assembled programs only if you are using an Alpha. On other architectures you can still dump the assembled programs, however.
Your programs should be submitted for grading from a UNH CIS Alpha machine (e.g. alberti.unh.edu).
To turn in this assignment, type:
~cs611/bin/submit prog3 assemble.c
Submissions can be checked by typing:
~cs611/bin/scheck prog3
To receive full credit for the assignment, you must turn in your files prior to 8am on Monday, April 2. Late submissions will be accepted at a penalty of 5% per day, up to one week late.
Remember: as always you are expected to do your own work on this assignment.
Comments and questions should be directed to pjh@cs.unh.edu