CS712/CS812
Project Phase 2
Spring 2004
Due Sunday February 29


In this phase you will design the major data structures for your compiler, as well as build and test your scanner and parser.

Implement a scanner for the language you designed in Phase 1. Build it so that it can run either as a stand-alone program or as part of your compiler. The stand-alone version of the scanner should dump tokens to stdout, one per line, in a human-readable format. The dump should include both the token name and its semantic value. Develop inputs that thoroughly test your scanner, and run them through the stand-alone version.
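As an illustration only, here is a minimal sketch of a scanner whose core can be shared by a stand-alone token dumper and a parser. It is written in Python purely for brevity. The token set (INT, IDENT, OP), the output layout, and the names scan, format_tokens, and dump are all assumptions made for this example; they are not requirements of the assignment.

```python
import re
import sys
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    name: str      # token name, e.g. "INT" or "IDENT"
    value: object  # semantic value, e.g. 42 or "x"
    line: int      # source line on which the token appears


# Hypothetical token set; replace it with the tokens of your own language.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t]+"),
    ("ERROR", r"."),  # any other single character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def scan(text: str) -> Iterator[Token]:
    """Yield tokens one at a time; a parser can consume this directly."""
    line = 1
    for match in MASTER.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "SKIP":
            continue
        elif kind == "ERROR":
            raise SyntaxError(f"line {line}: unexpected character {lexeme!r}")
        else:
            value = int(lexeme) if kind == "INT" else lexeme
            yield Token(kind, value, line)


def format_tokens(text: str) -> List[str]:
    """Return one human-readable line per token: name, value, source line."""
    return [f"{t.name} {t.value!r} (line {t.line})" for t in scan(text)]


def dump(text: str, out=sys.stdout) -> None:
    """Stand-alone mode: write the token dump to stdout (or another stream)."""
    for row in format_tokens(text):
        print(row, file=out)
```

A stand-alone driver would simply read the source file and call dump on its contents, while the parser would call scan. Keeping the formatting in format_tokens also makes it easy to compare the dump against expected output in a test script.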

Design an Abstract Syntax Tree (AST) data structure to be used by your compiler. Try to think ahead and anticipate the needs of both the semantic analysis and the code generation phases of the compiler. Be sure to capture source line numbers in the AST nodes so that good error messages can be generated during semantic analysis. Use an object-oriented approach so that you can easily modify your design later if needed.

Implement your AST design. Provide methods for constructing ASTs and a method for displaying an AST to stdout in human-readable form.
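One possible shape for such an implementation is sketched below, in Python for brevity. The node classes (IntLit, Name, BinOp) and the methods render and display are hypothetical names chosen for this example; your own design will have node types that match your language. Every node carries a source line number, and each subclass only overrides what differs, so new node types can be added later without touching the display code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    line: int  # source line, kept so semantic analysis can report errors

    def children(self) -> List["Node"]:
        return []  # leaf nodes have no children

    def label(self) -> str:
        return type(self).__name__

    def render(self, indent: int = 0) -> List[str]:
        """Return the subtree as indented text, one node per line."""
        rows = ["  " * indent + f"{self.label()} [line {self.line}]"]
        for child in self.children():
            rows.extend(child.render(indent + 1))
        return rows

    def display(self) -> None:
        """Write the subtree to stdout in human-readable form."""
        print("\n".join(self.render()))


@dataclass
class IntLit(Node):
    value: int

    def label(self) -> str:
        return f"IntLit {self.value}"


@dataclass
class Name(Node):
    ident: str

    def label(self) -> str:
        return f"Name {self.ident}"


@dataclass
class BinOp(Node):
    op: str
    left: Node
    right: Node

    def children(self) -> List[Node]:
        return [self.left, self.right]

    def label(self) -> str:
        return f"BinOp {self.op}"
```

For example, BinOp(3, "+", Name(3, "x"), IntLit(3, 2)).display() prints a "BinOp + [line 3]" line followed by its two operands, each indented by two spaces.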

Implement a parser for your language. The parser should use the scanner you are developing for this phase. The parser should use your AST implementation to build ASTs for the input source program. The parser should display the ASTs for debugging purposes. You do not need to do syntactic error recovery; your parser can stop at the first syntax error. Construct test source programs to thoroughly test your parser.
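To make the "stop at the first syntax error" behavior concrete, here is a small hand-written recursive-descent sketch in Python. The expression grammar, the ParseError class, and the tuple-shaped tree nodes are assumptions for this example only; in your project the parser would call your own scanner and construct your own AST node objects, and you may well use a parser generator instead.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str, int]  # (name, lexeme, line)


class ParseError(Exception):
    """Raised at the first syntax error; no recovery is attempted."""


def tokenize(text: str) -> List[Token]:
    pattern = r"(?P<INT>\d+)|(?P<OP>[-+*()])|(?P<NL>\n)|(?P<WS>[ \t]+)|(?P<BAD>.)"
    tokens, line = [], 1
    for m in re.finditer(pattern, text):
        kind = m.lastgroup
        if kind == "NL":
            line += 1
        elif kind == "BAD":
            raise ParseError(f"line {line}: unexpected character {m.group()!r}")
        elif kind != "WS":
            tokens.append((kind, m.group(), line))
    tokens.append(("EOF", "", line))
    return tokens


class Parser:
    # Hypothetical grammar:
    #   expr   -> term (('+' | '-') term)*
    #   term   -> factor ('*' factor)*
    #   factor -> INT | '(' expr ')'
    def __init__(self, text: str):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self) -> Token:
        return self.tokens[self.pos]

    def advance(self) -> Token:
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def fail(self, expected: str):
        name, lexeme, line = self.peek()
        raise ParseError(f"line {line}: expected {expected}, found {lexeme or name!r}")

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            self.fail("end of input")
        return tree

    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            _, op, line = self.advance()
            node = ("BinOp", line, op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek()[1] == "*":
            _, op, line = self.advance()
            node = ("BinOp", line, op, node, self.factor())
        return node

    def factor(self):
        name, lexeme, line = self.peek()
        if name == "INT":
            self.advance()
            return ("IntLit", line, int(lexeme))
        if lexeme == "(":
            self.advance()
            node = self.expr()
            if self.peek()[1] != ")":
                self.fail("')'")
            self.advance()
            return node
        self.fail("an integer or '('")
```

A driver would call Parser(source).parse(), display the resulting tree, and on ParseError print the message and exit with a nonzero status. Each error message names the source line, which makes the parser tests easier to check.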

Consider the necessary compiler handling of names in your language. What are the namespaces defined by your language and how will they be implemented in the compiler? What are the scopes of the various types of names and how will they be enforced? Design data structures for storing names in the compiler and define the interfaces necessary to create and access the data structures. Efficiency of the future implementation is not important. What is important is to think ahead and design how names will be manipulated in your compiler. You do NOT need to implement this design in this phase. You will implement this design in the next phase of the project.
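As one way to start thinking about the interfaces, the sketch below models nested scopes as a chain of tables, with a separate table per namespace inside each scope. It is in Python for brevity and is only an illustration: the namespace names "var" and "type", the Symbol fields, and the declare and lookup operations are assumptions, and your language may need different namespaces and scoping rules. Since efficiency is not a concern here, lookup simply walks outward through the enclosing scopes.

```python
from typing import Dict, Optional


class DuplicateName(Exception):
    """Raised when a name is declared twice in the same scope and namespace."""


class Symbol:
    def __init__(self, name: str, kind: str, line: int):
        self.name = name  # the identifier as written in the source
        self.kind = kind  # e.g. "variable", "function", "type" (hypothetical)
        self.line = line  # declaration line, useful in error messages


class Scope:
    def __init__(self, parent: Optional["Scope"] = None):
        self.parent = parent
        # One table per namespace: namespace -> {name -> Symbol}
        self.tables: Dict[str, Dict[str, Symbol]] = {}

    def declare(self, namespace: str, symbol: Symbol) -> None:
        """Add a symbol to this scope; redeclaration here is an error."""
        table = self.tables.setdefault(namespace, {})
        if symbol.name in table:
            first = table[symbol.name].line
            raise DuplicateName(
                f"line {symbol.line}: '{symbol.name}' already declared at line {first}"
            )
        table[symbol.name] = symbol

    def lookup(self, namespace: str, name: str) -> Optional[Symbol]:
        """Search this scope, then each enclosing scope; None if not found."""
        scope: Optional[Scope] = self
        while scope is not None:
            symbol = scope.tables.get(namespace, {}).get(name)
            if symbol is not None:
                return symbol
            scope = scope.parent
        return None
```

With this interface, entering a block or function body creates Scope(parent=current), and leaving it restores the parent. An inner declaration hides an outer one of the same name, while the same identifier can still be used independently in a different namespace.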

Develop a Makefile for building and testing your scanner and parser, called "Makefile". The default Make target (topmost in the Makefile) should build your parser. You should have a Make target for the stand-alone scanner, called "lexdbg". You should have a Make target for running all your stand-alone scanner tests as one long string of tests, called "scantest". You should have a Make target for running all your parser tests as one long string of tests, called "parsetest". (You might want to develop scripts for running these strings of tests and then invoke the scripts from the Makefile.) Finally, have a Make target for cleaning up ALL files generated by any of the other Make targets, called "clean".

You can develop your scanner/parser on any system that you have access to, but for grading purposes I will execute it on turing.unh.edu, so be sure to test in that environment.

The deliverables for this phase include the source code for your scanner, parser and AST implementation, all the test programs that you developed, the Makefile and any other support files needed to build and test your scanner and parser, as well as the design documents for the AST and the name handling. The design documents can simply be comments in the source code or separate documents, whatever you prefer. Also include a README file with your submission that describes the purpose of each file submitted.

You should submit your files from turing.unh.edu using my "submit" script. To turn in this assignment, type:
~cs712/bin/submit phase2 list of filenames

Submissions can be checked by typing (also on turing.unh.edu):
~cs712/bin/scheck phase2

To receive full credit for the assignment, you must turn in your files prior to 8am on Monday March 1. Late submissions will be accepted at a penalty of 2 points for one day late, 5 points for two days late, 10 points for three days late, 20 points for four days late, and 40 points for five days late. No program may be turned in more than 5 days late.

Remember: you (possibly with an approved partner) are expected to do your own work on this assignment!


Last modified on February 8, 2004.

Comments and questions should be directed to hatcher@unh.edu