Project Implementation
Phase 1 Lexical analysis
In phase 1, you need to implement a lexer for the language described in “COMP3173 23F Project Description.docx”. Your lexer consists of five source files: “func.c”, “lexer.h”, “lexer.c”, “symbol_table.h”, and “symbol_table.c”. Detailed requirements are listed below.
“func.c”
- It is the main entry of the entire project (all phases).
- It opens the source program.
- The source program is passed to the main function as an argument.
- It makes calls to the function “next_token” defined in “lexer.h” and “lexer.c”.
- It maintains a symbol table defined in “symbol_table.h” and “symbol_table.c” to store all the identifiers.
- If the function “next_token” returns a token, print the token and its attribute(s) on the screen for verification purposes.
- If “next_token” returns an error flag, report the error and its location.
- After the entire process is finished, print out the symbol table.
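The driver loop described above could be sketched as follows. This is only an illustration under assumptions: the token kinds, the “Token” struct, the signature of “next_token”, and the helper name “run_lexer” are all hypothetical, and the “next_token” shown here is a whitespace-splitting stand-in rather than the required DFA. In “func.c”, the main function would open the file named by its argument with fopen, pass it to a loop like this one, and print the symbol table afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical token kinds; the real set comes from the language description. */
typedef enum { TOK_EOF, TOK_ID, TOK_INT, TOK_ERROR } TokenKind;

typedef struct {
    TokenKind kind;
    char lexeme[64];
    int line;                 /* line on which the lexeme starts */
} Token;

/* Stand-in for the real DFA-based next_token: it only splits on whitespace. */
static Token next_token(FILE *src, int *line)
{
    Token t = { TOK_EOF, "", 0 };
    int c;

    /* Skip whitespace, counting line breaks for error reporting. */
    while ((c = fgetc(src)) != EOF && isspace(c)) {
        if (c == '\n') (*line)++;
    }
    t.line = *line;
    if (c == EOF) return t;

    size_t n = 0;
    int all_alpha = 1, all_digit = 1;
    do {
        if (n + 1 < sizeof t.lexeme) t.lexeme[n++] = (char)c;
        if (!isalpha(c)) all_alpha = 0;
        if (!isdigit(c)) all_digit = 0;
    } while ((c = fgetc(src)) != EOF && !isspace(c));
    if (c != EOF) ungetc(c, src);   /* leave the delimiter for the next call */
    t.lexeme[n] = '\0';
    t.kind = all_alpha ? TOK_ID : all_digit ? TOK_INT : TOK_ERROR;
    return t;
}

/* Driver loop: prints every token, reports errors with their line,
   and returns the number of lexical errors found. */
static int run_lexer(FILE *src, FILE *out)
{
    assert(src != NULL && out != NULL);
    int line = 1, errors = 0;
    for (;;) {
        Token t = next_token(src, &line);
        if (t.kind == TOK_EOF) break;
        if (t.kind == TOK_ERROR) {
            fprintf(out, "line %d: lexical error near '%s'\n", t.line, t.lexeme);
            errors++;
            continue;
        }
        fprintf(out, "<%s, %s>\n", t.kind == TOK_ID ? "ID" : "INT", t.lexeme);
    }
    return errors;
}
```

Continuing after an error, as this sketch does, is a design choice that lets one run report several errors; stopping at the first error would also satisfy the requirement above.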
“lexer.h” and “lexer.c”
- You need to design and implement a DFA to complete this task.
- To implement the DFA, you must use a transition table.
- You must define the transition table in “lexer.h” in a proper way (for example, as a 2-dimensional constant array).
- It reads and cuts the source program into lexemes.
- Each call to “next_token” returns the next token found in the source program.
- It ignores spaces, indentations, line breaks, and comments.
- If the token is an identifier, insert it into the symbol table. An identifier token has one attribute, which records its memory location in the symbol table.
- If the token is an integer literal, the token has two attributes: type, for its data type; and value, for its value.
- If the token is none of the above, it does not have an attribute.
- If there is any lexical error, it returns an error flag to the main function.
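A table-driven DFA of the kind required above might look like the following minimal sketch. The states, character classes, and the names “TRANSITION”, “char_class”, and “scan” are assumptions made for illustration; the real table must cover every token of the language in the project description, including comments and operators, which this toy table does not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical states and character classes for a two-token toy language. */
enum { S_START, S_ID, S_INT, S_COUNT, S_STOP = -1 };
enum { C_LETTER, C_DIGIT, C_OTHER, C_COUNT };

/* Transition table: rows are states, columns are character classes.
   S_STOP means "no transition", so the lexeme ends before this character. */
static const int TRANSITION[S_COUNT][C_COUNT] = {
    /*            letter   digit   other  */
    /* START */ { S_ID,    S_INT,  S_STOP },
    /* ID    */ { S_ID,    S_ID,   S_STOP },
    /* INT   */ { S_STOP,  S_INT,  S_STOP },
};

/* Maps an input character to its column in the table. */
static int char_class(int c)
{
    if (isalpha(c)) return C_LETTER;
    if (isdigit(c)) return C_DIGIT;
    return C_OTHER;
}

/* Runs the DFA from text[0] for as long as a transition exists.
   Stores the lexeme length in *len and returns the last state reached;
   S_START with length 0 means no token starts here. */
static int scan(const char *text, size_t *len)
{
    assert(text != NULL && len != NULL);
    int state = S_START;
    size_t i = 0;
    while (text[i] != '\0') {
        int next = TRANSITION[state][char_class((unsigned char)text[i])];
        if (next == S_STOP) break;
        state = next;
        i++;
    }
    *len = i;
    return state;
}
```

For example, scanning "x1+2" stops in state S_ID with length 2, so the lexeme is "x1". Whether an input such as "12ab" should be split into two tokens or reported as an error depends on the language description, not on this sketch.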
“symbol_table.h” and “symbol_table.c”
- It should be implemented as a data structure. An AVL tree is recommended because it searches and inserts symbols efficiently, but you can use any data structure you want; efficiency is not required here.
- Each identifier has its variable name and its type. (This language has two types of identifiers, integers and functions.)
- The symbol table also needs a function to print out its content.
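One possible shape for the symbol table is sketched below. It uses a plain, unbalanced binary search tree instead of the recommended AVL tree, which the requirements above permit since efficiency is not required. The names “Symbol”, “SymbolType”, “symtab_insert”, and “symtab_print” are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The two identifier types named in the requirements. */
typedef enum { SYM_INT, SYM_FUNC } SymbolType;

typedef struct Symbol {
    char *name;
    SymbolType type;
    struct Symbol *left, *right;
} Symbol;

/* Inserts name if it is absent. Returns the node holding name (new or
   existing), or NULL if memory allocation fails. The returned pointer can
   serve as the "memory location" attribute of an identifier token. */
static Symbol *symtab_insert(Symbol **root, const char *name, SymbolType type)
{
    assert(root != NULL && name != NULL);
    while (*root != NULL) {
        int cmp = strcmp(name, (*root)->name);
        if (cmp == 0) return *root;               /* already present */
        root = cmp < 0 ? &(*root)->left : &(*root)->right;
    }
    Symbol *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) { free(node); return NULL; }
    strcpy(node->name, name);
    node->type = type;
    node->left = node->right = NULL;
    *root = node;
    return node;
}

/* Prints the table in alphabetical order (in-order traversal). */
static void symtab_print(const Symbol *node, FILE *out)
{
    if (node == NULL) return;
    symtab_print(node->left, out);
    fprintf(out, "%-16s %s\n", node->name,
            node->type == SYM_FUNC ? "function" : "integer");
    symtab_print(node->right, out);
}
```

Inserting the same name twice returns the same node, so every occurrence of an identifier in the source program can share one table entry. Freeing the tree at exit is omitted here for brevity.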
The implementation must be done in standard C (not in Visual C). If you do not have a standard C compiler installed on your local computer, you can install MinGW from https://www.mingw-w64.org/ or use online GDB at https://www.onlinegdb.com/ . The TA will use a make file to check your analyzer.
Example:
In the package “Example.zip”, you will find all the source files described above. Currently, they are all empty. The main function simply prints the argument on the screen. The package also contains “make.bat”. You can open it in a text editor to see the compilation commands.
After executing “make.bat” (suppose you are using Windows and have GCC installed), you will have “func.exe”, which is the compiled analyzer.
If the source program that we want to analyze is “sample.txt”, we execute “func sample.txt”.
Submission requirements
Each team needs to clearly indicate the contribution of each team member in a txt file. To submit your work, pack all files (source code and contribution txt) into one package. Rename the package as COMP3173_23F_TeamXX, where XX is your team number. Only team leaders need to upload the package to iSpace.