CS 530
Assignment 1 - Parsing Object Code Text Records
Points: 80
Due Date: Beginning of the class, 10/02/2024
Introduction
As we move on to the discussion of assemblers, you will implement a disassembler as the major programming exercise for the course. A disassembler is a program that translates machine language into assembly language. Because creating a disassembler is a fairly large task, the work is split across two assignments.
Task
This first assignment focuses on reading and parsing SIC/XE object code and printing the details of each instruction in the text records, in the order in which the instructions appear in the object code. Each instruction has an opcode (mnemonic), operand(s), and addressing mode(s). An object code file consists of a header record, text records (each containing a series of instructions), modification records (if there are any), and an end record. In this assignment, you ONLY need to focus on the text records.
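As a starting point for the parsing step, here is a minimal sketch of splitting one text record into its fields. It assumes the standard SIC/XE text record layout (the letter T, a 6-digit hexadecimal starting address, a 2-digit hexadecimal length in bytes, then the object code). The names TextRecord and parseTextRecord are hypothetical and are not required by the assignment.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical container for one parsed text record (illustrative names).
struct TextRecord {
    unsigned long startAddress;   // columns 2-7: starting address of the code
    unsigned long lengthInBytes;  // columns 8-9: object code length in bytes
    std::string   objectCode;     // columns 10+: object code as hex characters
};

// Field widths of a text record, measured in hex characters (assumed layout).
const char        TEXT_RECORD_TAG     = 'T';
const std::size_t ADDRESS_FIELD_WIDTH = 6;
const std::size_t LENGTH_FIELD_WIDTH  = 2;
const std::size_t HEX_DIGITS_PER_BYTE = 2;
const std::size_t HEADER_WIDTH = 1 + ADDRESS_FIELD_WIDTH + LENGTH_FIELD_WIDTH;
const int         HEX_BASE            = 16;

// Returns true and fills `record` when `line` is a well-formed text record.
bool parseTextRecord(const std::string& line, TextRecord& record) {
    if (line.size() < HEADER_WIDTH || line[0] != TEXT_RECORD_TAG) {
        return false;
    }
    try {
        record.startAddress =
            std::stoul(line.substr(1, ADDRESS_FIELD_WIDTH), nullptr, HEX_BASE);
        record.lengthInBytes =
            std::stoul(line.substr(1 + ADDRESS_FIELD_WIDTH, LENGTH_FIELD_WIDTH),
                       nullptr, HEX_BASE);
    } catch (const std::exception&) {
        return false;  // a field did not contain hexadecimal digits
    }
    const std::size_t expectedDigits = HEX_DIGITS_PER_BYTE * record.lengthInBytes;
    record.objectCode = line.substr(HEADER_WIDTH, expectedDigits);
    // Reject records that are shorter than their own length field claims.
    return record.objectCode.size() == expectedDigits;
}
```

Reading the file line by line and calling a function like this on each line that starts with T would leave the header, modification, and end records untouched, which matches the scope of this assignment.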
Input and Output Files
● Input File:
o Object Code file: test.obj
o This file name will be passed as a command line argument to your program.
● Output File:
o Object code structure in text form: obj_struct.txt.
▪ Note:
● Your program should produce the output file with the exact name obj_struct.txt to work with the autograder.
● The obj_struct.txt file should follow exactly what appears in the sample example_obj_struct.txt using the exact format.
o The output file will have the following 5 columns in that order: Instruction, Format, Operand Addressing Type, Target Address Addressing Mode, and Object Code.
▪ Instruction – print the mnemonic (uppercase) of the instruction: LDA, ADD, ...
▪ Format – print the format number: 2, 3, or 4. You do NOT need to consider format 1 in this assignment.
▪ Operand Addressing Type – print the addressing mode for computing the operand: simple, immediate, or indirect.
▪ Target Address Addressing Mode – print the addressing mode for computing the target address: pc, base, or absolute (equivalent to direct).
● If indexed addressing mode is used, then print pc_indexed, base_indexed, or direct_indexed.
▪ Object Code – print the object code of each instruction.
o The output file should print the first line with the column header acronyms (exactly as below):
▪ INSTR FORMAT OAT TAAM OBJ
o Columns in obj_struct.txt must be separated by spaces. The number of spaces is flexible, but a consistent column width of 16 characters is recommended so that the output format looks consistent with the autograder output.
o Some columns might be blank, such as the addressing modes of a format 2 instruction. In that case, leave the addressing mode columns blank. (It is still a good idea to add spaces to keep the columns lined up for easier reading.)
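The two addressing columns can be derived from the flag bits of a format 3/4 instruction. The sketch below assumes the usual SIC/XE bit layout: n and i are the two low bits of the first byte, and x, b, p, e form the third hexadecimal digit. It also assumes that n = i = 0 (the SIC-compatible case) is reported as simple, which the handout does not specify. All function and constant names are hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Bit masks for the flag bits of a format 3/4 instruction (assumed layout).
const unsigned int N_FLAG = 0x2;  // first byte, bit 1: indirect
const unsigned int I_FLAG = 0x1;  // first byte, bit 0: immediate
const unsigned int X_FLAG = 0x8;  // third hex digit, bit 3: indexed
const unsigned int B_FLAG = 0x4;  // third hex digit, bit 2: base relative
const unsigned int P_FLAG = 0x2;  // third hex digit, bit 1: PC relative
const int COLUMN_WIDTH = 16;      // recommended column width from the handout

// Operand Addressing Type (OAT) from the n and i bits of the first byte.
std::string operandAddressingType(unsigned int firstByte) {
    const bool n = (firstByte & N_FLAG) != 0;
    const bool i = (firstByte & I_FLAG) != 0;
    if (n && !i) return "indirect";
    if (!n && i) return "immediate";
    return "simple";  // n = i = 1; n = i = 0 is ASSUMED to be simple as well
}

// Target Address Addressing Mode (TAAM) from the x, b, p bits.
std::string targetAddressMode(unsigned int flagDigit) {
    std::string mode;
    if (flagDigit & P_FLAG)      mode = "pc";
    else if (flagDigit & B_FLAG) mode = "base";
    else                         mode = "absolute";
    if (flagDigit & X_FLAG) {
        // The handout names the indexed non-relative case direct_indexed.
        if (mode == "absolute") mode = "direct";
        mode += "_indexed";
    }
    return mode;
}

// Builds one left-aligned output row; blank fields still occupy a full column.
std::string formatRow(const std::string& instr, const std::string& format,
                      const std::string& oat, const std::string& taam,
                      const std::string& obj) {
    std::ostringstream row;
    row << std::left
        << std::setw(COLUMN_WIDTH) << instr
        << std::setw(COLUMN_WIDTH) << format
        << std::setw(COLUMN_WIDTH) << oat
        << std::setw(COLUMN_WIDTH) << taam
        << obj;
    return row.str();
}
```

For the object code 57C003, for example, the first byte 0x57 has n = i = 1 and the third digit C sets x and b, so these helpers would report simple and base_indexed. Passing empty strings for the two addressing columns of a format 2 instruction keeps the object code column aligned.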
Required Features
● Your program MUST support format 2, 3, and 4 instructions; it does NOT need to support format 1 instructions.
● Your program shall support parsing all 59 SIC/XE instructions (except format 1 instructions) and their corresponding mnemonics, for example, through the mapping between the following two arrays. Note: you do NOT have to use the following data structures in your assignment implementation.
const static string ops[] = {
    "18", "58", "90", "40", "B4", "28",
    "88", "A0", "24", "64", "9C", "C4",
    "C0", "F4", "3C", "30", "34", "38",
    "48", "00", "68", "50", "70", "08",
    "6C", "74", "04", "D0", "20", "60",
    "98", "C8", "44", "D8", "AC", "4C",
    "A4", "A8", "F0", "EC", "0C", "78",
    "54", "80", "D4", "14", "7C", "E8",
    "84", "10", "1C", "5C", "94", "B0",
    "E0", "F8", "2C", "B8", "DC"
};
const static string mnemonics[] = {
    "ADD", "ADDF", "ADDR", "AND", "CLEAR", "COMP",
    "COMPF", "COMPR", "DIV", "DIVF", "DIVR", "FIX",
    "FLOAT", "HIO", "J", "JEQ", "JGT", "JLT",
    "JSUB", "LDA", "LDB", "LDCH", "LDF", "LDL",
    "LDS", "LDT", "LDX", "LPS", "MUL", "MULF",
    "MULR", "NORM", "OR", "RD", "RMO", "RSUB",
    "SHIFTL", "SHIFTR", "SIO", "SSK", "STA", "STB",
    "STCH", "STF", "STI", "STL", "STS", "STSW",
    "STT", "STX", "SUB", "SUBF", "SUBR", "SVC",
    "TD", "TIO", "TIX", "TIXR", "WD"
};
const static bool format2[] = {
    false, false, true, false, true, false,
    false, true, false, false, true, false,
    false, false, false, false, false, false,
    false, false, false, false, false, false,
    false, false, false, false, false, false,
    true, false, false, false, true, false,
    true, true, false, false, false, false,
    false, false, false, false, false, false,
    false, false, false, false, true, true,
    false, false, false, true, false
};
Programming and testing
You must use C/C++ for this assignment. The compiler used by the autograder is gcc version 4.8.5 or version 7.
● Refer to the “C/C++ programming in Linux / Unix” page.
● You may use the C++11 standard for this assignment with the compilation flag -std=c++11; see the given Makefile.
● You are strongly encouraged to set up your local development environment under Linux (e.g., Ubuntu 18.04 or higher), develop and test your code there first, then port your code to Edoras (e.g., via FileZilla or scp) to compile and test it. The Gradescope autograder uses an environment similar to Edoras to compile and autograde your code.
Compilation and execution
Compilation: The command make will be used to compile your program. You MUST provide an appropriate Makefile; an example Makefile can be found at
Executable Name: Name your disassembler executable disassem. The autograder will fail, and you will lose points, if we need to modify your executable name to make the autograder work. Make sure to use the -o disassem option in your Makefile compilation command; the Makefile should also use the -g compile flag to enable gdb debugging.
• Before submitting, port your code to Edoras (via FileZilla or scp) to compile and test it. Note that the Gradescope autograder uses an environment similar to Edoras to compile and autograde your code.
• Executable Command Line Arguments: The disassembler should take the object code file name as its command line argument. If the object code file is not present, the program should display the message “missing the input file” and exit. The following shows how your executable will be called:
./disassem test.obj
Grading
Passing the autograder tests may NOT give you a perfect score. Satisfying the requirements (see above), your code structure, coding style, and comments are also part of the rubric (see Syllabus Course Design - assignments). Your code shall follow industry best practices:
● Comment your code appropriately. Code with no or minimal comments is automatically lowered by one grade category.
● NO hard-coded values (magic numbers, etc.).
● Use a proper code structure between .h and .c / .cpp files; do not #include .cpp files.
● Design and implement clean interfaces between modules.
Turning In
Submit the following program artifacts on Gradescope. Make sure that all submitted files contain your name and Red ID.
● Program Artifacts
o Source code files (.h, .hpp, .cpp, .C, or .cc files, etc.), Makefile.
o Submit files as they are, DO NOT compress files into a zip file.
● Single Programmer Affidavit (refer to here); no digital signature is required, type your name and Red ID as the signature.
● Number of submissions
o Please note the autograder submission count when submitting on Gradescope. For this assignment, you will be allowed 20 submissions, but future assignments will be limited to around 10 submissions. As stressed in class, you are expected to test in your own development environment instead of using the autograder to test your code. It is also your responsibility as the programmer to derive test cases from the requirement specifications rather than relying on the autograder to supply them.
Late Submission Policy
Refer to the Syllabus for the Late Submission Policy.