COMP2017 COMP9017 Assignment 2
Due: 11:59PM Tuesday 28 March 2023 local Sydney time
This assignment is worth 5% + 30% of your final assessment
Task Description
Ah, yes, because what the world really needs right now is yet another virtual machine. But not just any virtual machine, no! We need one with the highly coveted and incredibly useful feature of heap banks. Because who needs to worry about memory allocation when you can just throw it in a heap, am I right? So gather ’round, folks, and get ready to develop the vm_RISKXVII, because nothing says "cutting edge" like a virtual machine named after a board game that this assignment has absolutely nothing to do with. Now, let’s dive into the specs and get this party started!
In this assignment you will be implementing a simple virtual machine. Your program will take a single command line argument being the path to the file containing your RISK-XVII assembly code. Before attempting this assignment it would be a good idea to familiarise yourself with registers, memory space, program counters, assembly and machine code. A strong understanding of these concepts is essential to completing this assignment. Sections 3.6 and 3.7 of the course textbook describe the x86_64 architecture specifically, but you can still review them as a reference.
In order to complete this assignment at a technical level you should revise your understanding of bitwise operations, file IO, pointers and arrays.
Some implementation details are purposefully left ambiguous; you have the freedom to decide on the specifics yourself. Additionally this description does not define all possible behaviour that can be exhibited by the system; some error cases are not documented. You are expected to gracefully report and handle these errors yourself.
You are encouraged to ask questions on Ed. Make sure your question post is of the "Question" post type and is under the "Assignment" category → "A2" subcategory. As with any assignment, make sure that your work is your own, and that you do not share your code or solutions with other students.
The Architecture
In this assignment you will implement a virtual machine for a 32-bit instruction set. The memory-mapped virtual components of your machine are outlined below:
• 0x0000 - 0x3ff: Instruction Memory - Contains 2^10 bytes for the text segment.
• 0x0400 - 0x7ff: Data Memory - Contains 2^10 bytes for global variables and the function stack.
• 0x0800 - 0x8ff: Virtual Routines - Accesses to these addresses will cause special operations to be called.
• 0xb700 +: Heap Banks - Hardware-managed 128 x 64-byte banks of dynamically allocatable memory.
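The memory map above can be sketched as a small address classifier. This is only an illustration: the enum and function names are hypothetical, the heap upper bound is derived from the 128 x 64 bytes stated above, and treating every address not listed in the map as invalid is an assumption about how you might choose to handle it.

```c
#include <stdint.h>

/* Hypothetical region tags; the names are illustrative, not from the spec. */
enum region {
    REGION_INST,     /* 0x0000 - 0x03ff */
    REGION_DATA,     /* 0x0400 - 0x07ff */
    REGION_VIRTUAL,  /* 0x0800 - 0x08ff */
    REGION_HEAP,     /* 0xb700 onwards, 128 banks of 64 bytes */
    REGION_INVALID   /* assumption: anything not listed in the map */
};

#define HEAP_BASE  0xb700u
#define HEAP_BYTES (128u * 64u)

/* Map an address to the memory-mapped component that owns it. */
enum region classify_address(uint32_t addr)
{
    if (addr <= 0x03ffu) return REGION_INST;
    if (addr <= 0x07ffu) return REGION_DATA;
    if (addr <= 0x08ffu) return REGION_VIRTUAL;
    if (addr >= HEAP_BASE && addr < HEAP_BASE + HEAP_BYTES) return REGION_HEAP;
    return REGION_INVALID;
}
```

A load or store handler could switch on the returned tag before touching any backing array.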
Your machine also has a total of 32 registers, as well as a PC (program counter) that points to the address of the current instruction in memory. Each of the general-purpose registers can store 4 bytes (32 bits) of data that can be used directly as operands for instructions. All registers are general-purpose except for the first one, which has an address of 0. This register is called the zero register, as any read from it will return a value of 0. Writes to the zero register are ignored.
During execution you should not store any information about the state of the machine outside of the virtual memory devices and the register bank.
Note: A register stores a single value using a fixed bit width. The size of a register corresponds to the processor’s word size, in this case 32 bits. Think of registers as primitive variables. Physical processor hardware is constrained, and the number of registers is always fixed. There are registers which serve specific purposes, and those which are general. Please identify these in the description and consider them for your solution. You need not consider special purpose registers, such as floating point, in this assignment.
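The zero-register rule described above could be enforced in a pair of accessor functions. The sketch below is one possible arrangement; the struct layout and the names cpu, reg_read and reg_write are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32

/* Hypothetical machine state: 32 registers plus the program counter. */
struct cpu {
    uint32_t regs[NUM_REGS];
    uint32_t pc;
};

/* Reads of register 0 always yield 0. */
uint32_t reg_read(const struct cpu *cpu, unsigned idx)
{
    assert(idx < NUM_REGS);
    return idx == 0 ? 0u : cpu->regs[idx];
}

/* Writes to register 0 are silently ignored. */
void reg_write(struct cpu *cpu, unsigned idx, uint32_t value)
{
    assert(idx < NUM_REGS);
    if (idx != 0)
        cpu->regs[idx] = value;
}
```

Routing every register access through these two functions keeps the special case in one place instead of repeating it in each instruction handler.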
RISK-XVII Instruction-Set Architecture
An Instruction-Set Architecture (ISA) specifies a set of instructions that can be accepted and executed by the target machine. A program for the target machine is an ordered sequence of instructions. Our virtual machine will operate on a home-brewed ‘RISK-XVII’ instruction set architecture. During marking, you will be provided with binaries in this ISA to run on your virtual machine. RISK-XVII is a reduced version of the well-known RV32I instruction set architecture, and your virtual machine should be able to execute binary programs compiled for RV32I, as long as they do not include instructions that are not specified by ‘RISK-XVII’.
There are 33 instructions in total defined in RISK-XVII; they can be classified into three groups by functionality:
- Arithmetic and Logic Operations - e.g. add, sub, and, or, slt, sll
- Memory Access Operations - e.g. sw, lw, lui
- Program Flow Operations - e.g. jal, beq, blt
These instructions provide access to memory and perform operations on data stored in registers, as well as branching to different locations within the program to support conditional execution. Some instructions also contain data, i.e., an immediate number, within their encoding. This type of instruction is typically used to introduce hard-coded values such as constants or memory address offsets. The RISK-XVII instruction set is Turing complete and, therefore, can run any arbitrary program, just like your PC!
Instructions in the RISK-XVII instruction set architecture are encoded into 4 bytes of data. Since each instruction can access different parts of the system, six types of encoding formats were designed to best utilize the 32 bits of data to represent the operations specified by each instruction: R, I, S, SB, U, UJ. The exact binary format of each encoding type can be found in the table below.
Let’s take a look at some common fields in all types of encoding:
• opcode - used in all encodings to differentiate the operation, and even the encoding type itself.
• rd, rs1, rs2 - register specifiers. rs1 and rs2 specify registers to be used as the source operands, while rd specifies the target register. Note that since there are 32 registers in total, all register specifiers are 5 bits in length.
• func3, func7 - these are additional opcodes that specify the operation in more detail. For example, all arithmetic instructions may use the same opcode, but the actual operation, e.g. add or logical shift, is defined by the value of func3.
• imm - immediate numbers. These values can be scrambled within the instruction encoding. For example, in SB, the 11th bit of the actual value is encoded at the 7th bit of the instruction, while the 12th bit is encoded at the 31st bit.
A RISK-XVII program can be illustrated as below:
[Instruction 1 (32 bits)] [Instruction 2 (32 bits)] [Instruction 3 (32 bits)] [...] [Instruction n (32 bits)]
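The fields described above can be pulled out of a 32-bit instruction word with shifts and masks. The sketch below assumes the standard RV32I bit positions, which is a guess based on RISK-XVII being described as a reduced RV32I; check it against the encoding table before relying on it. All function names are hypothetical.

```c
#include <stdint.h>

/* Field positions assumed from the RV32I base encoding. */
uint32_t get_opcode(uint32_t inst) { return inst & 0x7fu; }
uint32_t get_rd(uint32_t inst)     { return (inst >> 7) & 0x1fu; }
uint32_t get_func3(uint32_t inst)  { return (inst >> 12) & 0x7u; }
uint32_t get_rs1(uint32_t inst)    { return (inst >> 15) & 0x1fu; }
uint32_t get_rs2(uint32_t inst)    { return (inst >> 20) & 0x1fu; }
uint32_t get_func7(uint32_t inst)  { return (inst >> 25) & 0x7fu; }

/* Sign-extend the low `bits` bits of `value` (higher bits must be 0). */
int32_t sign_extend(uint32_t value, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    return (int32_t)((value ^ sign) - sign);
}

/* I-type: the 12-bit immediate sits in instruction bits 31:20. */
int32_t imm_i(uint32_t inst)
{
    return sign_extend(inst >> 20, 12);
}

/* SB-type: the immediate is scrambled. Bit 12 of the value is at
 * instruction bit 31, bit 11 at instruction bit 7, bits 10:5 at
 * instruction bits 30:25, and bits 4:1 at instruction bits 11:8.
 * Bit 0 of the value is always 0. */
int32_t imm_sb(uint32_t inst)
{
    uint32_t imm = ((inst >> 31) & 0x1u)  << 12
                 | ((inst >> 7)  & 0x1u)  << 11
                 | ((inst >> 25) & 0x3fu) << 5
                 | ((inst >> 8)  & 0xfu)  << 1;
    return sign_extend(imm, 13);
}
```

For instance, under these assumptions the word 0xfff10093 decodes as opcode 0010011, rd = 1, rs1 = 2 and an immediate of -1.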
RISK-XVII Instructions
We will now cover in detail all instructions defined in RISK-XVII. Pay close attention, as your virtual machine needs to be able to decode and execute all of these to be eligible for full marks! You are encouraged to summarise them into a reference table before implementing your code.
Let’s use M to denote the memory space. M[i] denotes access to the memory space using address i. For example, to write an immediate value 2017 to the first word of data memory: M[0x400] = 2017. Similarly, we use R to denote the register bank, e.g. R[rd] = R[rs1] + R[rs2] denotes an operation that adds the values in rs1 and rs2, then stores the result in rd.
Arithmetic and Logic Operations
1 add
• Add - This instruction simply adds two numbers together.
• Operation - R[rd] = R[rs1] + R[rs2]
• Encoding:
  – Type: R
  – opcode = 0110011
  – func3 = 000
  – func7 = 0000000
2 addi
• Add Immediate - Add a number from register with an immediate number.
• Operation - R[rd] = R[rs1] + imm
• Encoding:
  – Type: I
  – opcode = 0010011
  – func3 = 000
3 sub
• Subtract
• Operation - R[rd] = R[rs1] - R[rs2]
• Encoding:
  – Type: R
  – opcode = 0110011
  – func3 = 000
  – func7 = 0100000
4 lui
• Load Upper Immediate - Load the upper part of an immediate number into a register and set the lower part to zeros.
• Operation - R[rd] = {31:12 = imm | 11:0 = 0}
• Encoding:
  – Type: U
  – opcode = 0110111
5 xor
• XOR
• Operation - R[rd] = R[rs1] ^ R[rs2]
• Encoding:
  – Type: R
  – opcode = 0110011
  – func3 = 100
  – func7 = 0000000
6 xori
• XOR Immediate
• Operation - R[rd] = R[rs1] ^ imm
• Encoding:
  – Type: I
  – opcode = 0010011
  – func3 = 100
......
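As a rough illustration of how the six instructions listed so far might be decoded and executed, the sketch below matches on opcode, func3 and func7 and then applies the stated operation. It assumes RV32I field positions, that the immediate of an I-type instruction is sign-extended, and that regs[0] is never written and therefore always reads as 0. The function name exec_alu is hypothetical, and the remaining instructions are not covered.

```c
#include <stdint.h>

/* Execute one of add, sub, xor, addi, xori, lui on a register array.
 * Returns 1 if the instruction was handled, 0 otherwise. */
int exec_alu(uint32_t regs[32], uint32_t inst)
{
    uint32_t opcode = inst & 0x7fu;
    uint32_t rd     = (inst >> 7) & 0x1fu;
    uint32_t func3  = (inst >> 12) & 0x7u;
    uint32_t rs1    = (inst >> 15) & 0x1fu;
    uint32_t rs2    = (inst >> 20) & 0x1fu;
    uint32_t func7  = (inst >> 25) & 0x7fu;
    /* I-type immediate (bits 31:20), sign-extended but kept unsigned
     * so that addition wraps around modulo 2^32. */
    uint32_t imm_i  = (inst >> 20) | ((inst & 0x80000000u) ? 0xfffff000u : 0u);
    uint32_t result;

    if (opcode == 0x33 && func3 == 0x0 && func7 == 0x00)       /* add  */
        result = regs[rs1] + regs[rs2];
    else if (opcode == 0x33 && func3 == 0x0 && func7 == 0x20)  /* sub  */
        result = regs[rs1] - regs[rs2];
    else if (opcode == 0x33 && func3 == 0x4 && func7 == 0x00)  /* xor  */
        result = regs[rs1] ^ regs[rs2];
    else if (opcode == 0x13 && func3 == 0x0)                   /* addi */
        result = regs[rs1] + imm_i;
    else if (opcode == 0x13 && func3 == 0x4)                   /* xori */
        result = regs[rs1] ^ imm_i;
    else if (opcode == 0x37)                                   /* lui  */
        result = inst & 0xfffff000u;
    else
        return 0;

    if (rd != 0)   /* writes to the zero register are ignored */
        regs[rd] = result;
    return 1;
}
```

A return value of 0 is where a full implementation would try the other instruction groups and, failing that, report the instruction as not implemented.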
Virtual Routines
Virtual routines are operations mapped to specific memory addresses such that a memory access operation at that address will have different effects. This can be used to allow programs running in the virtual machine to communicate with the outside world through input/output (I/O) operations. As part of your task to implement necessary I/O functions for your virtual machine, you are required to develop the following routines:
1 0x0800 - Console Write Character
A memory store command to this address will cause the virtual machine to print the value being stored as a single ASCII encoded character to stdout.
2 0x0804 - Console Write Signed Integer
A memory store command to this address will cause the virtual machine to print the value being stored as a single 32-bit signed integer in decimal format to stdout.
3 0x0808 - Console Write Unsigned Integer
A memory store command to this address will cause the virtual machine to print the value being stored as a single 32-bit unsigned integer in lower case hexadecimal format to stdout.
4 0x080C - Halt
A memory store command to this address will cause the virtual machine to halt the current running program, then output CPU Halt Requested to stdout, and exit, regardless of the value to be stored.
5 0x0812 - Console Read Character
A memory load command to this address will cause the virtual machine to scan input from stdin and treat the input as an ASCII-encoded character for the memory load result. ......
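The three console write routines above differ only in how the stored value is formatted. One way to sketch this is a helper that formats the value into a buffer, which the store handler can then write to stdout. The function name is hypothetical; using only the low byte for the character routine and emitting no newline are assumptions, since the text does not specify either. Halt and the read routines are not covered here.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format the text produced by a store of `value` to a console write
 * routine at `addr`. Returns the number of characters produced, or -1
 * if `addr` is not one of the three console write routines. */
int format_console_store(uint32_t addr, uint32_t value, char *buf, size_t cap)
{
    switch (addr) {
    case 0x0800:  /* single ASCII character (low byte, an assumption) */
        return snprintf(buf, cap, "%c", (int)(value & 0xffu));
    case 0x0804:  /* 32-bit signed integer, decimal */
        return snprintf(buf, cap, "%" PRId32, (int32_t)value);
    case 0x0808:  /* 32-bit unsigned integer, lower case hexadecimal */
        return snprintf(buf, cap, "%" PRIx32, value);
    default:
        return -1;
    }
}
```

Keeping the formatting separate from the actual write makes the routines easy to check in a unit test without capturing stdout.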
Heap Banks
One of the biggest selling points of our RISK-XVII-based virtual machine is the hardware (virtual) enabled memory allocation support in addition to the two built-in static memory banks. Programs running inside the virtual machine can request more memory by interfacing with the allocation subsystem through virtual routines.
Your virtual machine program should manage memory allocation requests and ensure that each block is owned by at most one malloc request at a time; a block has no owner only while it is unused (free). You need to manage a total of 128 memory banks, each with 64 bytes. Each memory bank is a memory device that can be accessed as a linear array of bytes. To handle allocation requests larger than the size of a single bank, multiple consecutive banks need to be searched and allocated for the request. An error is returned if it is not possible to fulfill such a memory request. The mapped address of the initial byte of the first bank is 0xb700, and there are a total of 8192 bytes of memory that can be dynamically allocated.
Specification
The below interfaces are defined for the virtual machine program:
1 0x0830 - malloc
A memory store command to this address will request a chunk of memory with the size of the value being stored to be allocated. The pointer of the allocated memory (starting address) will be stored in R[28]. If the memory cannot be allocated, R[28] should be set to zero.
2 0x0834 - free
A memory store command to this address will free a chunk of memory starting at the value being stored. If the address provided was not allocated, an illegal operation error should be raised.
Example
Let’s consider a scenario where none of the 128 banks has been allocated yet (all are free), and we need to handle a malloc request with size 270. To fulfill the request, we need to find five consecutive free banks, since 270 = 64 + 64 + 64 + 64 + 14. We will mark the first five blocks as used and return the address of the beginning of the first block: R[28] = &Block[0].
Now suppose another request comes in with size 12. We need only a single block to fulfill the request, and after a search the first free block is the sixth. We will mark the sixth block as used and return its address to complete the allocation: R[28] = &Block[5].
Hint: You are encouraged to use a linked list internally to store and maintain a record of the current allocation.
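The first-fit search in the example can be sketched as follows. Note that this sketch uses two fixed arrays rather than the linked list suggested in the hint, purely to keep the illustration short. The struct and function names are hypothetical, and rejecting a request of size 0 is an assumption, since the text does not say how it should be handled.

```c
#include <stdint.h>

#define NUM_BANKS 128u
#define BANK_SIZE 64u
#define HEAP_BASE 0xb700u

/* Hypothetical bookkeeping: used[i] marks bank i as owned, and
 * alloc_len[i] is the number of banks in the allocation that starts
 * at bank i (0 if no allocation starts there). */
struct heap {
    uint8_t used[NUM_BANKS];
    uint8_t alloc_len[NUM_BANKS];
};

/* First-fit allocation. Returns the mapped address of the first bank,
 * or 0 if the request cannot be fulfilled. */
uint32_t heap_malloc(struct heap *h, uint32_t size)
{
    if (size == 0 || size > NUM_BANKS * BANK_SIZE)
        return 0;
    uint32_t need = (size + BANK_SIZE - 1) / BANK_SIZE;  /* round up */
    uint32_t run = 0;                 /* length of the current free run */
    for (uint32_t i = 0; i < NUM_BANKS; i++) {
        run = h->used[i] ? 0 : run + 1;
        if (run == need) {
            uint32_t first = i + 1 - need;
            for (uint32_t j = first; j <= i; j++)
                h->used[j] = 1;
            h->alloc_len[first] = (uint8_t)need;
            return HEAP_BASE + first * BANK_SIZE;
        }
    }
    return 0;
}

/* Returns 1 on success, 0 if `addr` is not the start of a live
 * allocation (the caller would then raise an illegal operation). */
int heap_free(struct heap *h, uint32_t addr)
{
    if (addr < HEAP_BASE || (addr - HEAP_BASE) % BANK_SIZE != 0)
        return 0;
    uint32_t first = (addr - HEAP_BASE) / BANK_SIZE;
    if (first >= NUM_BANKS || h->alloc_len[first] == 0)
        return 0;
    for (uint32_t j = first; j < first + h->alloc_len[first]; j++)
        h->used[j] = 0;
    h->alloc_len[first] = 0;
    return 1;
}
```

With a zero-initialised struct heap, a request of size 270 followed by one of size 12 returns 0xb700 and then 0xb840, matching Block[0] and Block[5] in the example.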
Error Handling
Register Dump
When a register dump is requested, the virtual machine should print the value of all registers, including PC, in lower case hexadecimal format to stdout. The output should be one register per line in the following format:
PC = 0x00000001;
R[0] = 0xffffffff;
R[1] = 0xffffffff;
R[...] = 0xffffffff;
R[31] = 0xffffffff
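A register dump in this format could be produced with a small line formatter, sketched below. The function names are hypothetical, and ending every line (including the last) with a semicolon is an assumption; confirm the exact punctuation against the expected test output.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one dump line into `buf`. A negative `index` selects the PC
 * label; otherwise the line is labelled R[index]. Returns the number
 * of characters produced. */
int format_dump_line(char *buf, size_t cap, int index, uint32_t value)
{
    if (index < 0)
        return snprintf(buf, cap, "PC = 0x%08" PRIx32 ";", value);
    return snprintf(buf, cap, "R[%d] = 0x%08" PRIx32 ";", index, value);
}

/* Print PC followed by all 32 registers, one per line, to stdout. */
void register_dump(uint32_t pc, const uint32_t regs[32])
{
    char line[32];
    format_dump_line(line, sizeof line, -1, pc);
    puts(line);
    for (int i = 0; i < 32; i++) {
        format_dump_line(line, sizeof line, i, regs[i]);
        puts(line);
    }
}
```

The %08 width pads each value with leading zeros to eight hexadecimal digits, as in the sample output.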
Not Implemented
If an unknown instruction is detected, your virtual machine program should output Instruction Not Implemented: and the hexadecimal value of the encoded instruction to stdout, followed by a Register Dump, and then terminate. For example, if 0xffffffff were found in the instruction memory, your program should output:
Instruction Not Implemented: 0xffffffff
Illegal Operation
When an illegal operation is raised, your virtual machine program should output Illegal Operation: and the hexadecimal value of the encoded instruction to stdout, followed by a Register Dump, and then terminate.
Any memory accesses outside of defined boundaries will cause this error to be raised. Memory accesses to not yet allocated, or freed, heap banks will also cause this error to be raised.
Starting the Virtual Machine
A binary file will be provided as an image of the instruction and data memory when running the program. This file can be found by opening the file path supplied as the first command line argument. The below C code outlines the format of the binary input file as a struct:
#define INST_MEM_SIZE 1024
#define DATA_MEM_SIZE 1024
struct blob {
    char inst_mem[INST_MEM_SIZE];
    char data_mem[DATA_MEM_SIZE];
};
All registers, including PC, will be initialised to zero when the virtual machine is initialised. During each cycle, the virtual machine should fetch and execute the instruction pointed to by PC, and increase PC by 4. .....
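Loading the image and fetching an instruction might look like the sketch below. The function names load_image and fetch are hypothetical, and assembling the instruction word in little-endian byte order is an assumption carried over from RV32I; the caller is expected to have checked that pc lies within instruction memory.

```c
#include <stdint.h>
#include <stdio.h>

#define INST_MEM_SIZE 1024
#define DATA_MEM_SIZE 1024

struct blob {
    char inst_mem[INST_MEM_SIZE];
    char data_mem[DATA_MEM_SIZE];
};

/* Read the memory image at `path` into `out`.
 * Returns 0 on success, -1 if the file is missing or too short. */
int load_image(const char *path, struct blob *out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    size_t n = fread(out, sizeof *out, 1, fp);
    fclose(fp);
    return n == 1 ? 0 : -1;
}

/* Fetch the 32-bit instruction at `pc` (assumed little-endian).
 * Precondition: pc + 3 < INST_MEM_SIZE. */
uint32_t fetch(const struct blob *img, uint32_t pc)
{
    const unsigned char *p = (const unsigned char *)&img->inst_mem[pc];
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Building the word byte by byte avoids depending on the host's endianness or on the alignment of the char array.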
Compilation and Execution
Your virtual machine program will be compiled by running the default rule of a Makefile. Upon compiling, your program should produce a single vm_riskxvii binary. Your binary should accept a single argument in the form of the path to a ‘RISK-XVII’ memory image binary file to execute.
make
./vm_riskxvii
Please make sure the above commands will compile and run your program. An example Makefile has been provided in the Scaffold, but you’re encouraged to customize it to your needs. Additionally, consider implementing the project using multiple C source files and utilizing header files. Tests will be compiled and run using two make rules; make tests and make run_tests.
make tests
make run_tests
These rules should build any tests you need, then execute each test and report back on your correctness. Failing to adhere to these conventions will prevent your markers from running your code and tests. In this circumstance you will be awarded a mark of 0 for this assignment.