CSC3050 Computer Architecture - Project 2: RISC-V CPU Pipeline Simulation

RISC-V CPU Pipeline Simulation

RISC-V is an open-source architecture and instruction set standard originating from Berkeley. This project requires you to implement a RISC-V CPU pipeline simulator based on the standard five-stage pipeline. You will need to implement a subset of the instructions from the RV32I instruction set specified in RISC-V Specification 2.2. Implementing a complete CPU simulator is an effective way to exercise your systems programming skills and deepen your understanding of computer architecture.

2. Project introduction

2.1. Project requirements

The most important part of this project is to implement a RISC-V CPU pipeline simulator. The specific requirements are as follows:

• A command-line argument parser module that parses the paths to the RISC-V binary files specified on the command line. It also provides an option to enable or disable printing a history log at the end of the file. Please make sure your simulator can be run with Simulator xxx.riscv, where xxx.riscv is the path to the RISC-V binary.
• Load ELF files (already implemented in the templates).
• A history module (a reference structure is provided in the templates).
• Memory management (a reference structure is provided in the templates; it is needed by some instructions).
• Simulate the required instructions (see Section 2.7), including handling system calls (see Section 2.6).
• Handle data hazards, control hazards, and memory access hazards.

2.2. Possible Structure

NOTE: This is just a possible structure. You can design your own structure.

The overview diagram (Figure 1) of the simulator code architecture is shown below. The entry point of the simulator is Main.cpp, which parses parameters, loads the ELF file, initializes the simulator module, and finally calls the simulate() function to start execution. Unless an error occurs while the simulator is running, the simulate() function will in principle not return.

The simulator itself is designed as one large class, Simulator. The data in the Simulator class includes the PC, general registers, pipeline registers, execution history recorders, the memory module, and the branch prediction module (not required of you; it will not affect your score). Because the memory module and the branch prediction module are relatively independent, they are implemented as two separate classes, MemoryManager and BranchPredictor.

The core function of the simulator is simulate(), which performs cycle-level simulation. In each cycle, it executes five functions, fetch(), decode(), execute(), accessMemory(), and writeBack(), each of which takes as input the pipeline register from the previous cycle and writes its output to the pipeline register for the next cycle. At the end of a cycle, the contents of the new registers are copied into those used as inputs. During execution, each function handles the data hazards, control hazards, and memory access hazards relevant to it and records history information at the appropriate places.
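The per-cycle structure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the template's code: the names StageReg, PipelineRegs, ToySimulator, and step() are hypothetical, and the stages only pass a slot along instead of doing real work. The point is the double buffering: every stage reads the registers latched in the previous cycle and writes to a separate copy, which is latched at the end of the cycle.

```cpp
#include <cstdint>

// Hypothetical pipeline register; a real simulator would carry many more
// fields (decoded operands, destination register, control signals, ...).
struct StageReg {
    bool bubble = true;      // true when the slot holds no instruction
    uint32_t pc = 0;         // address of the instruction in this slot
};

struct PipelineRegs {
    StageReg f, d, e, m;     // outputs of fetch, decode, execute, memory
};

class ToySimulator {
public:
    uint32_t pc = 0;
    uint64_t cycles = 0;
    uint64_t retired = 0;    // instructions that reached write-back
    PipelineRegs cur, next;  // cur: inputs this cycle, next: outputs

    // One clock cycle. Every stage reads only `cur` and writes only `next`,
    // so the call order of the five functions does not matter.
    void step() {
        fetch();
        decode();
        execute();
        accessMemory();
        writeBack();
        cur = next;          // latch: outputs become next cycle's inputs
        ++cycles;
    }

private:
    void fetch()        { next.f.bubble = false; next.f.pc = pc; pc += 4; }
    void decode()       { next.d = cur.f; }
    void execute()      { next.e = cur.d; }
    void accessMemory() { next.m = cur.e; }
    void writeBack()    { if (!cur.m.bubble) ++retired; }
};
```

In this sketch the first instruction reaches write-back in the fifth call to step(), and one more instruction retires in every cycle after that, which is the behaviour expected of an ideal five-stage pipeline without stalls.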

Figure 1: Simulator Architecture

2.3. Memory Management

The function of MemoryManager is to provide a simple and easy-to-use memory access interface for the simulator. It must support arbitrary memory sizes and memory addresses, and it must be able to detect illegal memory accesses. What you need to do is load the different sections from the ELF file into the correct memory locations based on each section’s virtual memory address and memory size (not file size). Then, when simulating load and store instructions, you only need to compute the memory address and use it directly with the MemoryManager, without any conversion in between.

The following implementation of MemoryManager uses a mechanism similar to the two-level page table used in the x86 architecture (a single-level page table is also OK). Specifically, it logically divides the 32-bit memory space (4GB) into pages of 4KB (2^12), using the first (most significant) 10 bits of the memory address as an index into the level-one page table, the next 10 bits as an index into the level-two page table, and the last 12 bits as an offset within a single page.
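The 10/10/12 address split can be sketched as below. This is a minimal illustration, not the template's MemoryManager: the class name PagedMemory and its methods are assumptions made for the example, and it handles single bytes only. Pages are allocated on demand, and an access to an unmapped page is reported as an illegal access by throwing.

```cpp
#include <array>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Sketch of a lazily allocated two-level page table over a 32-bit address
// space: 10-bit first-level index, 10-bit second-level index, 12-bit offset.
// Class and method names are illustrative, not the template's interface.
class PagedMemory {
    static constexpr uint32_t kPageSize = 1u << 12;
    using Page = std::array<uint8_t, kPageSize>;
    using Level2 = std::array<std::unique_ptr<Page>, 1024>;
    std::array<std::unique_ptr<Level2>, 1024> level1_;

    static uint32_t l1(uint32_t addr)  { return addr >> 22; }
    static uint32_t l2(uint32_t addr)  { return (addr >> 12) & 0x3FF; }
    static uint32_t off(uint32_t addr) { return addr & 0xFFF; }

public:
    bool pageExists(uint32_t addr) const {
        const auto& table = level1_[l1(addr)];
        return table && (*table)[l2(addr)];
    }

    // Allocate the page containing addr if it is not present yet.
    void addPage(uint32_t addr) {
        auto& table = level1_[l1(addr)];
        if (!table) table = std::make_unique<Level2>();
        auto& page = (*table)[l2(addr)];
        if (!page) page = std::make_unique<Page>();  // value-initialised: all zero
    }

    uint8_t getByte(uint32_t addr) const {
        if (!pageExists(addr)) throw std::out_of_range("read from unmapped address");
        return (*(*level1_[l1(addr)])[l2(addr)])[off(addr)];
    }

    void setByte(uint32_t addr, uint8_t value) {
        if (!pageExists(addr)) throw std::out_of_range("write to unmapped address");
        (*(*level1_[l1(addr)])[l2(addr)])[off(addr)] = value;
    }
};
```

Because only the touched second-level tables and pages are allocated, mapping a few code, data, and stack pages costs a few tens of kilobytes rather than 4GB. Wider accesses (halfword, word) could be built on top of getByte and setByte in little-endian order.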

NOTE: This is a sample memory manager. You can create your own design, but you have to make sure that your ELF file is loaded into the correct position. With the above implementation, you don’t need to allocate 4GB of memory at once; you only need to allocate as needed. Of course, you can also directly allocate the memory required for loading the ELF file along with an additional stack area (Figure 2).

2.4. ELF Load and Initialization

You need to implement the ELF file loader and initialize the simulator according to Section 3. The ELF file loader is responsible for loading the ELF file into the simulator’s memory. The initialization process is responsible for setting the initial state of the simulator, including the initial value of the PC, the initial values of the general registers, and the initial value of the stack pointer.

Figure 2: Memory Layout

Figure 2 shows the typical layout of a simple computer’s program memory, with the text, various data, stack, and heap sections. The text and data segments are placed in their corresponding positions when you load the ELF file. After loading the ELF file, initialize the stack by setting the stack pointer to the top of memory and choosing a stack size as needed, for example 4MB. Heap management is typically handled by software, so you don’t need to worry about it.
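The initialization step can be sketched as follows. The concrete numbers are assumptions: the handout only says "top of memory" and gives 4MB as an example stack size, so kStackTop and kStackSize below are placeholder values, and CpuState and makeInitialState are hypothetical names. The only fixed fact used is that x2 is the stack pointer in the standard RISC-V calling convention.

```cpp
#include <array>
#include <cstdint>

// Assumed values; pick whatever fits your own memory layout.
constexpr uint32_t kStackTop  = 0x80000000u;       // hypothetical top of the stack
constexpr uint32_t kStackSize = 4u * 1024 * 1024;  // 4MB, the handout's example
constexpr int kRegSp = 2;                          // x2 is sp in the RISC-V ABI

struct CpuState {
    uint32_t pc = 0;
    std::array<uint32_t, 32> reg{};  // x0..x31, all zero initially
};

// Set the PC to the ELF entry point and point sp at the top of the stack.
// The caller is expected to map [kStackTop - kStackSize, kStackTop) first.
inline CpuState makeInitialState(uint32_t entry) {
    CpuState state;
    state.pc = entry;
    state.reg[kRegSp] = kStackTop;
    return state;
}

// Lowest address that belongs to the stack area.
inline uint32_t stackBase() { return kStackTop - kStackSize; }
```

With the page-based memory manager, the stack range would be mapped page by page between stackBase() and kStackTop before the first instruction runs.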

2.5. Simulator Implementation

For the RISC-V pipeline simulator, you need to implement the five-stage pipeline, including:

• Fetch: all instructions in the RV32I instruction set have a fixed length of 4 bytes.
• Decode: translates instructions into RISC-V assembly-format strings. In addition, it mimics hardware implementations by abstracting common fields such as op1, op2, and dest from instructions.
• Execute: executes the corresponding behavior for each instruction type. In addition, it checks for data hazards, control hazards, and memory access hazards based on the current instruction and the state of the decode stage, and takes action accordingly. At this stage, a jump or branch instruction learns whether the jump is taken, and bubbles are inserted into the pipeline registers when the wrong path has been fetched.
• Memory access: performs memory read and write operations, detects data hazards, and performs forwarding. When detecting data hazards, it needs to consider both ordinary data hazards and the situation where the pipeline stalled in the previous cycle because of a memory access hazard. The priority of forwarding sources must also be taken into account.
• Write back: writes execution results back to the registers and handles data hazards as in the previous stages.
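The field abstraction mentioned for the decode stage can be sketched as below. The bit positions follow the base instruction formats of the RISC-V specification; the struct and function names (Fields, splitFields, immI, immU, immB) are made up for this example. The sign extension relies on arithmetic right shift of signed integers, which C++20 guarantees and mainstream compilers have always provided.

```cpp
#include <cstdint>

// Field extraction for a 32-bit RV32I instruction word.
struct Fields {
    uint32_t opcode, rd, funct3, rs1, rs2, funct7;
};

inline Fields splitFields(uint32_t inst) {
    return Fields{
        inst & 0x7F,           // bits 6..0
        (inst >> 7) & 0x1F,    // bits 11..7
        (inst >> 12) & 0x7,    // bits 14..12
        (inst >> 15) & 0x1F,   // bits 19..15
        (inst >> 20) & 0x1F,   // bits 24..20
        (inst >> 25) & 0x7F    // bits 31..25
    };
}

// I-type immediate: bits 31..20, sign-extended.
inline int32_t immI(uint32_t inst) {
    return static_cast<int32_t>(inst) >> 20;
}

// U-type immediate (lui, auipc): bits 31..12 kept in the upper 20 bits.
inline int32_t immU(uint32_t inst) {
    return static_cast<int32_t>(inst & 0xFFFFF000u);
}

// B-type immediate (branches): imm[12|10:5] in bits 31..25 and
// imm[4:1|11] in bits 11..7; bit 0 of the offset is always zero.
inline int32_t immB(uint32_t inst) {
    uint32_t imm = ((inst >> 31) & 0x1) << 12 |
                   ((inst >> 7) & 0x1) << 11 |
                   ((inst >> 25) & 0x3F) << 5 |
                   ((inst >> 8) & 0xF) << 1;
    return static_cast<int32_t>(imm << 19) >> 19;  // sign-extend 13 bits
}
```

As a check against the loading example in Section 3.3, the word 0x00003197 splits into opcode 0x17 (auipc) with rd = 3 and a U-type immediate of 0x3000. The S-type and J-type immediates are assembled in the same style from their own bit layouts.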

For RISC-V pipeline hazards, I recommend that you refer to the following links:
• RISC-V Pipeline Hazards from Berkeley
• RISC-V Pipeline Hazards from Washington
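The forwarding priority and the load-use stall discussed in Section 2.5 can be sketched as two small functions. This is one common textbook arrangement, not necessarily the one used in the template: Producer, forward, and mustStall are hypothetical names, and the sketch assumes that the consumer is in the execute stage while older instructions sit in the memory and write-back stages.

```cpp
#include <cstdint>

// Hypothetical view of what an older instruction exposes for forwarding.
struct Producer {
    bool writesReg;   // does this instruction write a register?
    bool isLoad;      // a load's value only exists after memory access
    uint32_t rd;      // destination register index
    int32_t value;    // result to forward
};

// Pick the value of source register `rs` for the instruction in execute.
// Priority: the youngest producer wins, so the instruction one stage ahead
// (memory) is checked before the one two stages ahead (write-back).
// Load-use cases are assumed to have been removed by mustStall() below.
inline int32_t forward(uint32_t rs, int32_t regFileValue,
                       const Producer& inMem, const Producer& inWb) {
    if (rs == 0) return 0;  // x0 is hard-wired to zero
    if (inMem.writesReg && inMem.rd == rs) return inMem.value;
    if (inWb.writesReg && inWb.rd == rs) return inWb.value;
    return regFileValue;
}

// Load-use hazard: the instruction in decode needs a register that a load
// currently in execute has not read from memory yet, so decode must wait
// one cycle (a bubble is inserted behind the load).
inline bool mustStall(uint32_t rs1, uint32_t rs2, const Producer& inExecute) {
    return inExecute.isLoad && inExecute.writesReg && inExecute.rd != 0 &&
           (inExecute.rd == rs1 || inExecute.rd == rs2);
}
```

Checking the memory stage first matters when two in-flight instructions write the same register: the later one in program order holds the value the consumer should see.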

2.6. System Call

This project uses the following system calls. A system call is triggered by the ecall instruction, and the a0 and a7 registers are used to pass the system call number, the parameter, and the return value.

| System Call Name | System Call Number | Parameter | Return Value |
| --- | --- | --- | --- |
| Print string | 0 | The initial address of string | None |
| Print char | 1 | The value of char | None |
| Print number | 2 | The value of number | None |
| Exit program | 3 | None | None |
| Read char | 4 | None | The value of char |
| Read number | 5 | None | The value of number |

Detailed information about the system calls can be found in test-release/lib.c.
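A dispatcher for the six calls in the table could look like the sketch below. It rests on an assumption that should be checked against test-release/lib.c: a7 is taken to hold the system call number, and a0 to hold the parameter and receive the return value. The names SyscallResult, handleEcall, and readByte are hypothetical, and the streams are passed in so that the function can be exercised without a terminal.

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <sstream>   // only needed when driving the function from string streams
#include <stdexcept>
#include <string>

struct SyscallResult {
    bool exit = false;  // true when the program asked to terminate
    int32_t a0 = 0;     // value to write back to a0
};

// Assumed convention: a7 = system call number, a0 = parameter / return value.
// readByte is a stand-in for the simulator's memory read function.
inline SyscallResult handleEcall(int32_t a7, int32_t a0,
                                 const std::function<uint8_t(uint32_t)>& readByte,
                                 std::istream& in, std::ostream& out) {
    SyscallResult result;
    result.a0 = a0;  // calls without a return value leave a0 unchanged
    switch (a7) {
    case 0: {  // print string: a0 is the address of a NUL-terminated string
        uint32_t addr = static_cast<uint32_t>(a0);
        for (uint8_t c = readByte(addr); c != 0; c = readByte(++addr))
            out << static_cast<char>(c);
        break;
    }
    case 1: out << static_cast<char>(a0); break;                 // print char
    case 2: out << a0; break;                                    // print number
    case 3: result.exit = true; break;                           // exit program
    case 4: { char c = 0; in.get(c); result.a0 = c; break; }     // read char
    case 5: { int32_t n = 0; in >> n; result.a0 = n; break; }    // read number
    default:
        throw std::runtime_error("unknown system call number");
    }
    return result;
}
```

In a simulator this would typically be called with std::cin and std::cout, and the pipeline would stop fetching once the returned exit flag is set.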

2.7. Required Instructions

The following list contains the instructions that you need to implement in the simulator. You can refer to the RISC-V Specification 2.2 for detailed information about these instructions.

"lui",  "auipc", "jal",   "jalr",  "beq",   "bne",  "blt",  "bge",  "bltu",
"bgeu", "lb",    "lh",    "lw",    "ld",    "lbu",  "lhu",  "sb",   "sh",
"sw",   "sd",    "addi",  "slti",  "sltiu", "xori", "ori",  "andi", "slli",
"srli", "srai",  "add",   "sub",   "sll",   "slt",  "sltu", "xor",  "srl",
"sra",  "or",    "and",   "ecall"

2.8. History

The simulator needs to record the number of cycles and the number of instructions executed during the simulation, and output both numbers when the input parameters indicate that they should be printed.
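A history recorder for these two counters can be as small as the sketch below. The struct name History, the derived CPI value, and the exact output text are assumptions for illustration; the handout only requires that the cycle count and instruction count be recorded and printed on request.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

struct History {
    uint64_t cycles = 0;        // incremented once per simulated cycle
    uint64_t instructions = 0;  // incremented once per completed instruction

    // Cycles per instruction; reported as 0 when nothing has executed.
    double cpi() const {
        return instructions == 0
                   ? 0.0
                   : static_cast<double>(cycles) / static_cast<double>(instructions);
    }

    // Hypothetical report format; adapt it to whatever your tests expect.
    std::string summary() const {
        std::ostringstream os;
        os << "Cycles: " << cycles << "\n"
           << "Instructions: " << instructions << "\n"
           << "CPI: " << cpi() << "\n";
        return os.str();
    }
};
```

Counting an instruction only when it reaches write-back (rather than when it is fetched) keeps squashed wrong-path instructions and bubbles out of the total.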

2.9. Advanced Features

You can implement the following advanced features to improve the simulator:

• Implement a branch prediction module to improve the performance of the simulator.
• Implement a cache module to improve the performance of the simulator. This will be related to the next project.
• Implement an out-of-order execution module to improve the performance of the simulator.
• Some other advanced features that you are interested in.

NOTE: This part is not tested by the test scripts, but you need to implement it and provide the usage of it in your ReadMe.md. Please make sure your ReadMe is clear and detailed.

NOTE: These advanced features are not necessary for this project, and they will not affect your score. If you have interest, you can implement them.

2.10. Test Cases

We provide some test cases for you to verify your simulator. You can find them in the test-release directory. We also have other programs to further verify your simulator; all of these test cases will be part of your final score.

How to run the test cases:

• Download the test-release.zip file from the course platform.
• Unzip the test-release.zip file in the root directory of your project. You will get the test-release directory and run-test-release.sh.
• Run the run-test-release.sh script in the root directory of your project, for example with bash run-test-release.sh.

Please make sure your executable file is named Simulator and is located in the build directory.

The example output:
```
> bash run-test-release.sh
Comparing ./test-release/add.out and ./test-release/add.ref
Succeed! Files ./test-release/add.out ./test-release/add.ref are the same
Comparing ./test-release/mul-div.out and ./test-release/mul-div.ref
Succeed! Files ./test-release/mul-div.out ./test-release/mul-div.ref are the same
Comparing ./test-release/n!.out and ./test-release/n!.ref
Succeed! Files ./test-release/n!.out ./test-release/n!.ref are the same
Comparing ./test-release/qsort.out and ./test-release/qsort.ref
Succeed! Files ./test-release/qsort.out ./test-release/qsort.ref are the same
Comparing ./test-release/simple-function.out and ./test-release/simple-function.ref
Succeed! Files ./test-release/simple-function.out ./test-release/simple-function.ref are the same
5 / 5 tests pass!
```

3. ELF File Loader

3.1. ELF File Format

There are three main types of object files in the ELF (Executable and Linking Format) format:
• Relocatable file: holds code and data suitable for linking with other object files.
• Executable file: holds a program suitable for execution.
• Shared object file: holds code and data suitable for linking in two contexts.

Object files participate in program linking (building a program) and program execution (running a program). For convenience and efficiency, the object file format provides parallel views of a file’s contents, reflecting the differing needs of these activities. Figure 3 shows the basic structure of an ELF object file.

Figure 3: Object File Format

Sections in the object file format:
• Sections are used during the compilation and linking process.
• They represent different types of data within the ELF file, such as code (.text), initialized data (.data), uninitialized data (.bss), the symbol table (.symtab), the string table (.strtab), relocation information (.rel.text, .rel.data), and debugging information.
• Sections contain information that is useful for linking and for debugging, but they are not necessarily loaded into memory when the program is executed.
• The ELF file contains a section header table that lists all sections and their attributes.

Segments in the object file format:
• Segments are used during the execution process.
• They are typically a collection of sections that need to be loaded into memory as a unit.

In summary, sections are for organization and use during compilation and linking, while segments are for mapping the ELF file into memory during execution. An object file segment contains one or more sections (see “Segment Contents” in the ELF specification).

3.2. Program Loading

As the system creates or augments a process image, it logically copies a file’s segment to a virtual memory segment. Virtual addresses and file offsets for SYSTEM V architecture segments are congruent modulo 4KB (0x1000) or larger powers of 2, which means that when you divide the virtual address and the file offset by 4KB, the remainders are the same. Because 4KB is the maximum page size, the files will be suitable for mapping regardless of physical page size. Figure 4 shows the basic structure of an ELF executable file.

Figure 4: Executable File

Although the example’s file offsets and virtual addresses are congruent modulo 4KB for both text and data, up to four file pages hold impure text or data (depending on page size and file system block size).
• The first text page contains the ELF header, the program header table, and other information.
• The last text page holds a copy of the beginning of data.
• The first data page has a copy of the end of text.
• The last data page may contain file information not relevant to the running process.

Figure 5: Process Image Segments

Logically, the system enforces the memory permissions as if each segment were complete and separate; segments’ addresses are adjusted to ensure each logical page in the address space has a single set of permissions. In the example (Figure 4) above, the region of the file holding the end of text and the beginning of data will be mapped twice: at one virtual address for text and at a different virtual address for data.

The end of the data segment requires special handling for uninitialized data, which the system defines to begin with zero values. Uninitialized data is often referred to as the .bss segment (Block Started by Symbol): a portion of a program’s memory that is reserved for variables that have not been given an explicit initial value by the programmer. Thus, if a file’s last data page includes information not in the logical memory page, the extraneous data must be set to zero, not the unknown contents of the executable file. “Impurities” in the other three pages are not logically part of the process image; whether the system expunges them is unspecified. The memory image (Figure 5) for this program follows, assuming 4KB (0x1000) pages.

3.3. Program Loading Example

Here is an example of loading an ELF file into memory. The following is the output of the simulator when loading the add.riscv file. You need to allocate memory for segments according to MSize (memory size). For addresses beyond FSize (file size), you need to fill the memory with 0.

In this project, you do not need to care about “Impurities” in the pages. Just deal with the uninitialized data.

> ./Simulator ../test-inclass/add.riscv -s -v
==========ELF Information==========
Type: ELF32
Encoding: Little Endian
ISA: RISC-V(0xf3)
Number of Sections: 14
ID    Name               Address  Size
[0]                      0x0      0
[1]   .text              0x100e8  8636
[2]   .eh_frame          0x13000  4
[3]   .init_array        0x13008  16
[4]   .fini_array        0x13018  8
[5]   .data              0x13020  2472
[6]   .sdata             0x139c8  32
[7]   .sbss              0x139e8  56
[8]   .bss               0x13a20  1416
[9]   .comment           0x0      45
[10]  .riscv.attributes  0x0      28
[11]  .symtab            0x0      4632
[12]  .strtab            0x0      1478
[13]  .shstrtab          0x0      118
Number of Segments: 3
ID   Flags  Address  FSize  MSize
[0]  0x4    0x0      28     0
[1]  0x5    0x10000  8868   8868
[2]  0x6    0x13000  2536   4008
===================================
Memory Pages:
0x0-0x400000:
  0x10000-0x11000
  0x11000-0x12000
  0x12000-0x13000
  0x13000-0x14000
Fetched instruction 0x00003197 at address 0x1012c
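The MSize and FSize rule can be sketched as a loop over the segment's memory range. In the listing above, segment [2] has FSize 2536 and MSize 4008, so the last 4008 - 2536 = 1472 bytes (the 56 bytes of .sbss plus the 1416 bytes of .bss) must be zero-filled. The names Segment, ByteMemory, and loadSegment are hypothetical, and a std::map stands in for the real memory manager to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical description of one loadable segment (one program header).
struct Segment {
    uint32_t vaddr;   // virtual address where the segment starts
    uint32_t offset;  // offset of the segment's data inside the file
    uint32_t fsize;   // bytes present in the file (FSize)
    uint32_t msize;   // bytes occupied in memory (MSize), msize >= fsize
};

using ByteMemory = std::map<uint32_t, uint8_t>;  // stand-in for the real memory

// Copy the first fsize bytes from the file and zero-fill up to msize.
inline void loadSegment(const std::vector<uint8_t>& file, const Segment& seg,
                        ByteMemory& mem) {
    for (uint32_t i = 0; i < seg.msize; ++i) {
        uint8_t byte = 0;  // default for the uninitialized (.bss) part
        const std::size_t pos = static_cast<std::size_t>(seg.offset) + i;
        if (i < seg.fsize && pos < file.size())
            byte = file[pos];
        mem[seg.vaddr + i] = byte;
    }
}
```

Bytes that follow the segment's file data are deliberately not copied, even when they exist in the file, which is what keeps the zero-initialized region clean.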

3.4. Some other information

You can find the entry point in the ELF header (the e_entry field). It gives the virtual address to which the system first transfers control, thus starting the process. If the file has no associated entry point, this field holds zero.
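For readers writing their own loader, reading e_entry by hand can be sketched as below. The offsets come from the ELF32 header layout (e_ident occupies bytes 0 to 15, with the class at byte 4 and the data encoding at byte 5; e_entry is the 4-byte field at offset 24; the whole header is 52 bytes). The function names readLe32 and elf32Entry are made up for this example, and only 32-bit little-endian files are accepted.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Read a little-endian 32-bit value from a byte buffer.
inline uint32_t readLe32(const std::vector<uint8_t>& bytes, std::size_t off) {
    if (off + 4 > bytes.size()) throw std::out_of_range("ELF header truncated");
    return static_cast<uint32_t>(bytes[off]) |
           static_cast<uint32_t>(bytes[off + 1]) << 8 |
           static_cast<uint32_t>(bytes[off + 2]) << 16 |
           static_cast<uint32_t>(bytes[off + 3]) << 24;
}

// Return e_entry from a 32-bit little-endian ELF image held in memory.
inline uint32_t elf32Entry(const std::vector<uint8_t>& image) {
    if (image.size() < 52)
        throw std::runtime_error("too short for an ELF32 header");
    if (image[0] != 0x7F || image[1] != 'E' || image[2] != 'L' || image[3] != 'F')
        throw std::runtime_error("bad ELF magic");
    if (image[4] != 1) throw std::runtime_error("not ELFCLASS32");
    if (image[5] != 1) throw std::runtime_error("not little-endian");
    return readLe32(image, 24);  // e_entry
}
```

Assembling the value byte by byte keeps the code independent of the host machine's endianness, which matters if the simulator is ever built on a big-endian host.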

You have the option to create your own ELF file loader or utilize existing libraries like elfio. Your choice won’t affect your score, but I recommend writing it yourself for a better understanding of the ELF file format and program loading process.

For more information about the ELF file format, you can refer to the following link: cmu-elf

NOTE: We have provided a sample ELF file loader. You may use it as is or modify it to suit your needs. Please ensure you understand it before using it.

NOTE: All ELF files used for testing are little-endian. Be sure to handle the file’s endianness accordingly.

4. Submission

For this project, you must use C, C++, or Rust to implement the simulator. If you use Python, you will get a score of 0. You need to submit the following files:

• src/*: all source code files
• include/*: all header files, if you use C/C++
• CMakeLists.txt: the CMake file for your project, if you use C/C++
• Cargo.toml: the Cargo file for your project, if you use Rust
• ReadMe.md: a brief introduction to your project, including the usage of your simulator, the implementation details of your simulator, the history information of your simulator, and how to compile and run your project
• test-release/*: all test cases provided by us; do not change the file names
• build.sh: a script to build your project, which should be able to compile your project just by running bash build.sh

Please compress all files into a single zip file and submit it to the course platform. The file name should be your student ID, like xxxxxxxxx.zip.

Please ensure that your simulator can be compiled by cmake with gcc/g++ or by cargo. Other tools are not acceptable.

5. Grading

For this assignment, you are to submit a RISC-V CPU pipeline simulator. If you have difficulty completing this, you may submit a sequential version of the simulator; however, you will receive at most 30% of the score.

The overall score will be calculated as follows:
• Test cases that are not provided: 45%
• Provided test cases: 25%
• History (see Section 2.8): 10%
• ReadMe.md: 10%
• Code style and comments: 10%
• Advanced features (bonus): 5%

Some matters need attention:

• The code should be well-structured and easy to understand.
• The ReadMe.md should be clear and easy to understand. Please provide a detailed introduction to your simulator, including the usage of your simulator, the implementation details of your simulator, the history information of your simulator, how to compile and run your project, and other information that you consider important.
• Please make sure your project can be compiled and run on the Linux platform. If your project cannot be compiled and run, you will receive a score of 0.
• Do not plagiarize. If we discover that you have plagiarized, you will not only receive a score of zero for this project, but you will also fail this course directly. Additionally, we will report your actions to the Registry office. If plagiarism-detection software flags your submission and the TA confirms that you have indeed plagiarized, we will notify you via email.
• Please ensure that your project can pass the aforementioned test scripts (Section 2.10); we will provide you with some example tests. If your project does not yield the expected output, it will initially receive a score of zero. Following that, you may contact us with your test scripts, but the score for the code style and comments part of your project will be 0.

6. Development Environment

6.1. RISC-V Environment Installation and Configuration

For convenience, this experiment is based entirely on the RV32I instruction set, with reference to the RISC-V Specification 2.2.

The following steps were taken to configure the environment:
- Downloaded riscv-tools from GitHub, then configured, compiled, and installed riscv-gnu-toolchain for the Linux platform.
- To use the official simulator as a reference, downloaded, compiled, and installed riscv-qemu from GitHub.

Note that when compiling riscv-gnu-toolchain, you must specify that the toolchain and the C standard library use the RV32I instruction set. Otherwise, the compiler will use extended instruction sets such as RV32C and RV32D during compilation. Even if the compiler is told at compile time to use only RV32I instructions, it will still link in standard library functions that use extended instruction sets.

Therefore, to get an ELF program that uses only RV32I standard instructions, you must rebuild riscv-gnu-toolchain with the following options:

mkdir build; cd build
../configure --with-arch=rv32i --prefix=/path/to/riscv32i
make -j$(nproc)

When compiling, use -march=rv32i to make the compiler generate ELF programs for the RV32I standard instruction set:

riscv32-unknown-elf-gcc -march=rv32i add.c lib.c -o add.riscv

Disassemble the ELF program with the following command:

riscv32-unknown-elf-objdump -D add.riscv > add.s
