[2022] COMPSCI 711: Parallel and Distributed Computing - Final Exam - Q7 Instruction Pipeline

Question 7

Consider the following code fragment, which "folds" a linear array xa.
```
 const int MAXSIZE = 20000;
 double fold, xa[MAXSIZE];
 ...
 for (i = 0; i < MAXSIZE / 2; ++i) {
     fold += xa[i] * xa[MAXSIZE - 1 - i];
 }
```
This code is to be executed on a superscalar processor with 4 functional units. The processor can perform one floating-point operation and one integer operation per clock cycle. One of these operations may be a memory access (a load or a store), and the integer operation may be a branch. The following table gives the latency and repeat rate of some instructions supported by the processor.

| Instruction | Latency (cycles) | Repeat rate (cycles) |
|---|---|---|
| Integer add/sub/logical/branch | 1 | 1 |
| Integer load/store | 2 | 1 |
| Floating-point load/store | 3 | 1 |
| Floating-point add/sub/multiply | 2 | 1 |
| Floating-point multiply-add | 3 | 1 |
| Floating-point division | 19 | 2 |

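
To make the two columns of the table concrete, here is a minimal sketch of how they combine when several operations of one kind are issued back to back. It assumes the usual reading of the terms, which the question itself does not spell out: latency is the number of cycles from issue until the result can be used, and repeat rate is the minimum number of cycles between issuing two such operations. The function names `finish_independent` and `finish_dependent` are hypothetical helpers, not part of the exam.

```cpp
#include <algorithm>
#include <cassert>

// Assumed meaning of the table columns (not stated in the question):
//   latency     = cycles from issue until the result is usable
//   repeat_rate = minimum cycles between two issues of this operation

// Cycle at which the last of `count` independent operations has its result,
// with the first one issued at cycle 0. Independent operations overlap in
// the pipeline, so only the repeat rate separates consecutive issues.
int finish_independent(int count, int latency, int repeat_rate) {
    assert(count >= 1);
    return (count - 1) * repeat_rate + latency;
}

// Same, but each operation consumes the previous result, so the next one
// cannot issue until the full latency (or the repeat rate, if larger)
// has elapsed.
int finish_dependent(int count, int latency, int repeat_rate) {
    assert(count >= 1);
    return (count - 1) * std::max(latency, repeat_rate) + latency;
}
```

With the division row of the table (latency 19, repeat rate 2), `finish_independent(3, 19, 2)` gives 23 while `finish_dependent(3, 19, 2)` gives 57. The same distinction matters for the loop: every update of `fold` depends on the previous one, whereas the loads from `xa` are independent of each other.
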
1. Write down the instructions involved in the execution of the loop, and the instruction schedule for one iteration. Give the flops per cycle for the schedule, and compute the functional unit utilization.
2. Unroll the loop once, and schedule the instructions. Give the flops per cycle for the schedule, and compute the functional unit utilization.
3. Unroll the loop twice, and schedule the instructions. Give the flops per cycle for the schedule, and compute the functional unit utilization.
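
As a starting point for the three parts, the unrolling can be sketched at the source level, together with the two metrics the question asks for. This is only an illustration under stated assumptions, not the exam's model answer: "unrolled once" is taken to mean two copies of the loop body per trip and "unrolled twice" four copies (some courses read the latter as three), utilization is taken to be issued operations divided by functional units times cycles, and all function names are hypothetical. A single accumulator is kept so the dependency on `fold` is the same as in the original fragment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Baseline: one multiply-add per trip, as in the original fragment.
double fold_x1(const std::vector<double>& xa) {
    const std::size_t n = xa.size();
    double fold = 0.0;
    for (std::size_t i = 0; i < n / 2; ++i) {
        fold += xa[i] * xa[n - 1 - i];
    }
    return fold;
}

// Unrolled once (assumed: two copies of the body per trip).
// Requires the trip count n / 2 to be a multiple of 2, so no clean-up loop.
double fold_x2(const std::vector<double>& xa) {
    const std::size_t n = xa.size();
    assert((n / 2) % 2 == 0);
    double fold = 0.0;
    for (std::size_t i = 0; i < n / 2; i += 2) {
        fold += xa[i]     * xa[n - 1 - i];
        fold += xa[i + 1] * xa[n - 2 - i];
    }
    return fold;
}

// Unrolled twice (assumed: four copies of the body per trip).
// Requires the trip count n / 2 to be a multiple of 4.
double fold_x4(const std::vector<double>& xa) {
    const std::size_t n = xa.size();
    assert((n / 2) % 4 == 0);
    double fold = 0.0;
    for (std::size_t i = 0; i < n / 2; i += 4) {
        fold += xa[i]     * xa[n - 1 - i];
        fold += xa[i + 1] * xa[n - 2 - i];
        fold += xa[i + 2] * xa[n - 3 - i];
        fold += xa[i + 3] * xa[n - 4 - i];
    }
    return fold;
}

// Floating-point operations completed per cycle of the schedule.
double flops_per_cycle(int flops, int cycles) {
    return static_cast<double>(flops) / cycles;
}

// Assumed definition: fraction of (unit, cycle) issue slots that are used.
double utilization(int issued_ops, int units, int cycles) {
    return static_cast<double>(issued_ops) /
           (static_cast<double>(units) * cycles);
}
```

With MAXSIZE = 20000 the trip count is 10000, which is divisible by both 2 and 4, so neither unrolled version needs a remainder loop. Unrolling leaves the number of loads and multiply-adds per element pair unchanged; what it reduces is the index-update and branch overhead per pair, which is why the schedules in parts 2 and 3 can differ from part 1. The metric helpers only do the arithmetic once a schedule has been drawn: the counts of flops, issued operations, and cycles have to come from the schedule itself, and whether a multiply-add counts as one flop or two is a convention to state in the answer.
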
  
  Solution CourseNana.COM

Instruction	Latency	Repeat rate
Integer add/sub/logical/branch	1	1
Integer load/store	2	1
Floating-point load/store	3	1
Floating-point add/sub/multiply	2	1
Floating-point multiply-add	3	1
Floating-point division	19	2
