COMP328: High Performance Computing - Continuous Assessment: Laplace solver

Continuous Assessment

In this assignment, you are asked to implement a numerical method for solving a partial differential equation. This document explains the operation in detail, so you do not need to have studied calculus. You are encouraged to begin work on this as soon as possible to avoid the queue times on Barkla closer to the deadline. We would be happy to clarify anything you do not understand in this document.

1 Laplace solver

Modelling heat transfer in a room can be done by using the Laplace equation, a second-order partial differential equation. This can be approximated using an iterative stencil method. Consider this two-dimensional array, which represents a 25 m² room:

10 10 10 10 10 10 10 10 10  10
10 10 10 10 10 10 10 10 10  10
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10 100
10 10 10 10 10 10 10 10 10  10
10 10 10 10 10 10 10 10 10  10

Figure 1: An example of the room

This two-dimensional array represents the space in the room, where the dimensions are N x N. Each element in this array represents the temperature at that point within the room. The boundaries of the array represent the walls. The points equal to 100 represent a radiator within the room. The radiator always occupies 60% of the right wall and is centred. That is, the radiator starts at t[N-1][floor((N-1)*0.3)] and ends at t[N-1][ceil((N-1)*0.7)], assuming 0-based indexing. Note that the room will always be 25 m². That means the number of points only changes the resolution of the room, not its actual size.

To model how the heat from the radiator moves throughout the room, we use the following calculation for each point.

curr_t[i][j] = AVERAGE(prev_t[i][j+1], prev_t[i][j-1], prev_t[i+1][j], prev_t[i-1][j])

Figure 2: The iterative calculation to find the temperature moving through the room

That is, each point is equal to the average of the four surrounding points. When applying this to Figure 1, we have these new temperatures. Figure 3 is after the first iteration, and Figure 4 is after the second iteration.

10 10 10 10 10 10 10 10   10  10
10 10 10 10 10 10 10 10   10  10
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10 32.5 100
10 10 10 10 10 10 10 10   10  10
10 10 10 10 10 10 10 10   10  10

Figure 3: An example of the room after one iteration

10 10 10 10 10 10 10     10     10  10
10 10 10 10 10 10 10     10 15.625  10
10 10 10 10 10 10 10 15.625 38.125 100
10 10 10 10 10 10 10 15.625  43.75 100
10 10 10 10 10 10 10 15.625  43.75 100
10 10 10 10 10 10 10 15.625  43.75 100
10 10 10 10 10 10 10 15.625  43.75 100
10 10 10 10 10 10 10 15.625 38.125 100
10 10 10 10 10 10 10     10 15.625  10
10 10 10 10 10 10 10     10     10  10

Figure 4: An example of the room after two iterations

Note that the boundary elements are never updated. They lack a neighbour on at least one side, so updating them would access memory that does not exist. Only the elements from the second row to the second-to-last row (rows 1 to N-2) and from the second column to the second-to-last column (columns 1 to N-2) should be modified.
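As an illustration of the update rule and the interior-only index range, here is a minimal sketch of one sweep. The function name stencil_sweep and the flat row-major storage (element [i][j] at index i*N + j) are assumptions made for this example, not requirements of the assignment.

```c
/* One stencil sweep over an N x N grid stored row-major in flat arrays.
 * Reads only from prev and writes only to curr. Boundary cells of curr
 * are never written, so curr must already hold the same wall and
 * radiator values as prev before the first sweep. */
void stencil_sweep(int N, const double *prev, double *curr)
{
    for (int i = 1; i < N - 1; i++) {          /* rows 1 .. N-2 */
        for (int j = 1; j < N - 1; j++) {      /* columns 1 .. N-2 */
            curr[i * N + j] = 0.25 * (prev[i * N + (j + 1)] +
                                      prev[i * N + (j - 1)] +
                                      prev[(i + 1) * N + j] +
                                      prev[(i - 1) * N + j]);
        }
    }
}
```

Applied once to the room in Figure 1, a point next to the radiator becomes (10 + 10 + 10 + 100) / 4 = 32.5, matching Figure 3.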

1.1 OpenMP Implementation

You are asked to implement this operation in a C function with the following signature. This function should be saved in a file called heat.c.

double *get_final_temperatures(int N, int maxIter, double *radTemps, int numTemps){
    double *results = (double*)malloc(numTemps * sizeof(double));
    for(int temps = 0; temps < numTemps; temps++){
        // ... yourCodeHere
        int pointx = floor((N-1)*0.5);
        int pointy = floor((N-1)*0.5);
        double result = curr_t[pointx][pointy];
        results[temps] = result;
    }
    return results;
}

N is the number of points along one axis of the room matrix and maxIter is the maximum number of iterations to be performed in one run. curr_t[pointx][pointy] is the centre of the room after the final iteration has been performed. radTemps is an array of temperatures for the radiator to be set to and numTemps is the number of temperatures to test. The for loop should loop through every temperature in the radTemps array and test how the heat dissipates through an unheated room. The function therefore returns the temperature at the centre of the room for each radiator temperature.
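One possible way to use OpenMP inside this function is to share the rows of a single stencil sweep between threads. The sketch below is an assumption about how that could look, not the required strategy: the name stencil_sweep_omp and the flat row-major arrays are hypothetical choices.

```c
/* A stencil sweep with the row loop shared between OpenMP threads.
 * Every (i, j) writes a distinct element of curr and only reads prev,
 * so the iterations are independent and need no synchronisation.
 * If the file is compiled without OpenMP enabled, the pragma is ignored
 * and the loop simply runs serially. */
void stencil_sweep_omp(int N, const double *prev, double *curr)
{
    #pragma omp parallel for schedule(static)
    for (int i = 1; i < N - 1; i++) {
        for (int j = 1; j < N - 1; j++) {
            curr[i * N + j] = 0.25 * (prev[i * N + (j + 1)] +
                                      prev[i * N + (j - 1)] +
                                      prev[(i + 1) * N + j] +
                                      prev[(i - 1) * N + j]);
        }
    }
}
```

An alternative is to parallelise the outer loop over radiator temperatures, since each temperature is simulated independently. Which choice scales better depends on N, numTemps and the thread count, so it is worth measuring both.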

1.2 Serial Implementation

You are asked to implement a sequential main C file which can do the following.

• Read a string of radiator temperatures from an input file and store it in a one-dimensional array.
• Call the get_final_temperatures() function.
• Store the results in a one-dimensional array.
• Write the results to an output file.

Once compiled, the sequential program should be called like so:

$ ./<program> <N> <max iter> <input file name> <output file name>

Where <program> is the executable, <N> is the size of N (defines the number of points in the room), <max iter> is the number of iterations to be performed for each radiator temperature, <input file name> is the name of the input file, and <output file name> is the name of the output file.
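The numeric arguments arrive as strings, so the main file has to convert and validate them. A minimal sketch follows, assuming a hypothetical helper named parse_positive_int; the assignment does not prescribe how arguments are checked.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a strictly positive int from a command-line argument.
 * Returns 0 on success and stores the value in *out.
 * Returns -1 for empty input, trailing junk, overflow or values <= 0. */
int parse_positive_int(const char *arg, int *out)
{
    char *end = NULL;
    errno = 0;
    long value = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' ||
        value <= 0 || value > INT_MAX) {
        return -1;
    }
    *out = (int)value;
    return 0;
}
```

In main, this would typically be applied to argv[1] for N and argv[2] for the iteration count, after checking that argc is 5.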

1.3 Distributed implementation

This time, you are asked to implement a distributed main C file which performs the same functionality as the serial implementation, but distributes the radiator temperatures that are read from the input file between MPI ranks.

Once compiled, the distributed program should be called like so:

$ mpirun -np <num ranks> ./<program> <N> <max iter> <input data file> <output data file>

Where <num ranks> is the number of MPI processes. The other arguments given here are the same as those explained for the serial version.
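Distributing the temperatures between ranks requires deciding how many values each rank receives, including when the number of values is not a multiple of the number of ranks. The sketch below shows one common block partition; the function name block_partition is hypothetical, and it deliberately contains no MPI calls so that it can be tested on its own.

```c
/* Split numTemps items over numRanks ranks as evenly as possible.
 * counts[r] is the number of items for rank r and displs[r] is the
 * index of its first item. The first (numTemps % numRanks) ranks
 * receive one extra item each. */
void block_partition(int numTemps, int numRanks, int *counts, int *displs)
{
    int base = numTemps / numRanks;
    int extra = numTemps % numRanks;
    int offset = 0;
    for (int r = 0; r < numRanks; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

Arrays of this shape are what MPI_Scatterv and MPI_Gatherv take as their counts and displacements arguments, so the same partition could be used both to send the radiator temperatures out and to collect the results.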

1.4 Data file format

The first line of the input file has an integer. This integer defines the number of temperatures in the file. The second line of the data file is a space-separated list of all the radiator temperature values. The output file will follow the same format.

The input file name follows: input_K.dat where K is the number of values. The output file name follows: output_K_N_maxIter where K is the number of values, and N and maxIter are the arguments explained above.

You are provided with code for reading from the input and kernel data files and for writing to the output data file. This code is located in file-reader.c. You should use this file when compiling.
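The provided file-reader.c should be used for the actual file handling. Purely to illustrate the format described above, the following sketch parses the text content of such a file from a string. The name parse_temps_sketch is hypothetical and is not part of file-reader.c.

```c
#include <stdlib.h>

/* Parse "count\nv1 v2 ... vcount" from a text buffer.
 * Returns a malloc'd array of the values (the caller frees it) and
 * stores the count in *count. Returns NULL if the count is missing or
 * not positive, or if fewer values follow than the count promises. */
double *parse_temps_sketch(const char *text, int *count)
{
    char *end = NULL;
    long n = strtol(text, &end, 10);
    if (end == text || n <= 0)
        return NULL;

    double *temps = malloc((size_t)n * sizeof *temps);
    if (temps == NULL)
        return NULL;

    const char *p = end;
    for (long i = 0; i < n; i++) {
        temps[i] = strtod(p, &end);   /* skips spaces and newlines */
        if (end == p) {               /* no number could be read */
            free(temps);
            return NULL;
        }
        p = end;
    }
    *count = (int)n;
    return temps;
}
```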


2 Instructions
• Implement a multi-threaded Laplace solver using OpenMP. Save it in a file called heat.c.

• Modify the main-serial.c file so that it reads from the input data file, calls your OpenMP stencil function, and writes to the output data file. Use the output to make sure your implementation is correct. Ensure your code is saved as main-serial.c.

• Modify the main-mpi.c file so that it performs the same functionality as main-serial.c but distributes the radiator temperatures over multiple MPI processes. Ensure it is saved as main-mpi.c.

• Write a Makefile that includes instructions to compile your programs. Your Makefile should work like so:

– make gccserial - compiles main-serial.c, heat.c and file-reader.c into heat-omp-gcc with the GNU compiler (gcc)
– make gcccomplete - compiles main-mpi.c, heat.c and file-reader.c into heat-complete-gcc with the GNU MPI compiler (mpicc)
– make iccserial - compiles main-serial.c, heat.c and file-reader.c into heat-omp-icc with the Intel compiler (icc)
– make icccomplete - compiles main-mpi.c, heat.c and file-reader.c into heat-complete-icc with the Intel MPI compiler (mpiicc)

• Try running your program for 1, 2, 4, 8, 16 and 32 OpenMP threads, measuring the time taken in each instance. Use this to plot a speedup plot with speedup on the y-axis and the number of threads on the x-axis.

• Test the fastest running instance (up to 8 threads) over 1, 2, 4, 8, 16 and 32 ranks, i.e. if you found that 4 OpenMP threads was the fastest, test this with 1, 2, 4, 8, 16 and 32 ranks. Use this to draw a strong-scaling plot with time on the y-axis and the number of ranks on the x-axis.

– See Lecture08 on how to submit a job across multiple nodes.
– The maximum number of nodes you will need for 8 OpenMP threads and 32 MPI ranks is 8 nodes.
– You may have to wait hours, or potentially a few days, if you submit to multiple nodes. If there is little time until the deadline, test with 1 OpenMP thread up to 32 ranks on the course node.

• Using up to one page for each code you produced (not including images), write a report that describes:

– your implementation and parallel strategy for heat.c, and its speedup plot
– your implementation for main-serial.c
– your implementation and parallel strategy for main-mpi.c, and its strong-scaling plot
– for each plot, how you measured and calculated it, including a table with your times and why your program did or did not achieve a linear speedup/reduction in time

• Include a screenshot of compiling and running your program, making sure your username is visible.

• Your final submission should include:
1. heat.c - the parallel implementation using OpenMP.
2. main-serial.c - a main function that calls the function defined in heat.c to perform the operations described above.
3. main-mpi.c - the complete implementation using OpenMP and MPI.
4. Makefile - a Makefile that can compile 4 different programs. The instructions for this are given above.
5. Report.pdf - a pdf file containing the plots, descriptions, and screenshots.
6. The slurm script you used to run your code on Barkla.

• This assignment should be uploaded on Codegrade, following the instructions present there.

• Failure to follow any of the above instructions is likely to lead to a reduction in scores.
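The speedup plot and the discussion of linear scaling requested above rest on two standard quantities: speedup, the single-thread time divided by the time on p threads, and efficiency, the speedup divided by p. A minimal sketch, with hypothetical function names:

```c
/* Speedup relative to the run on one thread (or one rank):
 * t1 is the time with one worker, tp the time with p workers. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Parallel efficiency. A value of 1.0 means perfectly linear speedup,
 * lower values mean some of the added workers' time is lost. */
double efficiency(double t1, double tp, int p)
{
    return t1 / (tp * (double)p);
}
```

As a made-up example, a run that takes 80 s on one thread and 20 s on eight threads has a speedup of 4 and an efficiency of 0.5, which is well short of linear.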

3 Hints

If you get any segmentation faults when running your program, use a tool called gdb to help debug. Read its manual to understand how to use it.

Make sure to test your code with small as well as big matrices.

Ensure that you are not printing the room temperatures when doing the large test. This will greatly affect your runtime, especially for the large files.

The memory movement of copying curr_t into prev_t at the end of every iteration can have big consequences on the time it takes for the program to run. It would be more efficient if you had a 3-D array t[2][N][N]. You could switch between t[0] and t[1] depending on which holds the current or previous iteration.

If your sequential run for N = 256, maxIter = 4096 with input_1024.dat is taking longer than 10 minutes, reconsider your strategy.
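The double-buffering hint can be sketched with two buffers and a pointer swap, which is equivalent to alternating between t[0] and t[1]. The function name iterate_swap and the flat row-major layout are assumptions made for this example.

```c
/* Run maxIter sweeps using two buffers and a pointer swap instead of
 * copying curr into prev after every iteration.
 * Both buffers must start with identical contents (walls, radiator and
 * interior). Returns the buffer holding the result of the last sweep. */
double *iterate_swap(int N, int maxIter, double *a, double *b)
{
    double *prev = a;
    double *curr = b;
    for (int iter = 0; iter < maxIter; iter++) {
        for (int i = 1; i < N - 1; i++) {
            for (int j = 1; j < N - 1; j++) {
                curr[i * N + j] = 0.25 * (prev[i * N + (j + 1)] +
                                          prev[i * N + (j - 1)] +
                                          prev[(i + 1) * N + j] +
                                          prev[(i - 1) * N + j]);
            }
        }
        /* Swap roles: the buffer just written becomes the input. */
        double *tmp = prev;
        prev = curr;
        curr = tmp;
    }
    return prev; /* after the final swap, prev is the newest buffer */
}
```

The swap costs three pointer assignments per iteration regardless of N, whereas a copy touches all N x N elements.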

3.1 Makefile

You are instructed to use a Makefile to compile the code in any way you like. An example of how to use a Makefile can be seen here:

{make command}: {target files}
	{compile command}

gccserial: heat.c main-serial.c file-reader.c
	gcc -fopenmp heat.c main-serial.c file-reader.c -o heat-omp-gcc -lm

Now, on the command line, if you type make gccserial, the compile command is automatically executed. It is worth noting that the compile command must be indented with a tab character. The target files are the files that must be present for the make command to execute.

This command may or may not work for you. The point is to allow you to compile however you like. If you want to declare the iterator inside a for loop, you would have to add the compiler flag -std=c99. -fopenmp is for the GNU compiler and -qopenmp is for the Intel compiler. If you find that the Makefile is not working, please get in contact as soon as possible.
