Homepage
Programming
Concurrent Programming Assignment: Connected components problem on undirected graphs

Concurrent Programming Assignment: Connected components problem on undirected graphs

Engage in a Conversation

In this assignment you will create two concurrent implementations of connected components problem on undirected graphs. The goal of this assignment is to utilise multiple processor cores and accelerate the computation. You will also provide explanations of why these algorithms are correct using the theory of concurrent programming. CourseNana.COM

Question 1: Analysis (using PageRank) [10 marks] CourseNana.COM

The edgemap method is the main place in the code where time is spent. According to Amdahl's Law, this is where you should focus your attention to obtain a speedup. We can distribute the work performed by the edgemap method over multiple threads using "DOALL" parallelisation: as the method applies a "relax" method to all edges, each thread calls the "relax" method for a subset of the edges. To this end, the set of edges is partitioned with one partition assigned to each thread, which will call the "relax" method on all edges in the partition assigned to it. CourseNana.COM

Briefly answer the following questions using a sequential implementation of the program: CourseNana.COM

A. Measure the duration of the edgemap method (which will become the parallel part) and the remainder of the program (which will remain sequential) using the CSC format. Perform the measurements on the PageRank problem using the orkut graph. Using Amdahl's Law, calculate what speedup may be achieved on 4 threads? What speedup may be achieved on 1,000 (one thousand) threads? CourseNana.COM

B. Assume that the iterations of the outermost loop in the edgemap method will be distributed over multiple threads. Describe the race conditions that will occur when using the CSR, CSC and COO representations, if any. Hint: Processing an edge implies reading and writing values associated to the endpoints of the edge (source and destination vertices). Check the relax method in PageRankjava to identify what read and write operations are executed for source vertices and for destination vertices. Will these reads and writes result in race conditions for each of CSR, CSC, and COO? CourseNana.COM

C. Based on your analysis to answer the above questions, and what you have learned in the first assignment, what graph format would you use in a concurrent program and why? CourseNana.COM

CourseNana.COM

Question 2: Parallel Edgemap (PageRank and Connected Components) [10 marks] CourseNana.COM

Implement a concurrent version of the power iteration step using the CSC sparse matrix format. Modify the program to: CourseNana.COM

1. Create a configurable number of threads. The number of threads that should be created is passed in as an argument to the program's main method. CourseNana.COM

2. Create a version of the edgemap method for the CSC format with signature public void ranged_edgemap( Relax relax, int from, int to ), where the outer loop will iterate only over the range from "from" to "to" ("to" not inclusive). CourseNana.COM

3. Distribute the iterations of the outer loop of the edgemap method over the threads by determining the start and end vertex IDs for each thread. Keep in mind that the number of vertices may not be a multiple of the number of threads. For instance, if there are 17 vertices and 3 threads, then a first thread should iterate over the vertices 0,...,5, the second thread should iterate over 6,...,11 and the final thread should iterate over 12,...,16. CourseNana.COM

4. Run all threads concurrently and make them call the newly defined edgemap method on their designated vertices. CourseNana.COM

5. Wait for all threads to complete by joining them. CourseNana.COM

To help you with these changes, an abstract class ParallelContext has been defined in the provided source code¹ with the purposes of capturing the creation of threads and edge subsets in a way that is independent of the algorithm (PageRank/ConneetedComponents) and graph data structure. The ParallelContext is set by the Driver at program startup and used by the PageRank and ConnectedComponents classes to iterate the edges of the graph. A ParallelContextSingleThread is provided that merely calls edgemap on the appropriate graph, as per assignment 1. A place holder is provided for ParallelContextSimple where you can extend the edgemap method to take the additional steps outlined above. All changes can be implemented within the ParallelContextSimple class, except for the selection of the context in the driver. However, you should not feel obliged to use this class and may implement concurrency anyway you see most appropriate, in any class. CourseNana.COM

Validate the correctness of your code by cross-checking the residual error for each iteration with the sequential code, e.g., using the CSR implementation provided in the feedback of assignment 1. A validation program will also be provided. CourseNana.COM

Domjudge: Problem "Q2", using data structure "'CHOOSE" (you may map this to any of CSR, CSC or COO, as you see fit) and algorithms "PR" and "CC" CourseNana.COM

Writeup: Report the execution time for your concurrent solution on the PageRank problem when orocessina the orkut undir araph and when usina a CourseNana.COM

CourseNana.COM

varying number of threads. Select 1, 2, 3, ... T+2 where T is the number of physical processing cores (including hyperthreads) of the computer that you are using for these measurements. Plot these measurements in a chart. Report the number of processor cores and hyperthreading arrangement on the computer that you are using. Discuss the achieved speedup you have obtained for PageRank and compare to the speedup predicted by Amdahl's Law. CourseNana.COM

Question 3: Disjoint Set Algorithm [15 marks] CourseNana.COM

Implement Jayanti and Tarjan's randomised concurrent disjoint set algorithm. You can build on top of the Relax interface and the associated methods you previously developed in the second assignment for visiting all edges of the graph using multiple threads. Your Relax class will have as data member a reference to the relevant data structure to store the parent of each vertex. Initially, a new set is created for each vertex (method makeSet). The relax method will record that the source and destination of this edge belong to the same set by calling the method union. CourseNana.COM

The code in the repository contains a partially defined class "DisjointSetCC" where you can add your code. CourseNana.COM

Domjudge: Problem "Q3", using data structure "ICHOOSE" and algorithm "DS". CourseNana.COM

Writeup: Paste the code of your DSSCRelax class in the writeup document using a pretty printer. Explain any design decisions of interest, at least your choice between no path compression, path splitting and path halving, and the motivation for this choice. Report the speedup of your parallel code (parallel iteration of edgemap has been achieved in Question 2 and should require no additional code) as a function of the number of threads using a chart. CourseNana.COM

Question 4: Pipelined Operation (Disjoint Set) [10 marks] CourseNana.COM

Implement pipeline operation specifically for the DisjointSetCC algorithm using the producer-consumer template where the producer thread reads edges from the file and the consumer threads call edgemap on those edges. The graph data structure need never be stored in memory as the DisjointSetCC algorithm makes only one pass using edgemap. Create a file "SparseMatrixPipelined" where the producer/consumer threads are set up in the edgemap method and select it in the main driver method. To simplify your code, use the sequential ParallelContext and implement all threading code in the edgemap function. You may use code from the practical material to implement the producer/consumer mechanism. CourseNana.COM

Implement your code such that a block of edges is passed as one unit between producer and consumer (minimum of one edge, but typically better performance if multiple edges are passed in a block). Typically a fixed-size block should be chosen (e.g., 128 edges) but the final block may have a different size. Note also that the buffer size is a configurable parameter that impacts performance (see practical material "6 Monitor Performance"). CourseNana.COM

Domjudge: Problem "Q4", using data structure "ICHOOSE" and algorithm "DS". CourseNana.COM

CourseNana.COM

Writeup: Explain which file format you read in (CSC, CSR or COO) and why. Record the speedup of your solution using a chart giving attention to the number of threads used, the block size and the buffer size, and discuss your findings. CourseNana.COM

Question 5: Linearisability and AIM [15 marks] CourseNana.COM

The following diagrams show the execution timelines of threads and the methods executed against a concurrent data structure. "s" is a disjoint set data structure where each element is initially placed in a singleton set. CourseNana.COM

1. Is the following execution linearizable? Is it sequentially consistent? If yes, show a consistent execution order; if not, argue why such an order does not exist. CourseNana.COM

s.union(x,y) s.sameSet(u,x): false CourseNana.COM

s.sameSet(x,y): true s.union(u,v) s.union(v,y) CourseNana.COM

2. In the following execution, the response for B : s.sameSet(x,z) is omitted. Which outcomes out of "false" and "true" would be possible for a linearizable execution? Justify your answers. CourseNana.COM

s.union(x,y) s.union(x,z) CourseNana.COM

A- - - - - - CourseNana.COM

s.union(y,z) s.sameSet(x,z): ? CourseNana.COM

B- _ - -> CourseNana.COM

3. Consider the Jayanti and Tarjan's concurrent randomised disjoint set data structure (slides 12_concurrentdisjointset.pptx, slide 6). Assuming CourseNana.COM

a concurrent execution of "union(x,y)" with index(x) < index(y) and "find(x)", explain how the Java Memory Model ensures that the value of parent(y) is visible to the find method. Identify relevant load and store operations, synchronisation operations and happens-before relationships between them. CourseNana.COM

Question 6: Competition — Connected Components [40 marks] CourseNana.COM

The goal of this question is to further increase the performance impact of the connected components problem. You will need to select first what algorithm you further want to fine-tune (ConnectedComponents vs DisjointSetCC). Both are viable approaches. We will list a number of suggestions on how you can improve their performance, but really any valid change to the program is acceptable as long as the correctness of the program is maintained. Some performance optimisations that are not too hard to pursue are listed below. This list is not exhaustive, you are free to let your imagination run wild! And don't forget about Amdahl's Law, nor the importance of keeping the disjoint set trees shallow. CourseNana.COM

Get in Touch with Our Experts

WeChat (微信)

Last: CS320 Concepts of Programming Languages Fall2023 - Project Part 1

Next: COMP3611 Machine Learning - Assessment: Gaussian Distribution, PCA and Predict Cancer Mortality Rates in US counties

concurrent programming代写,Connected components代写,Graph代写,PageRank代写,concurrent programming代编,Connected components代编,Graph代编,PageRank代编,concurrent programming代考,Connected components代考,Graph代考,PageRank代考,concurrent programminghelp,Connected componentshelp,Graphhelp,PageRankhelp,concurrent programming作业代写,Connected components作业代写,Graph作业代写,PageRank作业代写,concurrent programming编程代写,Connected components编程代写,Graph编程代写,PageRank编程代写,concurrent programmingprogramming help,Connected componentsprogramming help,Graphprogramming help,PageRankprogramming help,concurrent programmingassignment help,Connected componentsassignment help,Graphassignment help,PageRankassignment help,concurrent programmingsolution,Connected componentssolution,Graphsolution,PageRanksolution,