1. Homepage
  2. Programming
  3. CMPUT 201 Practical Programming Methodology - Assignment 2: Rabin-Karp algorithm

CMPUT 201 Practical Programming Methodology - Assignment 2: Rabin-Karp algorithm

Engage in a Conversation
CanadaUniversity of AlbertaCMPUT 201Practical Programming MethodologyRabin-Karp algorithmword-search puzzleC

Problem Description CourseNana.COM

Bob and Alice are competitive word-search puzzle solvers. However, even to the best of Bob's abilities, he was never able to find words in the puzzles faster than Alice. As Bob's friend and amazing programmer, you have decided to help him out by creating a solver for word-search puzzles. CourseNana.COM

A word-search puzzle is solved when, given a grid of random letters and a set of words, the program finds the location of the input words in the grid (if they exist). The orientation of the words can be horizontal, vertical, left diagonal, or right diagonal. The words can be read left to right, right to left, top to bottom (whether vertically or diagonally), and bottom to up (whether vertically or diagonally). Note that a diagonal can start at any index, not necessarily on a central diagonal. CourseNana.COM

You have decided to approach the problem using the Rabin-Karp algorithm. The Rabin-Karp algorithm uses what is called a hash function to find a substring of size x in a string of size y >= x. CourseNana.COM

Algorithm Description CourseNana.COM

You may wonder, why use hash functions at all? Why don't we just use brute-force search? We certainly could do that, but it would be highly inefficient. If you don't know anything algorithm complexity, and symbols like O(n), O(logn) or O(1) sound foreign to you, review this article: Complexity of Algorithms. So, as you can imagine, using a special algorithm for string matching makes the process much, much faster. CourseNana.COM

bash tests/github_actions/github_actions.sh CourseNana.COM

Let's assume that we're searching for a word "dog" in a string "gadogado". We need to compare "dog" with any substring of "gadogado" that has the same length ("gad", "ado", "dog", "oga", "gad", "ado"). So, in a sense, you are "rolling" the word "dog" over the string "gadogado". To compare the word with the substrings, instead of comparing them character by character (which would be very time consuming!), wouldn't it be nice if we had some fixed value that represents each substring? Then, we can just quickly compare these values! Here comes the idea of a hash function. CourseNana.COM

What is a hash function? It's a function that can be used to map data of arbitrary size to fixed-size values. For example, a hash function can map strings to integers. Probably, one of the simplest hash functions would be the sum of ASCII codes of the characters of the string: CourseNana.COM

s = "gadogado" CourseNana.COM

H(s[0..3]) = H("gad") = 103 + 97 + 100 = 300 H(s[1..4]) = H("ado") = 97 + 100 + 111 = 311 H(s[2..5]) = H("dog") = 100 + 111 + 103 = 314 CourseNana.COM

However, it's not the best hash function because it allows many collisions when different strings have equal hash values. For example: CourseNana.COM

So we need to have a better hash function, that can take into account the order and position of letters. Here is how it works in the Rabin-Karp algorithm. CourseNana.COM

Study this resource to understand the algorithm.
Let's see an example of how to find the word dog in the string gadogado using the Rabin-Karp algorithm.
CourseNana.COM

  1. We need to know the size of our alphabet -- D. We use ASCII codes, so D = 256.
  2. Also, we need to choose a large enough prime number. The higher it is, the less collisions we will see. As a rule of thumb, it should be larger than the alphabet size. For this example, we'll use P = 577 (in your assignment, you'll use a different value, see below).
  3. Let's calculate the value of h. It's a rehashing coefficient, and we'll need it later, on Step 7. h = D^(len- 1) % P (len is the length of the word we're searching ("dog"), so len=3) In this example, h = 335.

4. Now, let's calculate the hash of the word "dog", letter by letter, using this step-by-step formula: CourseNana.COM

H("dog") = 100 + 111 + 103 = 314 H("axa") = 97 + 120 + 97 = 314 H("esb") = 101 + 115 + 98 = 314 CourseNana.COM

hash = (D * <previous_hash> + <current letter>) % P; CourseNana.COM

  1. Let's now loop through the string "gadogado", calculating "rolling" hashes of all three-letter substrings of it: "gad", "ado", ... .
  2. First, let's calculate the hash of the substring "gad". Applying the same formula as above, we get the result: hash of "gad" is 6.
  3. Now, we start "rolling" through the string "gadogado", or "rehashing" -- basically, moving to the right. Moving from "gad" to "ado" means that we need to "subtract" the letter "g" and "add" the letter "o". In the modular arithmetic of the Rabin-Karp algorithm, it's done the following way:

Thus, hash of "ado" is 506 (if the result if negative, just subtract it from P). Rolling one more step to the right, we can get from "ado" to "dog". Applying the same formula, we get the hash of "dog": 280! CourseNana.COM

  1. We see that we have found the hash we were looking for. So, with some probability, we have found the word we're looking for. But beware of collisions! Even with large enough P, collisions can happen. So, the final step.
  2. To be certain that we found the right substring, we need to compare the word with the found substring character by character.

IMPORTANT: For the purposes of this assignment, you must fix P (the prime number) to 7079, and D (alphabet size) to 256. CourseNana.COM

Task CourseNana.COM

Write a program that uses the Rabin-Karp algorithm to find the location of individual words in a given word- search grid and outputs their starting position and direction, if they exist. CourseNana.COM

Enumerate the directions as follows: CourseNana.COM

1. horizontal right (→) 2. horizontal left (←) 3. vertical down (↓)
4. vertical up (↑)
CourseNana.COM

hash = (D * (<previous_hash> - <letter to subtract> * h) + <letter to add>) % 577 CourseNana.COM

5. top left to bottom right diagonal ()
6. bottom right backwards to top left diagonal (
) 7. bottom left to top right (↗)
8. top right backwards to bottom left (↙)
CourseNana.COM

How to run your program Your program runs as follows: CourseNana.COM

where: CourseNana.COM

puzzle_file is a required argument. It is preceded by the -p flag. It is a text input file that contains the puzzle grid. The provided puzzle file will have n lines of single strings of length n, providing an n by n square grid of letters. Each line in the file represents a row, and each line will be of same size n. The word_length is a required argument. It is preceded by the -l flag. It is a number in the range [1, n] that represents the fixed length of the words the program will search for.
wordlist_file is a required argument. It is preceded by the -w flag. It is a text input file that contains the list of words to look for. Each word will be provided on a separate line and will have exactly word_length characters; otherwise, the wordlist file is invalid. The wordlist file can contain any ASCII character with values [32, 126], not necessarily English letters only.
If the input word is not found in the puzzle, the output would be
word;(0,0);0. Note that the top left corner of the puzzle corresponds to coordinate (0,0), but since the direction is 0, this indicates that the word is not found. Note that you must output the solution of each input word in the same order they were provided in the wordlist_file. CourseNana.COM

You must account for any edge cases for command-line arguments. We may run with too many arguments, not enough arguments, invalid arguments, etc. Also, note that, similar to assignment 1, the order of the arguments is not fixed. For example, ./wordSearch2D -w wordList.txt -p puzzleFile.txt -l 20 is valid. CourseNana.COM

Error Checking CourseNana.COM

./wordSearch2D -p <puzzle_file> -l <word_length> -w <wordlist_file> [-o <solution_file>] CourseNana.COM

Your program must report and print an error to stderr and quit in the following cases. Note the return code that needs to be returned in each case. The return code is the integer that your program returns upon exit (e.g., return 0; in main or exit(2) from anywhere in the program). The error message should be the exact error message provided: CourseNana.COM

If the input puzzle_file is not found (angle brackets are placeholders here, replace with appropriate file name) CourseNana.COM

Assumptions CourseNana.COM

Only characters with ASCII values between 32 and 126 will appear in the puzzle and word files
a is not the same as A. This is already the case, given their ASCII codes
The input word file can have any number of words, but each line will contain a single word. Each word will have a maximum size of 100 characters
A word exists at most once in the puzzle
CourseNana.COM

Error: Puzzle file <name-of-input-file> does not exist CourseNana.COM

Error: Wordlist file <name-of-input-file> does not exist CourseNana.COM

Usage: ./wordSearch2D -p <puzzle_file> -l <word_length> -w <wordlist_file> [-o <solution_file>] CourseNana.COM

Example with Expected Output Here is a full example.
Given the following 5*5 puzzle file,
puzzle.txt: CourseNana.COM

batlw CourseNana.COM

hello CourseNana.COM

gblfk CourseNana.COM

cikkl CourseNana.COM

jdgod CourseNana.COM

then running the program with CourseNana.COM

would write the following to a file called mysolution.com.png: CourseNana.COM

./wordSearch2D -l 3 -w wordlist.txt -p puzzle.txt -o mysolution.com.png CourseNana.COM

wok;(0,4);3 CourseNana.COM

dog;(4,4);2 CourseNana.COM

bat;(0,0);1 CourseNana.COM

elk;(1,1);5 CourseNana.COM

xxx;(0,0);0 CourseNana.COM

Note that the directions correspond to the enumerated list provided above. CourseNana.COM

Program Organization
Your program must have the following files CourseNana.COM

Only your main function and any helper methods for reading arguments should be in a file called CourseNana.COM

src/wordSearch2D.c
All functionality related to solving the word puzzle should be in file src/puzzle2D.c and its corresponding header file src/puzzle2D.h. You must divide the logic of solving the word puzzle into multiple functions.
Feel free to add additional files and functions to further divide the functionality in your program.
CourseNana.COM

You must not use any global variables! You can define macros for predefined constants such as the maximum puzzle size or the prime number, but you cannot use any global variables. CourseNana.COM

All .c and .h files must be in ./src. Your Makefile must be at the project root and the binary it produces (wordSearch2D) is also at the root. CourseNana.COM

Refer to the repository-layout to know how your repository should look (though you can always add MORE files). Use the following command on our lab machines to see what your repository structure: CourseNana.COM

Get in Touch with Our Experts

WeChat WeChat
Whatsapp WhatsApp
Canada代写,University of Alberta代写,CMPUT 201代写,Practical Programming Methodology代写,Rabin-Karp algorithm代写,word-search puzzle代写,C代写,Canada代编,University of Alberta代编,CMPUT 201代编,Practical Programming Methodology代编,Rabin-Karp algorithm代编,word-search puzzle代编,C代编,Canada代考,University of Alberta代考,CMPUT 201代考,Practical Programming Methodology代考,Rabin-Karp algorithm代考,word-search puzzle代考,C代考,Canadahelp,University of Albertahelp,CMPUT 201help,Practical Programming Methodologyhelp,Rabin-Karp algorithmhelp,word-search puzzlehelp,Chelp,Canada作业代写,University of Alberta作业代写,CMPUT 201作业代写,Practical Programming Methodology作业代写,Rabin-Karp algorithm作业代写,word-search puzzle作业代写,C作业代写,Canada编程代写,University of Alberta编程代写,CMPUT 201编程代写,Practical Programming Methodology编程代写,Rabin-Karp algorithm编程代写,word-search puzzle编程代写,C编程代写,Canadaprogramming help,University of Albertaprogramming help,CMPUT 201programming help,Practical Programming Methodologyprogramming help,Rabin-Karp algorithmprogramming help,word-search puzzleprogramming help,Cprogramming help,Canadaassignment help,University of Albertaassignment help,CMPUT 201assignment help,Practical Programming Methodologyassignment help,Rabin-Karp algorithmassignment help,word-search puzzleassignment help,Cassignment help,Canadasolution,University of Albertasolution,CMPUT 201solution,Practical Programming Methodologysolution,Rabin-Karp algorithmsolution,word-search puzzlesolution,Csolution,