DTS205TC High Performance Computing
School of AI and Advanced Computing
Deadline: April 10th, 2023 @ 23:59 (UTC+8 Beijing) Percentage in final mark: 10% Maximum score: 20 marks Learning outcomes assessed: A, C, D
Late policy: 10% of the total marks available for the assessment shall be deducted from the assessment mark for each working day after the submission date, up to a maximum of five working days.
Overview
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. 1) Implement at least 2 different types of hash functions, such as murmur, fnv, SHA1, MD5, and so on. and show 2 examples of each hash function. You can use existing libraries and packages, such as python hashlib. (10 marks) 2) Implement a Bloom Filter with the hash functions you implemented in task 1). A txt file will be provided to create the bloom filter and a test list will be provided to test whether the element in the test list is in the bloom filter or not. (5 marks) 3) Try different size of your bloom filter and use different number of hash functions to build your bloom filter. Compare the differences and analyze the speed and complexity of your bloom filter. (5 marks)