CS 561 Data Systems Architectures - Project 0: Implementation of a Zone Map

CAS CS 561: Data Systems Architectures

CS561 Spring 2023 – Project 0
Title: Implementation of a Zone Map

Background: A zone map is a coarse index that maintains the minimum and maximum values of one or more specified columns over contiguous sets of data blocks or rows, called zones, of a table [1]. A zone map helps prune data for both single-key (point) queries and key-range queries: the queried key or key range is first checked against the min/max values of every zone, and only zones whose range could contain a match are searched. By avoiding unnecessary I/Os to storage, zone maps improve query performance.
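
To make the pruning idea concrete, here is a minimal sketch of a zone map over an in-memory vector of integers. It is not the API from the course template: the names `Zone`, `SimpleZoneMap`, `contains`, `range`, and `zone_count` are hypothetical, and integer keys held in memory are an assumption made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-zone metadata; not the structure used by the course template.
struct Zone {
    int min;            // smallest key in the zone
    int max;            // largest key in the zone
    std::size_t begin;  // index of the first element of the zone
    std::size_t end;    // one past the last element of the zone
};

class SimpleZoneMap {
public:
    // Splits `data` into contiguous zones of at most `zone_size` elements
    // and records the min/max key of each zone.
    SimpleZoneMap(const std::vector<int>& data, std::size_t zone_size)
        : data_(data) {
        assert(zone_size > 0);
        for (std::size_t b = 0; b < data_.size(); b += zone_size) {
            const std::size_t e = std::min(b + zone_size, data_.size());
            const auto mm = std::minmax_element(data_.begin() + b, data_.begin() + e);
            zones_.push_back({*mm.first, *mm.second, b, e});
        }
    }

    // Point query: scan only the zones whose [min, max] could hold `key`.
    bool contains(int key) const {
        for (const Zone& z : zones_) {
            if (key < z.min || key > z.max) continue;  // zone pruned
            const auto first = data_.begin() + z.begin;
            const auto last = data_.begin() + z.end;
            if (std::find(first, last, key) != last) return true;
        }
        return false;
    }

    // Range query: collect keys in [low, high], skipping non-overlapping zones.
    std::vector<int> range(int low, int high) const {
        std::vector<int> out;
        for (const Zone& z : zones_) {
            if (high < z.min || low > z.max) continue;  // zone pruned
            for (std::size_t i = z.begin; i < z.end; ++i)
                if (data_[i] >= low && data_[i] <= high) out.push_back(data_[i]);
        }
        return out;
    }

    std::size_t zone_count() const { return zones_.size(); }

private:
    std::vector<int> data_;    // copy of the raw data
    std::vector<Zone> zones_;  // one min/max entry per zone
};
```

On sorted data the zone ranges do not overlap, so a point query touches at most one zone; on unsorted data many zones may pass the min/max check, which is one reason the sorted and unsorted workloads are worth comparing.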

Objective: The objective of the project is to implement a simple zone map and evaluate its performance on both point and range queries. For range queries, you will need to implement a query generator that issues range queries with a specified selectivity. The workflow is as follows.

(a) Implement a zone map by cloning the API available at: https://github.com/BUDiSC/cs561_templatezonemaps2. This API contains a header file with basic functionality definitions for a zone map. You are free to modify certain components to improve performance.

(b) Implement the range query generator, which takes a selectivity s (0 ≤ s ≤ 1) as input, in the existing workload generator (workload_generator.cpp). In your implementation, you can fix the number of elements to be selected as ⌊s ∗ N⌋, where N is the number of elements. The existing workload generator generates three files: a raw data file (to be used as input), a point query file, and an empty file. The empty file is meant to store the generated range queries once you implement the range query generator.
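
One possible way to produce ranges with a given selectivity is sketched below. It is an illustration, not the code in workload_generator.cpp: the function name `generate_range_queries`, its parameters, and the use of integer keys are all assumptions. The idea is to sort a copy of the keys, pick a random starting position, and use the keys at the two ends of a window of ⌊s ∗ N⌋ consecutive elements as the inclusive bounds. The count is exact only when keys are distinct; with duplicates a range may select more elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the provided workload generator.
// Returns `num_queries` inclusive ranges [low, high], each spanning
// floor(s * N) consecutive keys of the sorted data.
std::vector<std::pair<int, int>> generate_range_queries(
        std::vector<int> keys, double s, std::size_t num_queries, unsigned seed) {
    assert(s >= 0.0 && s <= 1.0);
    std::sort(keys.begin(), keys.end());

    const std::size_t n = keys.size();
    const std::size_t k =
        static_cast<std::size_t>(std::floor(s * static_cast<double>(n)));

    std::vector<std::pair<int, int>> queries;
    if (k == 0) return queries;  // selectivity too small to select any key

    std::mt19937 rng(seed);
    // Valid window starts are 0 .. n - k, so the window never runs off the end.
    std::uniform_int_distribution<std::size_t> start(0, n - k);
    for (std::size_t q = 0; q < num_queries; ++q) {
        const std::size_t i = start(rng);
        queries.emplace_back(keys[i], keys[i + k - 1]);
    }
    return queries;
}
```

Each returned pair could then be written as one line of the range query file; the exact file layout is left to the implementation.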

(c) In main.cpp, write the code for a parser that parses the range query file. Report the average execution time for the following workloads.
W1: 5M inserts with sorted data*, 10K point queries
W2: 5M inserts with sorted data, 1K range queries, selectivity: 0.001
W3: 5M inserts with sorted data, 1K range queries, selectivity: 0.1
W4: W1 with unsorted data
W5: W2 with unsorted data
W6: W3 with unsorted data

*To generate sorted data, just use the --sort flag while generating the data.
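
The parsing and timing in step (c) might look like the following sketch. The file format is an assumption (one query per line, written as two whitespace-separated integers "low high"), and the names `parse_range_queries` and `average_micros` are hypothetical. The parser reads from any `std::istream`, so it works with a `std::ifstream` opened on the range query file.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Assumed format (hypothetical): one query per line, "low high".
std::vector<std::pair<int, int>> parse_range_queries(std::istream& in) {
    std::vector<std::pair<int, int>> queries;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int low = 0;
        int high = 0;
        if (!(fields >> low >> high)) continue;  // skip blank or malformed lines
        assert(low <= high);  // the generator is expected to emit ordered bounds
        queries.emplace_back(low, high);
    }
    return queries;
}

// Calls run_query(low, high) once per query and returns the mean time
// per query in microseconds, measured with a monotonic clock.
template <typename Fn>
double average_micros(const std::vector<std::pair<int, int>>& queries, Fn run_query) {
    if (queries.empty()) return 0.0;
    const auto t0 = std::chrono::steady_clock::now();
    for (const auto& q : queries) run_query(q.first, q.second);
    const auto t1 = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = t1 - t0;
    return elapsed.count() / static_cast<double>(queries.size());
}
```

Timing the whole batch and dividing by the query count avoids per-query clock overhead, which can be noticeable when individual queries are very fast.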

(d) Research questions.
i) What is the expected memory footprint (in bytes) to build the zone map? (Assume the number of elements is N and every zone has d entries.)
ii) In practice, we may have raw data stored on disk, and the zone map is maintained in memory to reduce the number of required I/Os to answer queries. What should we do if we only have a limited memory budget to build the zone map (i.e., when the memory budget of M bytes is smaller than the expected memory footprint)?
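
As a starting point for reasoning about these questions, the relation between N, d, and the footprint can be written down under one simplifying assumption: each zone stores only its minimum and maximum key, and each key takes b bytes. The symbol b is introduced here for illustration, and any extra per-zone metadata (pointers, counts) kept by a real implementation would add to the total.

```latex
% Assumption: a zone holds only (min, max), each of size b bytes.
\[
  \text{number of zones} = \left\lceil \frac{N}{d} \right\rceil,
  \qquad
  \text{footprint} = 2\,b \left\lceil \frac{N}{d} \right\rceil \ \text{bytes}.
\]
% With a budget of M >= 2b bytes, at most \lfloor M/(2b) \rfloor zones fit,
% so the zone size would have to satisfy
\[
  d \;\ge\; \left\lceil \frac{N}{\lfloor M/(2b) \rfloor} \right\rceil .
\]
```

For example, with N = 5,000,000, d = 1,000, and b = 4, this gives 5,000 zones and 40,000 bytes. Larger zones shrink the map but prune less precisely, which is the trade-off question ii) asks about; other strategies are possible.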

Deliverables: Zone map implementation code that runs the test cases. The implementation must include comments that explain the various design decisions. Do NOT upload your code to public repositories such as GitHub and Bitbucket.

[1] M. Ziauddin, A. Witkowski, Y. J. Kim, D. Potapov, J. Lahorani, and M. Krishna. Dimensions based data clustering and zone maps. Proc. VLDB Endow. 10(12), pp. 1622–1633. DOI: https://doi.org/10.14778/3137765.3137769.
