
CS7280 Special Topics in Database Management - Project 3: Big Data Analytics


CS 7280 Special Topics in Database Management

Project 3: Big Data Analytics

Objectives:

  1. Understand the Hadoop ecosystem and data analytics

  2. Become familiar with MapReduce programming and Spark

  3. Gain experience with research on big data and data analytics

Spring 2024

This will be a group project (2 students per group) for one semester. The main purpose of this project is to become familiar with Big Data platforms, including the Hadoop system, MapReduce programming, and cloud-based big data solutions (e.g., Google BigQuery). You need to follow the instructions below to conduct the project.

Phase 1 (15%): Selecting Data Set - Due: March 27, 2024 (Wed)

  • Each student researches any data they are interested in and collects information about it.

  • Find any characteristics of the data you select, and describe why you are interested in it.

  • If possible, prepare 3~4 data samples, which can be either real or manipulated data.

  • Make a 2~3 page PowerPoint file as a report

  • Submit the PPT file to Canvas

    o PPT, PPTX or PDF file format ONLY

Phase 2 (15%): Defining Problems - Due: April 3, 2024 (Wed)

  • In this second phase, you will research the following topics based on the data you selected in Phase 1:

    - What you can analyze using the selected data in terms of Hadoop HDFS with Spark, and Google BigQuery using GCP.

      o 1 Spark

      o 1 Google BigQuery using GCP

    - How you can collect at least 1GB of data. That means your data MUST be uploaded to HDFS using a VM in Phase 4-5.

  • Make a 2~3 page PowerPoint file as a report

  • Submit the PPT file to Canvas

    o PPT, PPTX or PDF file format ONLY
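As a rough aid for the 1GB requirement above, the following sketch measures a candidate data set before it is uploaded anywhere. It is only an illustration: the function names `dataset_stats` and `meets_requirements` are hypothetical, "1GB" is assumed to mean 2**30 bytes, and the row count is approximated by counting newline characters (so a header line counts as a row). The default `min_rows` mirrors the 100,000-row minimum that the later data-preparation step asks for.

```python
import os

ONE_GB = 1024 ** 3  # assumption: "1GB" is read as 2**30 bytes


def dataset_stats(path):
    """Return (total_bytes, total_lines) for a file or a directory of files."""
    if os.path.isfile(path):
        files = [path]
    else:
        files = [
            os.path.join(root, name)
            for root, _dirs, names in os.walk(path)
            for name in names
        ]
    total_bytes = 0
    total_lines = 0
    for file_path in files:
        total_bytes += os.path.getsize(file_path)
        # Read fixed-size binary chunks so a large file is never loaded whole.
        with open(file_path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                total_lines += chunk.count(b"\n")
    return total_bytes, total_lines


def meets_requirements(path, min_bytes=ONE_GB, min_rows=100_000):
    """Check a data set against a size threshold and a row-count threshold."""
    total_bytes, total_lines = dataset_stats(path)
    return total_bytes >= min_bytes and total_lines >= min_rows
```

Calling `meets_requirements("data/")` on a local folder (a hypothetical path) gives a quick yes/no answer; `dataset_stats` returns the raw numbers for the report.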

Phase 3 (20%): Preparing Proposal - Due: April 3, 2024 (Wed)

  1. Prepare your data and upload it to HDFS. You can use a variety of ways to prepare your data set, including:

  2. Your data set MUST have at least 100,000 instances (or rows)

  3. Upload your data set into HDFS (VM)

  4. Implement Spark or BigQuery

- You can use PySpark or any Streaming with another programming language such as Python.

o 1 Spark, or

o 1 BigQuery
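To make the "Streaming with Python" option above more concrete, here is a minimal sketch of the map, shuffle/sort, and reduce pattern as a word count. It is a plain-Python simulation that runs in one process, not the Hadoop or Spark API; the function names (`mapper`, `reducer`, `run_local`) and the choice of word count as the task are assumptions made for illustration only.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one (word, 1) pair per whitespace-separated token."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts per key; expects pairs sorted by key."""
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _word, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce locally and return a dict."""
    # sorted() plays the role of the shuffle/sort phase between map and reduce.
    return dict(reducer(sorted(mapper(lines))))
```

With Hadoop Streaming, the mapper and reducer would normally be two separate scripts that read lines from standard input and write tab-separated key/value lines to standard output, with the framework doing the sort between them. Testing the logic locally with something like `run_local(open("sample.txt"))` (a hypothetical file) before running on the VM can save debugging time.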

  1. Submit your source code and a download link for your data set to Canvas

    - All source files should be archived with TAR (e.g., tar cvf XXX.tar) on the VM (JAR, TAR or ZIP file format ONLY)

    - For the dataset, you can upload it to Google Drive (or any web storage service) and then send a link when you submit your source

  2. Then, submit a 10-minute demo video to Canvas

    - Submit a link such as YouTube, or record your presentation using Canvas
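The TAR packaging step for the source files above can also be scripted instead of typed by hand. The sketch below uses Python's standard `tarfile` module to build an uncompressed archive, similar in effect to `tar cvf`; the function name `make_source_tar` and the example paths are hypothetical.

```python
import os
import tarfile


def make_source_tar(source_dir, tar_path):
    """Pack source_dir into an uncompressed TAR and return the archived names."""
    # Store paths relative to the folder name rather than as absolute paths.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(tar_path, "w") as archive:
        archive.add(source_dir, arcname=top_level)
    # Re-open the archive to list what was actually written.
    with tarfile.open(tar_path, "r") as archive:
        return sorted(archive.getnames())
```

For example, `make_source_tar("project3_src", "project3.tar")` would archive a folder named `project3_src`. The output file should be written outside the folder being archived so the archive does not try to include itself.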

Phase 5 (25%): Presentation of Project - Due: April 17, 2024 (Wed) before class.

  1. Write-up (at least 4 pages). You must use IEEE format.

    o DOC, DOCX or PDF file format ONLY

  2. Poster (36 x 24 inches PowerPoint file). You can use one of the templates provided on Canvas.

    o PPT, PPTX or PDF file format ONLY

  3. Submit your paper and poster to Canvas

  4. Make an 8~10 page PowerPoint file and submit it to Canvas

    o PPT, PPTX or PDF file format ONLY

  5. Then, prepare an 8-minute final presentation on April 27, 2022 (Wednesday)

Submission

You will submit your program using Canvas. If you have any trouble using Canvas, you can contact the TA or instructor.

Grading

  15  Phase 1
  15  Phase 2
  20  Phase 3
  25  Phase 4
  25  Phase 5

Bonus: +20 for a high-quality write-up that can be submitted as either a conference or journal paper.