[2020] ISYS3412 Practical Database Concepts - Assignment4 - Database Design Project

ISYS3412 Practical Database Concepts

Assessment 4: Database Design Project

Overview

This is a practical, real-world project that puts the knowledge you have gained into practice. You are required to investigate and understand a publicly available dataset, design a conceptual model for storing the dataset in a relational database, apply normalisation techniques to improve the model, build the database according to your design and import the data into your database, and develop SQL queries in response to a set of requirements.

The objective of this assignment is to reinforce what you have learned throughout the course. Specifically, it involves building a simple application that connects to a database backend running a simple relational schema.

Part A: Understanding the Data (0 Marks, Preliminary Work)

Part B: Designing the Database (10%)

Part C: Creating the Database and Importing Data (10%)

Part D: Data Retrieval and Visualisation (15%)

Assessment details

Part A: Understanding the Data

In this assignment, we are working with the publicly available dataset: A Global Database of COVID-19 Vaccinations. Further details about this dataset are given in the article at the following URL: https://www.nature.com/articles/s41562-021-01122-8. The abstract of the article is as follows.

An effective rollout of vaccinations against COVID-19 offers the most promising prospect of bringing the pandemic to an end. We present the Our World in Data COVID-19 vaccination dataset, a global public dataset that tracks the scale and rate of the vaccine rollout across the world. This dataset is updated regularly and includes data on the total number of vaccinations administered, first and second doses administered, daily vaccination rates and population-adjusted coverage for all countries for which data are available (169 countries as of 7 April 2021). It will be maintained as the global vaccination campaign continues to progress. This resource aids policymakers and researchers in understanding the rate of current and potential vaccine rollout; the interactions with non-vaccination policy responses; the potential impact of vaccinations on pandemic outcomes such as transmission, morbidity and mortality; and global inequalities in vaccine access.

A live version of the vaccination dataset and documentation are available in a public GitHub repository at https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations. These data can be downloaded in CSV and JSON formats.

For the purposes of completing this assignment, we are only using the following files. You are required to review and analyse the dataset available in these files. Reviewing the other files in the repository, even though they are not listed below, will help you form a better understanding of the big picture.

| # | FILE NAME | DESCRIPTION |
|---|-----------|-------------|
| 1 | locations.csv | Country names and the type of vaccines administered. Each line represents the last observation in a specific country. Refer to README.md for the details. |
| 2 | us_state_vaccinations.csv | History of observations for various locations in the US. |
| 3 | vaccinations-by-age-group.csv | History of observations for vaccinations of various age groups in each country. |
| 4 | vaccinations-by-manufacturer.csv | History of observations for various types of vaccines used in each country. |
| 5 | vaccinations.csv | Country-by-country data on global COVID-19 vaccinations. Each line represents an observation date. Refer to README.md for the details. |
| 6 | country_data/Australia.csv | Daily observations of vaccination in Australia. |
| 7 | country_data/United States.csv | Daily observations of vaccination in the US. |
| 8 | country_data/England.csv | Daily observations of vaccination in England. |
| 9 | country_data/China.csv | Daily observations of vaccination in China. |

Table 1: List of data files

To complete the tasks in the following sections, you are required to review and analyse the dataset that is available in the named files.

Part B: Designing the Database (10%)

Task B.1 Produce an ER diagram for a relational database that will be able to store the given dataset.

It is important to note that the given CSV files do not necessarily represent a good design for a relational database. It is your task to design a database that adheres to the good design principles taught throughout the course. This means your database schema will not match the structure of the CSV files and, therefore, you will need to manipulate the structure of the dataset (and not the data itself) to import it into your database. Importing the data is required to complete Task C.2.

The ER diagram must be produced with Lucidchart, similar to the exercises completed in the course. UML notation is expected; other notations will not be accepted. Include a high-quality image of your model, which can be produced using the Export function of Lucidchart.

You are also required to transform the ER diagram into a database schema that will be used in the next part of the assignment.

Creating a good database design typically involves some database normalisation activities. You should document your normalisation activities and support them with good reasoning. This typically involves explaining what the initial design was, what the problem was, and what changes have been made to rectify the issue.

The expected outcome of completing this task is one PDF file named model.pdf containing the following sections.

1. Database ER diagram and, if needed, a reasonable set of assumptions.

2. Explanation of normalisation challenges and the resulting changes.

3. Database schema.
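
As one illustration of the kind of normalisation issue that can be documented, the sketch below assumes a CSV row stores several vaccine names in a single comma-separated column, which violates first normal form. The column names `location` and `vaccines`, the sample values, and the helper `split_multivalued` are assumptions made for illustration, not part of the assignment specification.

```python
# Hypothetical sample rows: column names and values are assumptions for illustration.
raw_rows = [
    {"location": "Australia", "vaccines": "Oxford/AstraZeneca, Pfizer/BioNTech"},
    {"location": "China", "vaccines": "Sinopharm/Beijing, Sinovac"},
]


def split_multivalued(rows, key, column, sep=","):
    """Turn one row with a list-valued column into one (key, value) pair per value."""
    pairs = []
    for row in rows:
        for value in row[column].split(sep):
            value = value.strip()
            if value:  # skip empty fragments such as a trailing separator
                pairs.append((row[key], value))
    return pairs


# Each pair could become one row of a separate linking relation.
location_vaccine = split_multivalued(raw_rows, "location", "vaccines")
for pair in location_vaccine:
    print(pair)
```

Each resulting pair holds a single atomic value, so it can be stored as one tuple in a linking relation rather than as a list inside one attribute.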

Part C: Creating the Database and Importing Data (10%)

Task C.1 Produce one SQL script file named database.sql. This script file must contain all the SQL statements necessary to create all the database relations and their corresponding integrity constraints as per your proposed design in Part B. The script file must run without any errors in SQLite Studio and contain the comments necessary to separate the various relations. Note that this script is not supposed to store any data in the relations.

The expected outcome of completing this task is one script file with the specific name of database.sql.
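
A script of this kind can also be sanity-checked outside SQLite Studio. The sketch below uses Python's standard `sqlite3` module against an in-memory database. The two relations `Country` and `Observation`, their columns, and their constraints are hypothetical and stand in for whatever your own design contains.

```python
import sqlite3

# Hypothetical two-relation schema; table and column names are assumptions.
DDL = """
-- Relation: Country
CREATE TABLE Country (
    iso_code TEXT PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);

-- Relation: Observation
CREATE TABLE Observation (
    iso_code           TEXT NOT NULL,
    observation_date   TEXT NOT NULL,
    total_vaccinations INTEGER CHECK (total_vaccinations >= 0),
    PRIMARY KEY (iso_code, observation_date),
    FOREIGN KEY (iso_code) REFERENCES Country (iso_code)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(DDL)

# List the relations the script created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)

# The foreign key should reject an observation for a country that does not exist.
try:
    conn.execute("INSERT INTO Observation VALUES ('XXX', '2021-04-01', 10)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(fk_enforced)
```

Running the script against an empty database each time is a simple way to confirm that it creates every relation without errors and that the declared constraints behave as intended.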

Task C.2 Create a database file named Vaccinations.db. Import the given dataset into your database.

To complete this task, you may need to change the format of the CSV files to match the attributes of your designed database. You can use a spreadsheet editor such as Microsoft Excel.

The next step is to import the spreadsheets into the database you create in SQLite Studio. To complete this task, use the menu option Tools – Import in SQLite Studio.

The expected outcome of completing this task is one database file named Vaccinations.db, which must contain all the data that is stored in the CSV files named in Table 1.
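
The assignment asks for the import to be done through SQLite Studio. Purely as an illustration of what an import involves, the sketch below loads a small reshaped CSV into a table with Python's standard `csv` and `sqlite3` modules. The header names, the invented sample values, the `Observation` table, and the helper `to_int_or_none` are all assumptions.

```python
import csv
import io
import sqlite3

# Stand-in for a reshaped CSV file; header names and values are invented.
csv_text = """iso_code,observation_date,total_vaccinations
AAA,2021-04-01,100
AAA,2021-04-02,
BBB,2021-04-01,250
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Observation (
        iso_code           TEXT NOT NULL,
        observation_date   TEXT NOT NULL,
        total_vaccinations INTEGER,
        PRIMARY KEY (iso_code, observation_date)
    )
""")


def to_int_or_none(text):
    """Map an empty CSV cell to NULL instead of storing an empty string."""
    return int(text) if text.strip() else None


reader = csv.DictReader(io.StringIO(csv_text))
rows = [
    (r["iso_code"], r["observation_date"], to_int_or_none(r["total_vaccinations"]))
    for r in reader
]

with conn:  # commits on success, rolls back if an insert fails
    conn.executemany("INSERT INTO Observation VALUES (?, ?, ?)", rows)

imported = conn.execute("SELECT COUNT(*) FROM Observation").fetchone()[0]
missing = conn.execute(
    "SELECT COUNT(*) FROM Observation WHERE total_vaccinations IS NULL").fetchone()[0]
print(imported, missing)
```

Comparing the imported row count with the number of data lines in the source file is a quick check that nothing was dropped, whichever import tool is used.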

Part D: Data Retrieval and Visualisation (15%)

Now that you have created and populated a database, it is time to create some queries to investigate the data in various ways. In addition to writing the required queries, you are also asked to produce data visualisations for the results of your queries.

The tasks in this section represent the queries that must be supported. Each query must consist of one SQL statement. It is acceptable to use several nested queries, combine several SELECT statements with various operators, etc. However, it is not acceptable to have multiple separate queries for a task.

After you have written each query, you are expected to produce a data visualisation for each result set. You have the freedom to choose the tool for creating your visuals (e.g., Excel, Google Charts, Tableau) as well as the visualisation techniques (e.g., charts, plots, diagrams, maps). Completing this portion of the work will require that you understand the nature of the results of each query, undertake research to choose a visualisation tool you are comfortable with, decide on the best technique to visually represent each result set, and produce the visualisation. Answers to tasks in Part D that are not supported by a visualisation can achieve up to 80% of the grade associated with each task.

The expected outcome of completing this task is as follows.

1. One SQL script file named Queries.sql containing all the queries developed for the tasks in this section. It is important that you add comment lines to separate the queries and indicate which task they belong to. Note that valid SQL comments must not generate errors in SQLite Studio. The marker of your work will use this file to execute and test your queries.

2. A PDF file named QuerieResults.pdf containing the following elements for each task.

a. The SQL query

b. A snapshot of the first 10 results of your query. The snapshot must also show the total number of results retrieved by the query. A sample snapshot is provided below for your reference.

Figure 1: Sample results snapshot with total rows

c. Data visualisation

List of Tasks

Task D.1 For any two given dates (i.e., you can assume any two dates, e.g., 1 April and 3 April), list the dates, the total number of vaccines administered on each observation date in every country, and the difference between the administered vaccines. Each row in the result set must have the following structure. (Note: OD2 is after OD1)

[Observation Date 1 (OD1), Country Name (CN), Administered Vaccine on OD1 (VOD1), Observation Date 2 (OD2), Administered Vaccine on OD2 (VOD2), Difference of totals (VOD1-VOD2)]

Figure 2: Column Headers in the Result Set for Task D.1

Task D.2 Find the countries whose cumulative number of COVID-19 doses administered is greater than the average number of doses administered across all countries. Produce a result set containing the name of each country and the cumulative number of doses administered in that country. Each row in the result set must have the following structure.
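
The requirement that each task be answered by one SQL statement can be met with a nested query. The sketch below shows the general "greater than the average" pattern on a toy table, run through Python's standard `sqlite3` module. The table `CountryTotal`, its columns, and the values are invented for illustration and do not reflect the real dataset or any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table holding one cumulative total per country.
    CREATE TABLE CountryTotal (name TEXT PRIMARY KEY, doses INTEGER NOT NULL);
    INSERT INTO CountryTotal VALUES ('A', 100), ('B', 300), ('C', 800);
""")

# One statement: the scalar subquery computes the average once,
# and the outer query keeps only the rows above it.
query = """
    SELECT name, doses
    FROM CountryTotal
    WHERE doses > (SELECT AVG(doses) FROM CountryTotal)
    ORDER BY doses DESC
"""
above_average = conn.execute(query).fetchall()
print(above_average)
```

With these invented values the average is 400, so only country 'C' is returned.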


Task D.3 Produce a list of the 10 countries with the largest number of vaccine types, with the type of vaccines (e.g., Oxford/AstraZeneca, Pfizer/BioNTech) administered in each country. For a country that has administered several types of vaccine, the result set is required to show several tuples, reporting each type of vaccine in a separate tuple. Each row in the result set must have the following structure.

Task D.4 There are different data sources used to produce the dataset. Produce a report showing the total number of vaccines administered according to each data source (i.e., each unique URL). Order the result set by source name and URL. Each row in the result set must have the following structure.

Task D.5 How do various countries compare in the speed of their vaccine administration? Produce a report that lists all the observation dates in 2022 and, for each date, lists the total number of people fully vaccinated in each of the 4 countries used in this assignment.

[Date, Australia, United States, England, China]
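
A result with one column per country is a pivot of row-oriented data. One common way to express that in a single SQLite statement is conditional aggregation, sketched below with Python's standard `sqlite3` module. The `Observation` table, its columns, and the sample values are assumptions for illustration, and other formulations (for example, joins per country) are equally possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; names and values are invented.
    CREATE TABLE Observation (
        country TEXT NOT NULL,
        observation_date TEXT NOT NULL,
        people_fully_vaccinated INTEGER,
        PRIMARY KEY (country, observation_date)
    );
    INSERT INTO Observation VALUES
        ('Australia', '2022-01-01', 10),
        ('England',   '2022-01-01', 20),
        ('Australia', '2022-01-02', 15),
        ('China',     '2021-12-31', 99);
""")

# Each CASE picks the value for one country; MAX collapses the group
# to that value, or to NULL when the country has no row on that date.
query = """
    SELECT observation_date AS "Date",
           MAX(CASE WHEN country = 'Australia' THEN people_fully_vaccinated END) AS "Australia",
           MAX(CASE WHEN country = 'United States' THEN people_fully_vaccinated END) AS "United States",
           MAX(CASE WHEN country = 'England' THEN people_fully_vaccinated END) AS "England",
           MAX(CASE WHEN country = 'China' THEN people_fully_vaccinated END) AS "China"
    FROM Observation
    WHERE observation_date BETWEEN '2022-01-01' AND '2022-12-31'
    GROUP BY observation_date
    ORDER BY observation_date
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

The date filter relies on dates being stored as ISO 8601 text (YYYY-MM-DD), which sorts in calendar order. Dates stored in another format would need converting first.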

