Programming Project 1
JIT - Building IT Systems November 2022
Assignment 2: Our World (Weight: 30%
This task requires you to use several of the technologies introduced in this unit: programming (Python), database queries (SQLite) and web development (HTML).
This task involves writing a Python program which interrogates an SQLite database and generates HTML documents as its output. It is thus an excellent example of the main theme of this unit, i.e., the way different “computer languages” can be used together to build new IT systems.
The aim of this task is to develop a program that creates HTML documents to provide a visual representation of world population statistics according to data stored in a database. More specifically: given a country, show all the cities in that country that have a population of at least a certain value (threshold).
For example, the screenshot in Figure 1 shows the HTML document produced to represent the cities in Australia with a population of at least 100,000.
Figure 1: HTML output file If you browse the data in the database using the DB Browser SQLite, you will see that there are 14 Australian cities included in the database.
Querying the database in the DB Browser SQLite for cities in Australian with a population >= 100,000 returns 13 rows as shown in Figure 2.
Notice that those 13 cities in the results set are displayed in the HTML page generated in Figure 1.
Information about the query is displayed at the top of the HTML (eg the country name, the threshold and the number of cities with a population of at least that threshold.
The names of the cities are displayed in the body of the page, and cities with a larger population are displayed in a bigger font size - making a comparison of population sizes immediately obvious to the reader.
The supplied Python template file includes a number of tests that provide the country name, threshold and HTML filename prefix. For example, Test 2 was used to produce the output in Figure 1: Figure 2: SQL query - Australian Cities >= 100,000 Test 2: Aus cities > 100000
show_population(['Australia'], 100000, 'Test02') Notice that the first parameter is a list of string, with each string representing one country. So your program must accept a list of (one or more) countries, and produce a set of HTML pages: one per country. For example, see Test 5 which contains as its first parameter a list of five countries.
Test 5 - 4 countries
show_population(['Germany', 'New Zealand', 'Austria', 'Australia'], 1000000, 'Test05') In this case, five HTML pages must be generated – each containing a visual representation of the country's cities and those cities' population as described above.
An SQLite database file named world.db containing the data to use for this task has been supplied. (This database is a sample database acquired for teaching purposes only, and is not meant to represent a complete or up-to-date set of values.) There are three tables in this database: City, Country and CountryLanguage. Among them, two tables are needed for this task, City and Country. The definition of these two tables is explained below.
The City table provides information about each city in the database. See Figure 3. Each row in the table consists of five fields. Only three of the fields in this table are of interest to us in this assessment task: Name, the full name of the city; CountryCode, a code representing the country it belongs to; and Population, the number of people living in this city. These (and other) fields are defined by the schema below: ID numeric, Name text, CountryCode text, District text, Population numeric, PRIMARY KEY (ID)
Figure 3: City Table
The Country table provides information about the countries those cities belong to. See Figure
- Each row in the table consists of many fields, only two of which are of interest to us in this assessment task: Code, a unique identifier for the country (the primary key in this table), and Name, the full name of the country. These (and some of the other) fields are defined by the schema below: Code Text, Name Text, Continent Text, Region Text, SurfaceArea Numeric, … PRIMARY KEY (Code)
Figure 4: Country Table Importantly, note that the two tables have one field in common: Code in the Country table, and CountryCode in the City table. (Hint: you will need to join the tables to produce all of the required information from your SQLite query.)
Given a list of countries, we want to know the population of each country's cities that are at or above a certain threshold. The required output is a set of HTML pages, one per country. Each HTML page must contain a visual representation of the country's city populations. Cities with a larger population must be displayed in a bigger font. You are required to develop a Python program that accesses the supplied SQLite database described above to generate HTML documents. The calling code to your program specifies:
- one or more countries (supplied as a list);
- a population threshold (supplied as an integer value); and
- the output HTML filename prefix to be used (supplied as a string).
For example, the following tests have been included in the template, each showing a call to the function show_population, with three arguments supplied (a list of one or more strings representing the countries, a number representing the population threshold, and a string representing the output HTML filename prefix): """ Test 1: China cities > 1m
show_population(['China'], 1000000, 'Test01')
Test 2: Aus cities > 100000
show_population(['Australia'], 100000, 'Test02') Test 3 - 2 countries show_population(['China', 'Australia'], 1000000, 'Test03') # Test 4 - 3 countries show_population(['China', 'India', 'United States'], 10000, 'Test04') Test 5 - 4 countries show_population(['Germany', 1000000, 'Test05')
Test 6 - 6 countries
show_population(['Indonesia', 'Japan', 'Thailand', 'Taiwan', 'Ireland', 'United Kingdom'], 1000, 'Test06') Test 7 - empty results set show_population(['Australia'], 5000000, 'Test07') """
The result of each test is a collection of one or more HTML documents, one per country, showing all cities in that country with a population at or above the given threshold. The name of each HTML document should reflect the test used to produce it, including the prefix supplied (eg "Test06") followed by the country name (eg "Indonesia"). For Test06, note that there will be six HTML documents created, each named with the prefix followed by the country name.
For example the HTML documents generated from Test 3 should be similar to those shown in Figure 5 and Figure 6.
Figure 5: Test 3 Output file: "Test03_China.html"
Figure 6: Test 3 Output file: "Test03_Australia.html" The idea is that the more populated cities are displayed in larger fonts. For instance, Shanghai is more populated than Peking, while Zibo is considerably less populated. In Figure 3, the font used for Shanghai is obviously larger than that of Peking and Zibo. You should also use different colours to display each city. Although not a requirement, it would be useful to display similarly sized cities in the same colour. In any event, and in each case in the figures above, the relative populations of the cities in each of the specific countries is shown clearly by the font sizes. Another feature of the documents shown in Figure 5 and Figure 6 is that on the top of the page you must display the country name, e.g., "Cities of China", and the number of cities in this country, e.g., 35 in this case. In addition, your program must hyperlink each of the pages together in the order of the list of countries supplied, to make it easy for the user to search through them. In Figure 5 and Figure 6 above we can see hyperlinks at the bottom of the page to the previous or next categories in the list. However, no ‘previous’ link is produced for the first category in the country list and no ‘next’ link is produced for the last page in a list. For example, if the given list of countries is ["Australia", "China", "India"] then your function must generate three HTML pages, one for Australia, one for China, and one for India. In the China page, the ‘previous’ link should take us to the Australia page and the ‘next’ link should take us to the India page. However, for the Australia page, there is no ‘previous’ link, and for the India page there is no ‘next’ link.
Develop a Python program which generates such pages using the data in the SQLite database world.db. As illustrated in the Python template file accompanying these instructions, you must define a function called show_population which, given a list of countries, a population threshold and a title, produces HTML files that can be displayed in a web browser, one per country. As the database changes from time to time (to include updates of world population figures etc.) your program must work for any given database of the same structure. That means, you may not hard-wire your solution to produce the results expected with this particular set of data. The HTML pages produced by your program must be a result of your program querying the database and producing the HTML content according to the results returned from the database query. The specific things you must do to complete this portfolio task are as follows. NB: This is a large task. Although it is possible to complete the task using only one Python function, show_population, we recommend that you break your solution down into several functions to make your program code manageable and readable. You must use the supplied template file OurWorld_template.py as a starting point.
- Write a Python function named show_population that has three parameters: (a) a list of strings representing one or more countries, (b) an integer value representing a population threshold, and (c) a string representing the name for this particular multi-page document. For example, if we execute the function call show_population(['China', 'Test04')
the first parameter (a list of three values) indicates that we want three pages created, one each for China, India and the United States. Each of these pages must show the cities for the particular country, whose population is equal to or above the threshold of 10000 (the second parameter). The third parameter is simply a name used to uniquely identify this collection of HTML pages. You should use it together with the country name to create the filename for the each HTML file. In this particular example your program must produce the following three HTML files: Test04_China.html Test04_India.html Test04_United States.html Cities must be displayed in alphabetical order regardless of their population. At the bottom of each page, you must have hyperlinks to the previous page and next page in the list, if any. The font size used for each city must reflect its relative population as described above.
- Some queries will produce no results at all. In this case you should still produce an HTML page, even if there are no cities listed. An example is showed in Figure 7 below (which is one of the pages produced by Test 7). Do not change the supplied tests. However, you may of course include additional tests.
Figure 7: A page with no results returned
Before you can begin this task you must ensure that you have access to SQLite DB Browser so that you can work with the supplied database. You must also have access to the necessary SQLite-Python module so that you can call SQLite functions from Python code.
Given a country, you will need to execute a query on the database, to get the list of cities with a population at least as high as the threshold, and that population figure. The query involves a join of both the country and city tables. It is recommended that you debug your database queries in the DB Browser before attempting to incorporate them into your Python program.
Use the screenshots shown above to guide your solution, but you do not need to duplicate the precise choices of font sizes and colours. Importantly, however, your SQL query must return exactly the same cities and display their relative populations in a fashion similar to the examples above.
This is a large program, so it is suggested that you tackle it incrementally. Before you even begin coding you should use the DB Browser to determine exactly what query your Python program needs to execute. Then develop code to generate just one HTML page. Finally, write the code to iterate over a list of countries to generate all documents.
You need to work out how to define the font size and colour for the text to be displayed on web pages. This can be done by changing the font’s ‘style’. Refer to the lecture demonstrations and workshop exercises for how to do this.
Deliverables You should develop your solution by completing and submitting a single .py file containing:
- a function show_population which has all the characteristics defined above. When executed in the presence of an appropriately configured SQLite database, this program must produce the required HTML files.
- the completed “statement” at the beginning of the Python file to confirm that this is your own individual work including your name and student number in the places indicated. I will assume that a submission without a completed statement is not your own work and will not mark it. Do not submit any SQL or HTML files. Apart from working correctly, your program code must be well-presented and easy to understand, thanks to (sparse) commenting that explains the purpose of significant code segments and helpful choices of variable and function names. Professional presentation of your code will be taken into account when marking this task. If you are unable to solve the whole problem, submit whatever parts you can get working. You will receive partial marks for incomplete solutions.