CITS1401 Computational Thinking with Python Project 2 Semester 1 2022
Project 2:
You should construct a Python 3 program containing your solution to the following problem and submit your program electronically on Moodle. The name of the file containing your code should be your student ID e.g. 12345678.py. No other method of submission is allowed. Your program will be automatically run on Moodle for sample test cases provided in the project sheet if you click the “check” link. However, your submission will be tested thoroughly for grading purposes after the due date. Remember you need to submit the program as a single file and copy-paste the same program in the provided text box. You have only one attempt to submit so don’t submit if you are not satisfied with your attempt. All open submissions at the time of the deadline will be automatically submitted. There is no way in the system to open the closed submission and reverse your submission.
You are expected to have read and understood the University's guidelines on academic conduct. Following this policy, you may discuss with other students the general principles required to understand this project, but the work you submit must be the result of your own effort. Plagiarism detection, and other systems for detecting potential malpractice, will therefore be used. Besides, if what you submit is not your own work then you will have learned little and will therefore be likely to fail the final exam.
You must submit your project before the submission deadline listed above. Following UWA policy, a late penalty of 5% will be deducted for each day (24 hours) after the deadline that the assignment is submitted. No submissions will be allowed more than 7 days after the deadline, except in approved special consideration cases.
Overview
Congratulations! The researchers at UWA were very impressed by your 3D facial analysis skills in Project 1. They have decided to seek your help in another exciting project. They analyse 3D distances between significant facial landmarks to perform face recognition. They also analyse the relationship of these distances with certain syndromes, such as autism. For example, last month they published a research paper which found that the parents of autistic children had larger 3D facial distances compared to the normal adult population. You can find out more about their work here and here.
The researchers now want to analyse ten 3D Euclidean distances between 15 significant facial landmarks. These distances on one face can then be used to calculate similarity with other faces in the data set to see which faces are closer to (or look like) the reference face. Table 1 provides the details of each landmark while Figure 1 shows their location on the face. Table 2 gives you the details of the distances to be calculated. Remember these distances are between the landmarks mentioned in Table 1.
In this project, you are required to write a computer program that can read the data from a CSV (comma separated values) file provided to you. The file contains the 3D coordinates in X, Y and Z axes for the 15 facial landmarks mentioned in Table 1 for each adult. For simplicity, the landmark abbreviations are provided in the CSV file instead of their full names. Your task is to write a program which fulfills the following requirements.
Table 1: Facial landmarks' identification numbers and their details.
Landmark   Location
Ex_L       Left Outer eye corner
En_L       Left Inner eye corner
N          Midline of nasal root @ nasofrontal structure
En_R       Right Inner eye corner
Ex_R       Right Outer eye corner
Prn        Nose Tip
Al_L       Lateral-most point of left nose contour
Al_R       Lateral-most point of right nose contour
Sbal_L     Base of left nose contour
Sbal_R     Base of right nose contour
Ch_L       Left outer mouth corner
Ch_R       Right outer mouth corner
Sn         Root of nasal midline
FT_L       Left forehead
FT_R       Right forehead
Figure 1: Facial landmarks’ locations on the face.
Table 2: List of distances to be calculated in this project. For example, Forehead width is abbreviated as ‘FW’ and is the 3D Euclidean distance between Left Forehead (FT_L) and Right Forehead (FT_R). Similarly Inner canthal width, abbreviated as ‘ICW’ is the distance between the two inner eye corners: En_L and En_R.
Facial distance            Abvn   LM1      LM2
Forehead width             FW     FT_L     FT_R
Outer-canthal width        OCW    Ex_L     Ex_R
Left Eye fissure length    LEFL   Ex_L     En_L
Right Eye fissure length   REFL   En_R     Ex_R
Inter canthal width        ICW    En_L     En_R
Nose width                 NW     Al_L     Al_R
Alar-base width            ABW    Sbal_L   Sbal_R
Mouth width                MW     Ch_L     Ch_R
Nasal bridge length        NBL    N        Prn
Nose height                NH     N        Sn
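As an illustration only, the landmark pairs of Table 2 can be held in a dictionary and each distance computed with the usual 3D Euclidean formula, sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2). The names DISTANCE_PAIRS, euclidean_3d and face_distances are hypothetical helpers chosen for this sketch and are not required by the project; the sketch also assumes each face is already available as a dictionary from landmark abbreviation to an (x, y, z) tuple.

```python
# Landmark pairs for each distance, transcribed from Table 2.
DISTANCE_PAIRS = {
    "FW": ("FT_L", "FT_R"),
    "OCW": ("Ex_L", "Ex_R"),
    "LEFL": ("Ex_L", "En_L"),
    "REFL": ("En_R", "Ex_R"),
    "ICW": ("En_L", "En_R"),
    "NW": ("Al_L", "Al_R"),
    "ABW": ("Sbal_L", "Sbal_R"),
    "MW": ("Ch_L", "Ch_R"),
    "NBL": ("N", "Prn"),
    "NH": ("N", "Sn"),
}


def euclidean_3d(p, q):
    """Straight-line distance between two (x, y, z) points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2) ** 0.5


def face_distances(landmarks):
    """Return the ten unrounded distances for one face.

    landmarks: dict mapping a landmark abbreviation to its (x, y, z) tuple
    (an assumed in-memory layout, not one prescribed by the project sheet).
    """
    return {
        abbr: euclidean_3d(landmarks[lm1], landmarks[lm2])
        for abbr, (lm1, lm2) in DISTANCE_PAIRS.items()
    }
```

Keeping the pairs in one dictionary means the distance keys come out in the same order as Table 2, which matches the key order shown in the sample outputs.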
Specification: What your program is required to do
Input:
Your program must define the function main with the following syntax: def main(csvfile, adultIDs):
The input arguments to this function are:
• csvfile: The name of the CSV file containing the facial record which needs to be analysed. Below are the first two rows of the sample file.
Adult ID   Landmark   X          Y         Z
A0001      Ex_L       -32.8506   -39.073   4.64672
The first row of the CSV file contains the following headers:
- Adult ID: The de-identified ID of an adult.
- Landmark: The facial landmark as mentioned in Table 1.
- “X”, “Y” and “Z”: The 3D location of the landmark in X, Y and Z axes respectively.
The number of adults to analyse (and therefore the number of rows in the CSV file) is not known in advance. The order of the columns is also not fixed, so your program needs to check the column headings to retrieve the respective information. The columns ‘Adult ID’ and ‘Landmark’ contain string data while the remaining data is numeric.
Note: The X, Y and Z coordinates are in millimetres and need to be within the bounds [-200, 200].
• adultIDs: A list containing the two adult IDs which need to be analysed. Remember that each ID is a string and is case insensitive.
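The input handling described above could be sketched as follows. This is a minimal illustration under several assumptions: fields contain no quoted commas (so a plain split is enough), header names are matched case-insensitively, and adult IDs are normalised to upper case for comparison. The function names parse_faces, in_bounds and load_faces are hypothetical. What the program should do when a coordinate falls outside [-200, 200] is not stated in this section, so the sketch only provides a check.

```python
def parse_faces(lines):
    """Group landmark coordinates by adult ID from CSV text lines.

    Returns {ADULT_ID_IN_UPPER_CASE: {landmark: (x, y, z)}}.
    """
    rows = [line.strip().split(",") for line in lines if line.strip()]
    header = [name.strip().lower() for name in rows[0]]
    # Look columns up by heading because their order is not guaranteed.
    col = {name: header.index(name)
           for name in ("adult id", "landmark", "x", "y", "z")}
    faces = {}
    for row in rows[1:]:
        adult_id = row[col["adult id"]].strip().upper()  # IDs are case insensitive
        landmark = row[col["landmark"]].strip()
        point = tuple(float(row[col[axis]]) for axis in ("x", "y", "z"))
        faces.setdefault(adult_id, {})[landmark] = point
    return faces


def in_bounds(point, low=-200.0, high=200.0):
    """True when every coordinate lies inside the stated bounds."""
    return all(low <= value <= high for value in point)


def load_faces(csvfile):
    """Read the named CSV file and parse it with parse_faces."""
    with open(csvfile, "r") as handle:
        return parse_faces(handle)
```

With this layout, the two requested IDs can be looked up with adultIDs[0].upper() and adultIDs[1].upper(), regardless of the case used by the caller or in the file.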
Output:
The function is required to return the following outputs in the order provided below. For ease of description, we will refer to the input adult IDs as “F1” and “F2”.
- OP1: A list of two dictionaries containing the facial distances (as mentioned in Table 2) for each face F1 and F2 respectively. The keys in the dictionaries are the abbreviations (case-sensitive) of the distances (e.g. FW, ICW etc.) and their values contain the 3D Euclidean distance between the corresponding landmarks (see last two columns of Table 2). The formula to calculate the Euclidean distance between two 3D landmarks is given at the end of this project sheet.
- OP2: The cosine similarity between faces F1 and F2 based on the ten distances calculated above. The formula to calculate cosine similarity is provided at the end of this project sheet.
- OP3: A list of two sequences of tuples. The first sequence describes the five faces closest (most similar) to F1, excluding face F2, based on cosine similarity calculated using the ten distances. The second sequence has the same information for face F2, excluding face F1. In each tuple, the first member is the “Adult ID” of the face while the second member is the cosine similarity between this face and the reference face. The sequences must be arranged in decreasing order of cosine similarity, but if the cosine similarity for two faces is exactly the same then they should be arranged in alphabetical order of their Adult ID.
- OP4: A list of two dictionaries containing the average of each of the ten facial distances (see Table 2) of the closest five faces (Output 3) for the reference faces F1 and F2. The keys in the dictionaries are the abbreviations of the distances (e.g. FW, ICW etc.) and each value is the average of that 3D Euclidean distance over the five closest faces.
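The similarity and ranking steps for OP2 and OP3 might look like the sketch below. It assumes the standard definition of cosine similarity (dot product divided by the product of the vector lengths), that a reference face is never compared with itself, and that ties are judged on the unrounded similarity values; none of these three points is spelled out in this section. KEYS, cosine_similarity and closest_faces are hypothetical names.

```python
KEYS = ["FW", "OCW", "LEFL", "REFL", "ICW", "NW", "ABW", "MW", "NBL", "NH"]


def cosine_similarity(d1, d2):
    """Cosine similarity of two distance dictionaries over the ten keys."""
    a = [d1[k] for k in KEYS]
    b = [d2[k] for k in KEYS]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def closest_faces(reference_id, excluded_id, all_distances, count=5):
    """Return [(adult_id, similarity), ...] for the most similar faces.

    all_distances: {adult_id: distance dictionary} for every face in the file.
    Similarities are left unrounded here.
    """
    scored = [
        (adult_id, cosine_similarity(all_distances[reference_id], dists))
        for adult_id, dists in all_distances.items()
        if adult_id not in (reference_id, excluded_id)
    ]
    # Highest similarity first; exact ties fall back to alphabetical adult ID.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:count]
```

Sorting on the pair (negative similarity, adult ID) handles both ordering rules in a single pass. Applied to the two rounded OP1 dictionaries shown in the Examples section, this definition should give a value close to the listed OP2 of 0.9932.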
All returned numeric outputs (both in lists and individual) must contain values rounded to four decimal places (if required to be rounded off). Do not round the values during calculations and round them only at the time that you save them into the final output variables.
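One way to respect this rounding rule is to keep every intermediate value unrounded and apply round only while building the returned objects. The sketch below also shows the per-distance averaging needed for OP4. The helper names average_distances, round_dict and round_pairs are hypothetical, and the inputs are assumed to be the unrounded distance dictionaries and (adult ID, similarity) pairs produced earlier in the program.

```python
def average_distances(distance_dicts):
    """Mean of each distance over several faces, left unrounded."""
    keys = distance_dicts[0].keys()
    return {
        k: sum(d[k] for d in distance_dicts) / len(distance_dicts)
        for k in keys
    }


def round_dict(values, places=4):
    """Round dictionary values only when preparing the final output."""
    return {k: round(v, places) for k, v in values.items()}


def round_pairs(pairs, places=4):
    """Round the similarity in each (adult_id, similarity) tuple."""
    return [(adult_id, round(score, places)) for adult_id, score in pairs]
```

Because round returns a float, values such as 33.092 appear without trailing zeros, which is consistent with the sample outputs.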
Examples:
Download the sample_face_data.csv file from the Project 2 folder on LMS or Moodle. Some examples of how you can call your program from the Python shell (and examine the results it returns) are:
>>> OP1, OP2, OP3, OP4 = main('sample_face_data.csv',['R7033', 'P1283'])
The output variables returned are:
>>> OP1
[{'FW': 128.0695, 'OCW': 93.9636, 'LEFL': 33.092, 'REFL': 32.6327, 'ICW': 32.1305, 'NW': 31.7842, 'ABW': 18.3002, 'MW': 46.8504, 'NBL': 40.0941, 'NH': 62.7722}, {'FW': 123.9306, 'OCW': 100.081, 'LEFL': 34.4401, 'REFL': 32.3075, 'ICW': 37.3563, 'NW': 43.815, 'ABW': 28.1968, 'MW': 63.467, 'NBL': 38.416, 'NH': 60.9737}]
>>> OP2
0.9932
>>> OP3
[[('L8682', 0.9995), ('C9721', 0.9993), ('J0951', 0.9993), ('K5219', 0.9992), ('N0889', 0.9991)], [('A5474', 0.9991), ('D2742', 0.9987), ('H1286', 0.9982), ('G8293', 0.9976), ('U0216', 0.9975)]]
>>> OP4
[{'FW': 125.5241, 'OCW': 91.6592, 'LEFL': 31.7918, 'REFL': 31.2151, 'ICW': 31.2196, 'NW': 32.9661, 'ABW': 19.4147, 'MW': 48.1833, 'NBL': 39.1743, 'NH': 61.2486}, {'FW': 123.5339, 'OCW': 98.5101, 'LEFL': 31.9901, 'REFL': 32.1118, 'ICW': 36.7311, 'NW': 42.4031, 'ABW': 27.3566, 'MW': 57.6451, 'NBL': 42.074, 'NH': 63.3527}]