In this assignment, you will implement and evaluate machine learning techniques for a given data modelling problem. You will download the given dataset named CW1_data_202223.csv and the essential information about the dataset from Learning Central. Your tasks will include data exploration, data pre-processing, machine learning method selection and implementation, and model performance evaluation. In addition to the aforementioned tasks, you will write a concise report (around 1000 words excluding tables and figures, and a maximum of five pages in total including tables and figures) to summarise your work and provide an analysis and discussion of the results.
i) Data exploration [15%]: Conduct an exploratory inspection of the dataset to provide a good understanding of the data characteristics.
ii) Data pre-processing [30%]: Carry out well-thought-out pre-processing procedures to prepare the data in a form that is likely to lead to better performance.
iii) Model implementation [30%]: Select three representative classification methods with a clear justification of your choice. Implement and optimise the classifiers for your chosen classification methods.
iv) Performance evaluation [10%]: Organise the data in a suitable form to ensure that the trained classifiers provide reliable results. Evaluate the models using suitable performance metrics.
v) Result analysis and discussion [15%]: Provide an insightful analysis and comparison of the results obtained from the above steps, and draw conclusions based on the results and analysis.

Learning Outcomes Assessed

Completion of this coursework allows students to demonstrate that they can:
- Implement and evaluate machine learning methods to solve a given task
- Explain the basic principles underlying common machine learning methods
- Choose an appropriate machine learning method and data pre-processing strategy to address the needs of a given application setting
- Reflect on the importance of data representation for the success of machine learning methods
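
As an illustration of what "suitable performance metrics" in the performance evaluation task might mean, the sketch below computes accuracy alongside per-class precision, recall and F1 using only the Python standard library. It is a minimal example under stated assumptions, not a required method: the function names `accuracy` and `per_class_metrics` are hypothetical, and the labels are made up for illustration and do not come from CW1_data_202223.csv.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Made-up labels for illustration only (NOT taken from the coursework dataset).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print("accuracy:", accuracy(y_true, y_pred))
for label, (precision, recall, f1) in per_class_metrics(y_true, y_pred).items():
    print(f"class {label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up example the overall accuracy is high, yet only half of the class 1 samples are recovered. This is one reason to report per-class metrics next to accuracy when the classes are unevenly represented.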
Feedback and suggestion for future learning

Feedback on your coursework will address the above criteria. Feedback and marks will be returned on 10 February 2023 via Learning Central. There will be an opportunity for individual feedback during an agreed time. Feedback for this assignment will be useful for subsequent skills development, such as data science, natural language processing and deep learning (which will be studied during the second semester).

Submission Instructions

Your submission must include:
• A Jupyter Notebook (.ipynb) file containing all your code and execution outputs/figures.
• A typeset PDF report (see next section for details).

Ensure that your student number is included on the report and as a comment at the top of each Python file that makes up your submission.

You must submit two files to Learning Central, each named using your student number in the format given in the table below (e.g., C1234567_CW1_code.ipynb):

Description | Type | Name
Python Code | Compulsory. One Jupyter Notebook (.ipynb) file. | [student number]_CW1_code.ipynb
Report | Compulsory. One PDF (.pdf) file containing your report. | [student number]_CW1_report.pdf

Before submitting your Jupyter Notebook file (.ipynb), make sure to restart the kernel and execute each cell so that all outputs and figures are visible.

Any code submitted will be run in Python 3 and must be submitted as stipulated in the instructions above.

Any deviation from the submission instructions above (including the number and types of files submitted) will result in a 20% reduction in marks for that assessment or question part.

Staff reserve the right to invite students to a meeting to discuss coursework submissions.
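
The file naming rules in the submission instructions can be checked mechanically before uploading. The sketch below is one possible way to do so, assuming only the two file name patterns stated above; the helper names `expected_files` and `check_submission` are hypothetical, and the student number is the example one from this brief.

```python
def expected_files(student_number):
    """Build the two file names required by the submission instructions."""
    return {
        f"{student_number}_CW1_code.ipynb",
        f"{student_number}_CW1_report.pdf",
    }


def check_submission(filenames, student_number):
    """Return a list of problems; an empty list means the names match exactly."""
    expected = expected_files(student_number)
    submitted = set(filenames)
    problems = [f"missing: {name}" for name in sorted(expected - submitted)]
    problems += [f"unexpected: {name}" for name in sorted(submitted - expected)]
    return problems


# Example with a wrongly named report (student number taken from the brief's example).
print(check_submission(["C1234567_CW1_code.ipynb", "report.pdf"], "C1234567"))
```

An empty list indicates that exactly the two expected files are present. The check covers file names only and does not confirm that the notebook has been re-run or that the report meets the length limits.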