BUSS6002 Data Science in Business - Assignment: Compare linear basis function (LBF) models


Instructions

BUSS6002 Assignment

October 10, 2023

  • Due: at 23:59 on Friday, October 27, 2023 (end of week 12).

  • You must submit a written report (in PDF) with the following filename format, replacing STUDENTID with your own student ID: BUSS6002 STUDENTID.pdf.

  • You must also submit a Jupyter Notebook (.ipynb) file with the following filename format, replacing STUDENTID with your own student ID: BUSS6002 STUDENTID.ipynb.

  • There is a limit of 6 A4-pages for your report (including equations, tables, and captions).

  • All plots, computational tasks, and results must be completed using Python.

  • Each section of your report must be clearly labelled with a heading.

  • Do not include any Python code as part of your report.

  • All figures must be appropriately sized and have readable axis labels and legends (where applicable).

  • The submitted .ipynb file must contain all the code used in the development of your report.

  • The submitted .ipynb file must be free of any errors, and the results must be reproducible.

  • You may submit multiple times but only your last submission will be marked.

  • A late penalty applies if you submit your assignment late without a successful special consideration. See the Unit Outline for more details.

  • Generative AI tools (such as ChatGPT) may be used for this assignment but you must add a statement at the end of your report specifying how generative AI was used. E.g., "Generative AI was only used for editing the final report text."

  • Hint! It is highly recommended that you finish the week 10 tutorial before starting this assignment.

Description

In this assignment, you are conducting a study that compares the empirical performance of two families of basis functions for linear basis function (LBF) models: polynomial basis functions and radial basis functions. The aim is to investigate which family of basis functions is better suited for approximating highly nonlinear relationships between two scalar-valued variables.

More specifically, you are given four benchmark datasets: A, B, C, and D. Each dataset contains 5,000 observations of the response and predictor variables, which are named y and x, respectively. A scatter plot of each dataset is shown in Figure 1. Your task is to compare the performance of polynomial and radial basis function regression models on each of the datasets.

Figure 1: Benchmark Datasets

The LBF model being considered in your study is given by

y = φ(x)β + ε,

where φ(x) := [1, φ_1(x), ..., φ_p(x)], β := [β_0, β_1, ..., β_p]^⊤, and ε is random noise. For the set of basis functions {φ_i}_{i=1}^p, two choices are being investigated: the first choice is the family of polynomial basis functions,

φ_i(x) := x^i,

and the second choice is the family of radial basis functions,

φ_i(x) := exp(−(x − i/(p+1))^2 / s).
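
The two feature maps can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function names `polynomial_design` and `radial_design` are hypothetical, and the radial form used here (centres at i/(p+1) with scale s in the denominator of the exponent) is an assumed reading of the definition above, so check it against the original assignment sheet before relying on it.

```python
import numpy as np


def polynomial_design(x, p):
    """Return the n x (p+1) matrix with rows [1, x, x^2, ..., x^p]."""
    x = np.asarray(x, dtype=float).ravel()
    # increasing=True puts x**0 (the intercept column) first.
    return np.vander(x, N=p + 1, increasing=True)


def radial_design(x, p, s):
    """Return the n x (p+1) matrix with rows [1, phi_1(x), ..., phi_p(x)].

    Assumed form: phi_i(x) = exp(-(x - i/(p+1))**2 / s).
    """
    x = np.asarray(x, dtype=float).ravel()
    centres = np.arange(1, p + 1) / (p + 1)  # i/(p+1) for i = 1..p
    rbf = np.exp(-((x[:, None] - centres[None, :]) ** 2) / s)
    # Prepend the constant column so beta_0 acts as the intercept.
    return np.column_stack([np.ones_like(x), rbf])
```

Both functions return a matrix whose rows are φ(x) for each observation, so the same least-squares routine can be applied to either family.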

Before comparing the two basis function families, you must set the value of p for the polynomial regression model, as well as the values of p and s for the radial basis function regression model. These hyperparameter values should be selected for each dataset, using a validation set, by minimising the validation mean squared error (MSE).

In your study, the optimal value of p (for each basis function family) should be selected by exhaustively searching through an equally-spaced grid from 1 to 10, with a spacing of 1:

P := {1, 2, 3, ..., 10}.

For the radial basis functions, in addition to selecting p, you should also select the optimal value of s by exhaustively searching through another equally-spaced grid from 0.1 to 1, with a spacing of 0.1:

S := {0.1, 0.2, 0.3, ..., 1}.

That is, for each dataset, the optimal values must be determined for three hyperparameters in total: p_pol ∈ P, p_rad ∈ P, and s ∈ S, where p_pol denotes the number of polynomial basis functions (i.e., the degree of the polynomial) and p_rad denotes the number of radial basis functions.
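
The exhaustive search over P and S could look roughly like the sketch below. It is not the prescribed solution: the helper names (`design`, `validation_mse`, `grid_search`) are hypothetical, the radial basis form is the same assumed one (centres at i/(p+1), scale s), and β is estimated with `np.linalg.lstsq`, which is one reasonable choice when the design matrix is ill-conditioned.

```python
import numpy as np


def design(x, p, s=None):
    """Polynomial features if s is None, otherwise radial features (assumed form)."""
    x = np.asarray(x, dtype=float).ravel()
    if s is None:
        return np.vander(x, N=p + 1, increasing=True)
    centres = np.arange(1, p + 1) / (p + 1)
    rbf = np.exp(-((x[:, None] - centres[None, :]) ** 2) / s)
    return np.column_stack([np.ones_like(x), rbf])


def validation_mse(x_tr, y_tr, x_va, y_va, p, s=None):
    """Fit beta on the training split and return the MSE on the validation split."""
    beta, *_ = np.linalg.lstsq(design(x_tr, p, s), y_tr, rcond=None)
    resid = y_va - design(x_va, p, s) @ beta
    return float(np.mean(resid ** 2))


def grid_search(x_tr, y_tr, x_va, y_va):
    """Return the grid points in P (and S) that minimise the validation MSE."""
    P = range(1, 11)                     # {1, 2, ..., 10}
    S = [k / 10 for k in range(1, 11)]   # {0.1, 0.2, ..., 1.0}
    # min over tuples compares the MSE first; ties fall back to the smaller p (then s).
    best_pol = min((validation_mse(x_tr, y_tr, x_va, y_va, p), p) for p in P)
    best_rad = min(
        (validation_mse(x_tr, y_tr, x_va, y_va, p, s), p, s) for p in P for s in S
    )
    return {"p_pol": best_pol[1], "p_rad": best_rad[1], "s": best_rad[2]}
```

Building S as k/10 rather than with a floating-point step avoids grid values such as 0.30000000000000004 appearing in the results table.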

Once the optimal values of the hyperparameters are chosen for both basis function families, you will be able to compare the performance between the two using a test set (i.e., by comparing the test MSE between the two optimally selected models).
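
The description refers to a validation set and a test set without stating how they are formed. One possible approach, sketched below, is a random 60/20/20 split with a fixed seed so that results are reproducible. The split fractions, the seed, and the names `train_val_test_split` and `mse` are assumptions made for illustration only.

```python
import numpy as np


def train_val_test_split(x, y, frac_train=0.6, frac_val=0.2, seed=0):
    """Randomly partition (x, y) into training, validation, and test subsets."""
    x, y = np.asarray(x), np.asarray(y)
    rng = np.random.default_rng(seed)    # fixed seed keeps the split reproducible
    idx = rng.permutation(len(x))
    n_tr = int(round(frac_train * len(x)))
    n_va = int(round(frac_val * len(x)))
    # Whatever remains after the first two blocks becomes the test set.
    tr, va, te = np.split(idx, [n_tr, n_tr + n_va])
    return (x[tr], y[tr]), (x[va], y[va]), (x[te], y[te])


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted responses."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))
```

With 5,000 observations these fractions give 3,000 training, 1,000 validation, and 1,000 test points. The test subset should only be touched once, after the hyperparameters have been fixed.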

The files containing the datasets are listed in Table 1, which can be downloaded from the unit’s Canvas site. In each file, the dataset is organised as comma separated values, with each row being an observation and each column being a variable. The response values are in the first column and the corresponding predictor values are in the second column.

File             Description
dataset-a.csv    Benchmark dataset A
dataset-b.csv    Benchmark dataset B
dataset-c.csv    Benchmark dataset C
dataset-d.csv    Benchmark dataset D

Table 1: Files Provided
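
Reading one of these files might look like the sketch below. It assumes the file contains only numeric rows; if the real files start with a header row, pass `skiprows=1`. The function name `load_xy` is hypothetical, and the in-memory text at the end merely stands in for a file such as dataset-a.csv.

```python
import io

import numpy as np


def load_xy(path_or_file, skiprows=0):
    """Return (x, y) from a two-column CSV: response first, predictor second."""
    data = np.loadtxt(path_or_file, delimiter=",", skiprows=skiprows, ndmin=2)
    y = data[:, 0]  # first column: response
    x = data[:, 1]  # second column: predictor
    return x, y


# Usage with an in-memory stand-in for a real file path.
demo = io.StringIO("1.5,0.1\n2.5,0.2\n")
x, y = load_xy(demo)
```

Note the column order: the response comes first, so swapping the two columns by habit would silently regress x on y.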

Report Structure

Your report must contain the following four sections:

1 Introduction (0.5 pages)

  • Provide a brief project background so that the reader of your report can understand the general problem that you are solving.

  • Motivate your research question.

  • State the aim of your project.

  • Provide a short summary of each of the rest of the sections in your report (e.g., “The report proceeds as follows: Section 2 presents . . . ”).

2 Methodology (2 pages)

  • Define and describe the LBF model.

  • Define and describe the two choices of basis function families being investigated.

  • Describe how the parameter vector β is estimated given the hyperparameter value(s). Mention any potential numerical issues associated with the estimation procedure.

  • Describe how the hyperparameter value(s) can be determined automatically from data (as opposed to manually setting the hyperparameters to arbitrary values).

  • Describe how the performance of the two families of basis functions is compared given the optimal hyperparameter value(s).

3 Empirical Study (2.5 pages)

  • Describe the benchmark datasets used in your study.

  • Describe in detail the procedure that you followed to obtain the empirical results, including any computational challenges that you may have encountered. You may refer to details in Section 2 to avoid repetition in your writing.

  • Present (in a table) the optimal hyperparameter values selected for each dataset and for each basis function family.

  • Discuss the table of selected hyperparameters.

  • Visually present (using plots) the predicted response values under each basis function family for each dataset.

  • Discuss the plots of predicted values.

  • Present (in a table) the test MSE under each basis function family for each dataset.

  • Discuss the table of test MSE values.

4 Conclusion (0.5 pages)

  • Discuss your overall findings / insights.

  • Discuss any limitations of your study.

  • Suggest potential directions of extending your study.
