CSCI-561 - Spring 2024 - Foundations of Artificial Intelligence - Homework 3: Multi-layer Perceptron

1. Assignment Overview

In this homework assignment, you will implement a multi-layer perceptron (MLP) and use it to solve a classification task on real-world data from the New York housing market. Your algorithm must be implemented from scratch, using no external libraries other than NumPy (or similar); machine learning libraries are NOT allowed (e.g., Sklearn, TensorFlow, PyTorch, etc.). This dataset is publicly available, but we have modified it, so use only the data we provide.

2. Grading

Your final score for this project will be a combination of the following two items:

1. (70%) Primary prediction task: the bulk of your homework score will be determined by your model's performance relative to a benchmark model on the central classification task (see Section 4 for task details). In particular, the test accuracy of your model will be compared to the test accuracy of our model, and your grade will be determined by the following rubric: for each 10% bump in accuracy your model achieves relative to ours, your score for this section (which itself is worth 70% of the HW grade) increases by 20%, so long as you have reached at least 50% of the baseline accuracy. The scoring is tiered in this way because of the relative difficulty of reaching each performance level; achieving the first 50% of baseline accuracy is about as hard as the final 10%, for instance. Your model will be scored according to the highest category for which it is eligible.

For clarity, the following is an example of how a submission will be graded:

- Ten different train/test splits of the New York Housing dataset are generated. Then, for each split:
- Both your submitted model and our baseline model are trained, from scratch, on the training set.
- Both models are then evaluated on the test set, producing the test classification accuracies A for your model and B for our baseline.
- The percentage of baseline accuracy for your model is then determined by calculating A / B (see the sketch after this list). For example, if your model reached 70% classification accuracy and our baseline reached 85%, your relative score is 70 / 85 = 0.824, or 82.4% of baseline accuracy.
- After all ten train/test splits are processed, your average percentage of baseline accuracy is computed.
- Your average relative accuracy is mapped to its corresponding rubric score. In the example, 80 <= 82.4 < 90, so the submission would receive a score of 80% for this grading section (which, again, is worth 70% of the final HW grade).
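To make the arithmetic concrete, here is a minimal Python/NumPy sketch of the relative-accuracy averaging described above; the function name and the accuracy values are illustrative only and are not part of the actual grading scripts.

    import numpy as np

    def average_relative_accuracy(your_accuracies, baseline_accuracies):
        # Average of A / B over all train/test splits.
        your_accuracies = np.asarray(your_accuracies, dtype=float)
        baseline_accuracies = np.asarray(baseline_accuracies, dtype=float)
        return float(np.mean(your_accuracies / baseline_accuracies))

    # Hypothetical accuracies over ten splits (placeholder numbers):
    yours = [0.70] * 10
    baseline = [0.85] * 10
    print(average_relative_accuracy(yours, baseline))  # ~0.824, i.e. 82.4% of baseline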

Important: feedback prior to deadline. The above grading process describes how we will generate your submission's final score after the deadline has passed. Prior to the deadline, we will provide 5 sample train/test splits, in resources/asnlib/publicdata/dev/ on Vocareum. By default, the first split will be used each time you submit to Vocareum. You can instruct the Vocareum submission script to use another one by adding the following commented-out line anywhere in your code:

- In Python:

    # USE_DATASET_SPLIT x

- In C++ or Java:

    // USE_DATASET_SPLIT x

where x is a number from 1 to 5 (inclusive). All splits are ~70% training / ~30% test data.

2. (30%) Hyperparameter report: you will also receive credit for providing a report.pdf file in your Vocareum submission that documents any hyperparameter exploration you did during development. This file should include the validation scores you obtained for the various network settings you tried, such as the number of layers and nodes per layer (e.g., deep vs. wide networks), learning rate, and activation function (e.g., ReLU, sigmoid, etc.). The format is up to you; this report will be graded by a human. Typically, you would briefly explain which hyperparameter you explored and then, for each value, show the results obtained on the 5 data splits. See Section 6 for guidance.

3. Data Description

You will train your model on data from the New York housing market. This dataset includes 4801 real estate sales in the region, each with the following 17 attributes:

- BROKERTITLE: Title of the broker
- TYPE: Type of the house
- PRICE: Price of the house
- BEDS: Number of bedrooms (the prediction target; see below)
- ADDRESS: Full address of the house
- STATE: State of the house
- MAIN_ADDRESS: Main address information
- ADMINISTRATIVE_AREA_LEVEL_2: Administrative area level 2 information
- LOCALITY: Locality information
- SUBLOCALITY: Sublocality information
- STREET_NAME: Street name
- LONG_NAME: Long name
- FORMATTED_ADDRESS: Formatted address
- LATITUDE: Latitude coordinate of the house
- LONGITUDE: Longitude coordinate of the house

Your goal is to predict the number of bedrooms (the "BEDS" feature) for a given property, using any of the other 16 available features. The dataset will be provided to you in a CSV-like format (see the exact format in Section 4), so you can perform local analyses and design feature transformations as needed (e.g., one-hot encoding certain discrete-valued features).
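For instance, a low-cardinality text column such as TYPE lends itself to one-hot encoding. The following is a minimal NumPy sketch of one way to do this (the example strings are placeholders, not actual dataset values); fitting the category list on the training data and reusing it on the test data keeps the two encodings consistent.

    import numpy as np

    def fit_categories(column):
        # Collect the distinct values seen in a training column.
        return sorted(set(column))

    def one_hot(column, categories):
        # Encode a sequence of strings as an (n_samples, n_categories) 0/1 matrix.
        # Values not seen during training map to an all-zero row.
        index = {c: i for i, c in enumerate(categories)}
        encoded = np.zeros((len(column), len(categories)))
        for row, value in enumerate(column):
            if value in index:
                encoded[row, index[value]] = 1.0
        return encoded

    # Hypothetical usage on the TYPE column:
    train_types = ["Condo for sale", "House for sale", "Condo for sale"]
    categories = fit_categories(train_types)
    X_type = one_hot(train_types, categories)   # shape (3, 2)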

4. Task Description

Your task is to implement a multi-layer neural network learner (see Section 5 for additional details) that will do the following:

1. Construct and train a neural network classifier using the provided labeled training data,
2. Use the learned classifier to classify the unlabeled test inputs,
3. Output the predictions of your classifier on the test data into a file in the same directory,
4. Finish within 5 minutes on Vocareum (for both training your model and making predictions).

For step #1, your program will read from the provided train_data.csv and train_label.csv files in the current directory, providing the input features and output labels, respectively, to use for training. Once your model has finished training, your program must read test_data.csv to obtain the test set data points and produce predictions by evaluating your model for each. Your program will then write a single output.csv file containing the corresponding predictions, one per line.
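A minimal sketch of this file I/O flow is shown below, using only the standard csv module and NumPy; the predict_all placeholder stands in for whatever trained model you build, and the assumption that train_label.csv has a header row is illustrative (adjust the parsing to match the actual files).

    import csv
    import numpy as np

    def read_csv_rows(path):
        # Return (header, rows) from a comma-separated file.
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            rows = [row for row in reader]
        return header, rows

    header, train_rows = read_csv_rows("train_data.csv")
    _, label_rows = read_csv_rows("train_label.csv")     # assumes a header row
    y_train = np.array([int(r[0]) for r in label_rows])

    _, test_rows = read_csv_rows("test_data.csv")

    # ... encode features, train the network, then predict ...
    def predict_all(rows):
        # Placeholder: replace with your trained model's predictions.
        return np.zeros(len(rows), dtype=int)

    predictions = predict_all(test_rows)

    with open("output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for p in predictions:
            writer.writerow([int(p)])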

4a. File structure

Your program must be named homework.py, homework.cpp, or homework.java, as in previous homeworks. It will be compiled and then invoked without any command-line arguments. Three files will be present in the current directory when your program is invoked: train_data.csv and train_label.csv, for the input features and output labels to use for training, and test_data.csv. After training, your program must read test_data.csv to obtain the test set data points and produce predictions in a new file, output.csv.

The format of *_data.csv looks like:

    BROKERTITLE, TYPE, PRICE, ... (16 column labels)
    x1(1), x1(2), ...
    x2(1), x2(2), ...
    ...

The format of *_label.csv, and of your output.csv, looks like:

    y1
    y2
    ...

where yi is the integer-valued label for data point xi. Thus, your output file contains a single column indicating the predicted class label for each unlabeled sample in the input test file, i.e., the predicted number of bedrooms.

The format of your output.csv file is crucial. Your output file must have this name and format so that it can be parsed correctly and compared with the true labels by the auto-grading scripts. This file should be written to your working path.

4b. Implementation constraints

As mentioned above, NumPy is the only external library you may use in your implementation (or an equivalent numerical-computing-only library in non-Python languages). By external we mean outside the standard library (e.g., in Python, random, os, etc. are fine to use). No component of the neural network implementation may leverage a call to an external ML library; you must implement the algorithm yourself, from scratch.

The maximum running time to train and test your model is 5 minutes on Vocareum. The size of the training set will remain roughly fixed across Vocareum evaluations, providing ample opportunity to tune your submission according to its efficiency (e.g., by adjusting the number of training epochs). Here it is particularly important to vectorize your implementation (see Section 6 and the sketch below).
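To illustrate why vectorization matters, the following sketch (array sizes are arbitrary placeholders) compares a per-sample loop against a single batched NumPy matrix multiply; both compute the same affine transform, but the vectorized form is typically far faster.

    import time
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((4096, 64))   # a batch of inputs
    W = rng.standard_normal((64, 32))     # a weight matrix
    b = np.zeros(32)

    # Looped version: one matrix-vector product per sample.
    start = time.perf_counter()
    out_loop = np.stack([x @ W + b for x in X])
    print("loop:      ", time.perf_counter() - start)

    # Vectorized version: a single batched matrix multiply.
    start = time.perf_counter()
    out_vec = X @ W + b
    print("vectorized:", time.perf_counter() - start)

    assert np.allclose(out_loop, out_vec)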


5. Model Description

The model you will implement is a vanilla feed-forward neural network, possibly with many hidden layers (see Figure 2 for a generic depiction). Your network should output a single value, and may have a variable input size depending on your design (e.g., whether you transform any of the 16 provided features). Beyond this, there are no constraints on your model's structure; it is up to you to decide what activation function, number of hidden layers, number of nodes per hidden layer, etc., your model should use.

Figure 2: Diagram of an example neural network with 3 hidden layers.
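To make this structure concrete, here is a minimal NumPy sketch of the forward pass of such a network; the layer sizes, the ReLU activation, and the class name are illustrative choices, not requirements.

    import numpy as np

    class MLP:
        # Fully-connected feed-forward network (forward pass only).

        def __init__(self, layer_sizes, seed=0):
            # layer_sizes, e.g. [n_features, 64, 32, 1]: input, hidden layers, output.
            rng = np.random.default_rng(seed)
            self.weights = [rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
                            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
            self.biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

        def forward(self, X):
            a = X
            for W, b in zip(self.weights[:-1], self.biases[:-1]):
                a = np.maximum(0.0, a @ W + b)              # ReLU on each hidden layer
            return a @ self.weights[-1] + self.biases[-1]   # linear output layer

    # Hypothetical usage: 40 inputs after encoding the 16 raw features.
    model = MLP([40, 64, 32, 1])
    X = np.random.default_rng(1).standard_normal((8, 40))
    y_hat = model.forward(X)                                # shape (8, 1)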

There are many hyperparameters you will likely need to tune to get better performance. These can be hard-coded in your program (possibly after a structured exploration of your hyperparameter space), or selected dynamically through a cross-validation process (in the latter case, be wary of runtime limits). A few example hyperparameters are as follows (a minimal training-loop sketch using them follows this list):

- Learning rate: the step size used to update the weights (e.g., weights = weights - learning_rate * grads); different optimizers use the learning rate in different ways.
- Mini-batch size: the number of samples processed each time before the model is updated. The mini-batch size is some value smaller than the size of the dataset that effectively splits it into smaller chunks during training. Using batches to train your network is highly recommended.
- Number of epochs: the number of complete passes through the training dataset (e.g., if you have 1000 samples, 20 epochs means you loop through these 1000 samples 20 times).
- Number of hidden layers & number of units in each hidden layer: these settings constitute the overall structure of your model. Unnecessarily deep or wide networks may negatively impact your model's performance given the time constraints. Here you need to find a proper tradeoff between feasible time to convergence and model expressivity.
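The following is a minimal sketch of an epoch/mini-batch training loop wired to these hyperparameters; the specific values are placeholders (not recommendations), and train_step is a hypothetical function standing in for your forward pass, backpropagation, and weight update.

    import numpy as np

    # Hypothetical hyperparameter values (placeholders, not recommendations):
    learning_rate = 1e-2
    batch_size = 64
    num_epochs = 20

    rng = np.random.default_rng(0)

    def train_step(X_batch, y_batch, lr):
        # Placeholder: forward pass, backpropagation, and weight update go here.
        pass

    def fit(X_train, y_train):
        # X_train, y_train: NumPy arrays built from train_data.csv / train_label.csv.
        n = X_train.shape[0]
        for epoch in range(num_epochs):
            order = rng.permutation(n)                  # reshuffle each epoch
            for start in range(0, n, batch_size):
                idx = order[start:start + batch_size]
                train_step(X_train[idx], y_train[idx], learning_rate)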

6. Implementation and report.pdf Guidance

Here are a few suggestions you might want to consider during your implementation:

1. Think about how to deal with text data. In the dataset, some entries are numeric (e.g., PRICE), others are text with only a few alternative choices (e.g., TYPE), and others are more open English text (e.g., BROKERTITLE, ADDRESS). You need to think about how to use text-based inputs in your neural network. This will likely require some custom encoding of your choice.
2. Train your model using mini-batches: there are many good reasons to use mini-batches to train your model (instead of individual points or the entire dataset at once), including benefits to performance and convergence.
3. Initialize weights and biases: employ a proper random initialization scheme for your weights and biases. This can have a large impact on your final model.
4. Use backpropagation: you should be using backpropagation along with a gradient descent-based optimization algorithm to update your network's weights during training (see the sketch after this list).
5. Vectorize your implementation: vectorizing your implementation can have a large impact on performance. Use vector/matrix operations when possible instead of explicit programmatic loops.
6. Regularize your model: leverage regularization techniques to ensure your model doesn't overfit the training data and keeps model complexity in check.
7. Plot your learning curve: plotting your train/test accuracy after each epoch is a quick and helpful way to see how your network is performing during training. Here you are allowed to use external plotting libraries, but note that you should remove them prior to submission for performance reasons. The figure on the right shows a generic example of such a plot; your plot(s) may look different.
8. Putting it all together: see Figure 3 for a basic depiction of an example training pipeline. Note that this diagram lacks detail and is only meant to provide a rough outline of how your training loop might look.
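Tying several of these suggestions together (random initialization, a vectorized forward pass, backpropagation, and a gradient-descent update), here is a minimal sketch of one training step for a one-hidden-layer network with a single output and mean-squared-error loss. This is just one possible design under illustrative assumptions (layer sizes, ReLU activation, MSE loss), not the required architecture.

    import numpy as np

    rng = np.random.default_rng(0)
    n_features, n_hidden = 40, 64                        # illustrative sizes

    # He-style random initialization for the ReLU hidden layer.
    W1 = rng.standard_normal((n_features, n_hidden)) * np.sqrt(2.0 / n_features)
    b1 = np.zeros(n_hidden)
    W2 = rng.standard_normal((n_hidden, 1)) * np.sqrt(2.0 / n_hidden)
    b2 = np.zeros(1)

    def train_step(X, y, lr=1e-3):
        # One vectorized forward/backward pass and gradient-descent update on a mini-batch.
        global W1, b1, W2, b2
        n = X.shape[0]

        # Forward pass.
        z1 = X @ W1 + b1
        a1 = np.maximum(0.0, z1)                         # ReLU
        y_hat = (a1 @ W2 + b2).ravel()                   # single output per sample
        loss = np.mean((y_hat - y) ** 2)

        # Backpropagation of the MSE loss.
        d_yhat = (2.0 / n) * (y_hat - y)[:, None]        # shape (n, 1)
        dW2 = a1.T @ d_yhat
        db2 = d_yhat.sum(axis=0)
        d_a1 = d_yhat @ W2.T
        d_z1 = d_a1 * (z1 > 0)                           # ReLU derivative
        dW1 = X.T @ d_z1
        db1 = d_z1.sum(axis=0)

        # Gradient-descent update.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
        return loss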

While recommended, the use of these suggestions in your implementation is not explicitly required. Your grade will be determined by your model's performance as described in the grading section.

7. Academic Honesty and Integrity

All homework material is checked vigorously for dishonesty using several methods. All detected violations of academic honesty are forwarded to the Office of Student Judicial Affairs. To be safe, you are urged to err on the side of caution. Do not copy work from another student or off the web. Keep in mind that sanctions for dishonesty are reflected in your permanent record and can negatively impact your future success. As a general guide:

● Do not copy code or written material from another student. Even single lines of code should not be copied.
● Do not collaborate on this assignment. The assignment is to be solved individually.
● Do not copy code off the web. This is easier to detect than you may think.
● Do not share any custom test cases you may create to check your program's behavior in more complex scenarios.
● Do not post your code on Piazza asking whether or not it is correct. This is a violation of academic integrity because it biases other students who may read your post.
● Do not post test cases on Piazza asking for what the correct solution should be.
● Do ask the professor or TAs if you are unsure about whether certain actions constitute dishonesty. It is better to be safe than sorry.
● DO NOT USE ANY existing machine learning library such as TensorFlow, PyTorch, Scikit-Learn, etc. Violations will result in a penalty to your score.
