
CS 229 Machine Learning - Summer 2019 Problem Set 2: Logistic Regression: Training stability


CS 229, Summer 2019
Problem Set #2

Due Monday, July 29 at 11:59 pm on Gradescope.

Notes: (1) These questions require thought, but do not require long answers. Please be as concise as possible. (2) If you have a question about this homework, we encourage you to post your question on our Piazza forum, at http://piazza.com/stanford/summer2019/cs229. (3) If you missed the first lecture or are unfamiliar with the collaboration or honor code policy, please read the policy on the course website before starting work. (4) For the coding problems, you may not use any libraries except those defined in the provided environment.yml file. In particular, ML-specific libraries such as scikit-learn are not permitted. (5) To account for late days, the due date is Monday, July 29 at 11:59 pm. If you submit after Monday, July 29 at 11:59 pm, you will begin consuming your late days. If you wish to submit on time, submit before Monday, July 29 at 11:59 pm.

All students must submit an electronic PDF version of the written questions. We highly recommend typesetting your solutions via LaTeX. All students must also submit a zip file of their source code to Gradescope, which should be created using the make_zip.py script. You should make sure to (1) restrict yourself to only using libraries included in the environment.yml file, and (2) make sure your code runs without errors. Your submission may be evaluated by the auto-grader using a private test set, or used for verifying the outputs reported in the writeup.

1. [15 points] Logistic Regression: Training stability

In this problem, we will delve deeper into the workings of logistic regression. The goal of this problem is to help you develop your skills debugging machine learning algorithms (which can be very different from debugging software in general).

We have provided an implementation of logistic regression in src/stability/stability.py, and two labeled datasets A and B in src/stability/ds1_a.csv and src/stability/ds1_b.csv. Please do not modify the code for the logistic regression training algorithm for this problem. First, run the given logistic regression code to train two different models on A and B. You can run the code by simply executing python stability.py in the src/stability directory.

(a) [2 points] What is the most notable difference in training the logistic regression model on datasets A and B?

(b) [5 points] Investigate why the training procedure behaves unexpectedly on dataset B, but not on A. Provide hard evidence (in the form of math, code, plots, etc.) to corroborate your hypothesis for the misbehavior. Remember, you should address why your explanation does not apply to A.
Hint: The issue is not a numerical rounding or over/underflow error.

(c) [5 points] For each of these possible modifications, state whether or not it would lead to the provided training algorithm converging on datasets such as B. Justify your answers.
i. Using a different constant learning rate.
ii. Decreasing the learning rate over time (e.g. scaling the initial learning rate by $1/t^2$, where $t$ is the number of gradient descent iterations thus far).
iii. Linear scaling of the input features.
iv. Adding a regularization term $\|\theta\|_2^2$ to the loss function.
v. Adding zero-mean Gaussian noise to the training data or labels.

(d) [3 points] Are support vector machines vulnerable to datasets like B? Why or why not? Give an informal justification.
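
To make the notation in modifications (c)ii and (c)iv concrete, the following is a minimal sketch of a step size scaled by 1/t^2 and of a squared L2 penalty. The names `decayed_learning_rate`, `l2_penalty`, and `strength`, and the starting rate of 10.0, are hypothetical and are not taken from stability.py. The sketch only evaluates the two quantities and makes no claim about whether either change leads to convergence.

```python
def decayed_learning_rate(initial_rate, t):
    """Return initial_rate / t**2 for iteration t = 1, 2, 3, ..."""
    if t < 1:
        raise ValueError("t counts iterations starting from 1")
    return initial_rate / (t ** 2)


def l2_penalty(theta, strength):
    """Return strength * ||theta||_2^2 for a list of parameters theta."""
    return strength * sum(w * w for w in theta)


# Step sizes for the first four iterations with a hypothetical initial rate.
rates = [decayed_learning_rate(10.0, t) for t in range(1, 5)]

# Penalty for a hypothetical parameter vector (3, 4) with strength 0.5.
penalty = l2_penalty([3.0, 4.0], 0.5)
```

The penalty would be added to the loss being minimized, so its gradient, 2 * strength * theta, would be added to the loss gradient.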


2. [22 points] Spam classification

In this problem, we will use the naive Bayes algorithm and an SVM to build a spam classifier. In recent years, spam on electronic media has been a growing concern. Here, we'll build a classifier to distinguish between real messages and spam messages. For this class, we will be building a classifier to detect SMS spam messages. We will be using an SMS spam dataset developed by Tiago A. Almeida and José María Gómez Hidalgo, which is publicly available at http://www.dt.fee.unicamp.br/~tiago/smsspamcollection

We have split this dataset into training and testing sets and have included them in this assignment as src/spam/spam_train.tsv and src/spam/spam_test.tsv. See src/spam/spam_readme.txt for more details about this dataset. Please refrain from redistributing these dataset files. The goal of this assignment is to build a classifier from scratch that can tell the difference between spam and non-spam messages using the text of the SMS message.


(a) [5 points] Implement code for processing the spam messages into numpy arrays that can be fed into machine learning models. Do this by completing the get_words, create_dictionary, and transform_text functions within our provided src/spam/spam.py. Note the corresponding comments for each function for instructions on what specific processing is required. The provided code will then run your functions and save the resulting dictionary into spam_dictionary and a sample of the resulting training matrix into spam_sample_train_matrix.

In your writeup, report the vocabulary size after the pre-processing step. You do not need to include any other output for this subquestion.
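
As a rough illustration of what turning messages into a count matrix involves, here is a minimal sketch. The tokenization rule (lowercase, split on whitespace) and the `min_messages` threshold are assumptions made for illustration. The required processing is specified in the comments of the provided code, which are not reproduced here. The names `tokenize`, `build_vocabulary`, and `count_matrix` are hypothetical.

```python
from collections import Counter


def tokenize(message):
    """Lowercase a message and split it on whitespace (an assumed rule)."""
    return message.lower().split()


def build_vocabulary(messages, min_messages=1):
    """Map each word seen in at least min_messages messages to a column index."""
    message_counts = Counter()
    for message in messages:
        # Use a set so each word is counted at most once per message.
        message_counts.update(set(tokenize(message)))
    kept = sorted(w for w, c in message_counts.items() if c >= min_messages)
    return {word: index for index, word in enumerate(kept)}


def count_matrix(messages, vocabulary):
    """Return one row per message; entry j counts occurrences of word j."""
    rows = []
    for message in messages:
        row = [0] * len(vocabulary)
        for word in tokenize(message):
            if word in vocabulary:
                row[vocabulary[word]] += 1
        rows.append(row)
    return rows


# Toy messages invented for this example.
messages = ["Free prize free entry", "see you at lunch", "free lunch"]
vocabulary = build_vocabulary(messages, min_messages=2)
matrix = count_matrix(messages, vocabulary)
```

Plain lists keep the sketch dependency-free. Wrapping the result as `numpy.array(matrix)` would give the array form the problem asks for.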

(b) [10 points] In this question you are going to implement a naive Bayes classifier for spam classification with the multinomial event model and Laplace smoothing (refer to class notes on Naive Bayes for details on Laplace smoothing in Section 2.3 of notes2.pdf).

Code your implementation by completing the fit_naive_bayes_model and predict_from_naive_bayes_model functions in src/spam/spam.py.

Now src/spam/spam.py should be able to train a Naive Bayes model, compute your prediction accuracy and then save your resulting predictions to spam_naive_bayes_predictions. In your writeup, report the accuracy of the trained model on the test set.

Remark. If you implement naive Bayes the straightforward way, you will find that the computed $p(x \mid y) = \prod_j p(x_j \mid y)$ often equals zero. This is because $p(x \mid y)$, which is the product of many numbers less than one, is a very small number. The standard computer representation of real numbers cannot handle numbers that are too small, and instead rounds them off to zero. (This is called "underflow.") You'll have to find a way to compute Naive Bayes' predicted class labels without explicitly representing very small numbers such as $p(x \mid y)$. [Hint: Think about using logarithms.]
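
The underflow described in the remark is easy to reproduce. The sketch below uses 400 made-up per-token probabilities of 0.001 each. Their direct product is 1e-1200, which is smaller than any positive double and therefore evaluates to 0.0, while the sum of their logarithms stays finite. The helper `more_likely_class` is a hypothetical name showing how two classes could be compared in log space. It is not the assignment's required interface.

```python
import math

# 400 hypothetical per-token probabilities, each well below one.
token_probabilities = [1e-3] * 400

# Direct product: the true value 1e-1200 is not representable, so this is 0.0.
direct_product = math.prod(token_probabilities)

# Sum of logs: the same quantity in log space is an ordinary finite number.
log_product = sum(math.log(p) for p in token_probabilities)


def more_likely_class(log_prior_0, log_likelihood_0, log_prior_1, log_likelihood_1):
    """Compare two classes by log prior + log likelihood; ties go to class 0."""
    score_0 = log_prior_0 + log_likelihood_0
    score_1 = log_prior_1 + log_likelihood_1
    return 1 if score_1 > score_0 else 0
```

Because the logarithm is monotonic, comparing log scores picks the same class as comparing the probabilities themselves would, without ever forming the tiny product.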

(c) [5 points] Intuitively, some tokens may be particularly indicative of an SMS being in a particular class. We can try to get an informal sense of how indicative token $i$ is for the SPAM class by looking at:

$$\log \frac{p(x_j = i \mid y = 1)}{p(x_j = i \mid y = 0)} = \log \left( \frac{P(\text{token } i \mid \text{email is SPAM})}{P(\text{token } i \mid \text{email is NOTSPAM})} \right)$$

Complete the get_top_five_naive_bayes_words function within the provided code using the above formula in order to obtain the 5 most indicative tokens. Report the top five words in your writeup.
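
One way to rank tokens, assuming the measure is the log ratio of a token's probability under the spam class to its probability under the non-spam class, is sketched below. The function name `top_indicative_tokens`, the argument names, and the probability values are all hypothetical. In practice the probabilities would come from a fitted model with Laplace smoothing, so that none of them is zero.

```python
import math


def top_indicative_tokens(phi_spam, phi_ham, vocabulary, k=5):
    """Return the k words with the largest log(phi_spam[i] / phi_ham[i])."""
    index_to_word = {index: word for word, index in vocabulary.items()}
    # Subtracting logs avoids forming the ratio of two small numbers directly.
    scores = [math.log(phi_spam[i]) - math.log(phi_ham[i])
              for i in range(len(phi_spam))]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [index_to_word[i] for i in ranked[:k]]


# Made-up values for illustration only.
vocabulary = {"free": 0, "lunch": 1, "prize": 2}
phi_spam = [0.5, 0.1, 0.4]  # hypothetical P(token | spam)
phi_ham = [0.2, 0.7, 0.1]   # hypothetical P(token | not spam)
top_two = top_indicative_tokens(phi_spam, phi_ham, vocabulary, k=2)
```

Here "prize" ranks first because its ratio 0.4 / 0.1 = 4 exceeds the ratio 0.5 / 0.2 = 2.5 for "free", even though "free" is the more frequent token in spam.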

(d) [2 points] Support vector machines (SVMs) are an alternative machine learning model that we discussed in class. We have provided you an SVM implementation (using a radial basis function (RBF) kernel) within src/spam/svm.py (you should not need to modify that code).

One important part of training an SVM parameterized by an RBF kernel (a.k.a. Gaussian kernel) is choosing an appropriate kernel radius parameter.

Complete the compute_best_svm_radius function by writing code to compute the best SVM radius, the one which maximizes accuracy on the validation dataset. Report the best kernel radius you obtained in the writeup.
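
Choosing the radius by validation accuracy amounts to a one-dimensional search over candidate values. The sketch below shows only that selection loop. `best_radius` and `validation_accuracy` are hypothetical names, and the accuracy table is invented to stand in for training and scoring an SVM. It does not call the provided svm.py, whose interface is not described here, and the numbers are not results.

```python
def best_radius(candidate_radii, validation_accuracy):
    """Return the radius with the highest validation accuracy (first wins ties)."""
    best, best_accuracy = None, float("-inf")
    for radius in candidate_radii:
        accuracy = validation_accuracy(radius)
        if accuracy > best_accuracy:
            best, best_accuracy = radius, accuracy
    return best


# Invented numbers standing in for "train with this radius, then score on the
# validation set". They are not measurements.
fake_accuracies = {0.01: 0.87, 0.1: 0.96, 1.0: 0.93, 10.0: 0.90}
chosen = best_radius([0.01, 0.1, 1.0, 10.0], fake_accuracies.get)
```

Selecting on a validation set rather than the test set keeps the reported test accuracy from being biased by the choice of radius.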
