CSE 158, CSE 258, DSC 256, MGTA 461 Web Mining and Recommender Systems, Fall 2023 : Assignment 1: Video Game Prediction

CSE 158/258, DSC 256, MGTA 461, Fall 2023: Assignment 1 Instructions

In this assignment you will build recommender systems to make predictions related to video game reviews from Steam.

Submissions will take the form of prediction files uploaded to Gradescope, where their test set performance will be evaluated on a leaderboard. Most of your grade will be determined by ‘absolute’ cutoffs; the leaderboard ranking will only determine enough of your assignment grade to make the assignment FUN.

The assignment is due Monday, Nov 20, though make sure you upload solutions to the leaderboard regularly.

You should submit two files:

writeup.txt A brief, plain-text description of your solutions to each task. Please prepare this well in advance of the submission deadline. It is only intended to help us follow your code and does not need to be detailed.

assignment1.py A Python file containing working code for your solutions. The autograder will not execute your code; this file is required so that we can assign partial grades in the event of incorrect solutions, check for plagiarism, etc. Your solution should clearly document which sections correspond to each task. We may occasionally run code to confirm that your outputs match submitted answers, so please ensure that your code generates the submitted answers. [1]

Along with two files corresponding to your predictions:

predictions_Played.csv, predictions_Hours.csv Files containing your predictions for each (test) instance (you should submit two of the above three files). The provided baseline code demonstrates how to generate valid output files.

To begin, download the files for this assignment from:

Files

train.json.gz 175,000 instances to be used for training. This data should be used for both the ‘play prediction’ and ‘time played prediction’ tasks. It is not necessary to use all observations for training, for example if doing so proves too computationally intensive.

userID The ID of the user. This is a hashed user identifier from Steam.
gameID The ID of the game. This is a hashed game identifier from Steam.
text Text of the user’s review of the game.
date Date when the review was entered.
hours How many hours the user played the game.
hours_transformed log2(hours+1). This transformed value is the one we are trying to predict.

pairs_Played.csv Pairs on which you are to predict whether a game was played.
pairs_Hours.csv Pairs (userIDs and gameIDs) on which you are to predict time played.
baselines.py A simple baseline for each task, described below.
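The relationship between hours and the transformed target can be checked with a short sketch. The function names and sample values below are made up for illustration; they are not taken from train.json.gz or baselines.py.

```python
import math

def transform_hours(hours):
    """Map raw hours played to the prediction target log2(hours + 1)."""
    return math.log2(hours + 1)

def inverse_transform(value):
    """Recover raw hours from a transformed value."""
    return 2 ** value - 1

# Hypothetical hours values, chosen only for illustration.
for h in [0, 1, 3, 7]:
    print(h, transform_hours(h))  # 0 -> 0.0, 1 -> 1.0, 3 -> 2.0, 7 -> 3.0
```

Adding 1 before taking the logarithm keeps the target defined (and equal to 0) when a user has played for 0 hours.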

Please do not try to collect these reviews from Steam, or to reverse-engineer the hashing function I used to anonymize the data. Doing so will not be easier than successfully completing the assignment. We will run the code of any solution suspected of violating the competition rules, and you may be penalized if your code does not produce your submitted solution.

[1] Don’t worry too much about dependencies if importing non-standard libraries.

You are expected to complete the following tasks:

Play prediction Given a (user, game) pair from ‘pairs_Played.csv’, predict whether the user would play the game (0 or 1). Accuracy will be measured in terms of the categorization accuracy (fraction of correct predictions). The test set has been constructed such that exactly 50% of the pairs correspond to played games and the other 50% do not.

Time played prediction Predict how long a person will play a game (transformed as log2(hours + 1)) for those (user, game) pairs in ‘pairs_Hours.csv’. Accuracy will be measured in terms of the mean-squared error (MSE).
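The two evaluation measures named above can be sketched in a few lines. This is an illustration under the definitions given in the task descriptions, not the grader's actual code; the function names and toy values are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the 0/1 labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

def mse(predictions, targets):
    """Mean of squared differences between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(targets)

# Toy values, made up for illustration.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))   # 0.75
print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))  # (0 + 0.25 + 1) / 3
```

Because the play-prediction test set is balanced, a predictor that always outputs 1 (or always 0) scores an accuracy of exactly 0.5, which is a useful sanity check for any model.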

A competition page has been set up on Kaggle to keep track of your results compared to those of other members of the class. The leaderboard will show your results on half of the test data, but your ultimate score will depend on your predictions across the whole dataset.

Grading and Evaluation

This assignment is worth 22% of your grade. You will be graded on the following aspects. Each of the two tasks is worth 10 marks (i.e., 10% of your grade), plus 2 marks for the written report.

Your ability to obtain a solution which outperforms the leaderboard baselines on the unseen portion of the test data (5 marks for each task). Obtaining full marks requires a solution which is substantially better than baseline performance.

Your ranking for each of the tasks compared to other students in the class (3 marks for each task).

Obtain a solution which outperforms the baselines on the seen portion of the test data (i.e., the leaderboard). This is a consolation prize in case you overfit to the leaderboard (2 marks for each task).

Finally, your written report should describe the approaches you took to each of the tasks. To obtain good performance, you should not need to invent new approaches (though you are more than welcome to!) but rather you will be graded based on your decision to apply reasonable approaches to each of the given tasks (2 marks total).

Baselines

Simple baselines have been provided for each of the tasks. These are included in ‘baselines.py’ among the files above. They are mostly intended to demonstrate how the data is processed and prepared for submission to Gradescope. These baselines operate as follows:

Play prediction Find the most popular games that account for 50% of interactions in the training data. Return ‘1’ whenever such a game is seen at test time, ‘0’ otherwise.

Time played prediction Return the global average time, or the user’s average if we have seen them before in the training data.
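The logic of both baselines can be sketched roughly as follows. This is not the contents of ‘baselines.py’; it assumes the training data has already been loaded into a list of (userID, gameID, hours_transformed) tuples, and the function names, the tie-breaking order, and the toy data are all hypothetical.

```python
from collections import defaultdict

def popular_games(interactions, fraction=0.5):
    """Most popular games that together cover `fraction` of interactions."""
    counts = defaultdict(int)
    for _user, game, _t in interactions:
        counts[game] += 1
    chosen, covered = set(), 0
    # Walk games from most to least popular until the coverage is reached.
    for game, c in sorted(counts.items(), key=lambda kv: -kv[1]):
        chosen.add(game)
        covered += c
        if covered >= fraction * len(interactions):
            break
    return chosen

def predict_played(game, popular):
    """Return 1 if the game is in the popular set, 0 otherwise."""
    return 1 if game in popular else 0

def fit_time_baseline(interactions):
    """Compute the global average and each user's average target."""
    per_user = defaultdict(list)
    for user, _game, t in interactions:
        per_user[user].append(t)
    global_avg = sum(t for _u, _g, t in interactions) / len(interactions)
    user_avg = {u: sum(v) / len(v) for u, v in per_user.items()}
    return global_avg, user_avg

def predict_time(user, global_avg, user_avg):
    """User's average if seen in training, else the global average."""
    return user_avg.get(user, global_avg)

# Toy training data, made up for illustration.
train = [("u1", "g1", 2.0), ("u1", "g2", 4.0), ("u2", "g1", 1.0),
         ("u3", "g1", 3.0), ("u3", "g3", 0.0)]
popular = popular_games(train)
global_avg, user_avg = fit_time_baseline(train)
print(predict_played("g1", popular), predict_played("g2", popular))  # 1 0
print(predict_time("u1", global_avg, user_avg))       # 3.0
print(predict_time("unseen", global_avg, user_avg))   # 2.0
```

In the toy data, game g1 alone accounts for 3 of the 5 interactions, so it is the only game in the popular set.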

Running ‘baselines.py’ produces files containing predicted outputs (these outputs can be uploaded to Gradescope). Your submission files should have the same format.
