MGSC 416 Data-driven Models for Operations Analytics - Problem Set 6: Multi-armed Bandit

MGSC 416, Winter 2023 Data-driven Models for Operations Analytics

Problem Set 6 – Individual Assignment

Consider a pharmaceutical company that is developing a drug to decrease cholesterol levels. The company has developed three prototypes, from which it wishes to choose one to take to market. We model the drug-testing trial as a multi-armed bandit problem: each prototype is an arm that we "pull" when we test that prototype on a patient. For each patient i, we choose a prototype and observe the "reward", a binary value that is 1 if the drug works on the patient and 0 if it does not. In this problem, you will implement multi-armed bandit algorithms to maximize the total reward. You may adapt the code given in class to this assignment.
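To make the setting concrete, here is a minimal sketch of the ε-greedy algorithm run against an offline reward matrix of the kind the assignment's CSV files provide (one row per patient, one column per prototype, only the chosen arm's entry revealed). The data below is synthetic and the function name is illustrative, not part of the assignment; the ε-decreasing variant simply replaces the fixed ε with a schedule such as ε/(i+1) at patient i.

```python
import random

def epsilon_greedy(rewards, epsilon, seed=0):
    """Run epsilon-greedy over an offline reward matrix.

    rewards[i][j] is 1 if prototype j would work on patient i, else 0.
    Only the chosen arm's entry is observed, mirroring the bandit setting.
    """
    rng = random.Random(seed)
    n_arms = len(rewards[0])
    counts = [0] * n_arms       # pulls per arm
    means = [0.0] * n_arms      # running mean reward per arm
    total = 0
    for row in rewards:
        if rng.random() < epsilon:                       # explore
            arm = rng.randrange(n_arms)
        else:                                            # exploit current best
            arm = max(range(n_arms), key=lambda j: means[j])
        r = row[arm]                                     # observe one entry only
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]     # incremental mean update
        total += r
    return total

# Toy data (NOT the assignment's data): arm 2 is best, with p ≈ 0.8.
rng = random.Random(42)
data = [[int(rng.random() < p) for p in (0.3, 0.5, 0.8)] for _ in range(250)]
print(epsilon_greedy(data, epsilon=0.1))
```

In the assignment you would replace the toy matrix with the rows of Training.csv and sweep ε over the candidate grid, recording the total reward for each value.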

1. We consider three algorithms: the ε-greedy, the ε-decreasing, and the Thompson-sampling algorithms. For each algorithm, which parameters do we need to select? (3 pts)
2. For various values of the parameters from the previous question (for example, ε ∈ {0.01, 0.02, …, 0.6}), run the ε-greedy, the ε-decreasing, and the Thompson-sampling algorithms on the training dataset, Training.csv, which represents simulated test scores from 250 adults aged 25-35. Assume that each drug j is effective on every member of this test population independently with probability pj, which is unknown. Report the sum of rewards that you get from the three strategies, and thus select the best parameter values for each algorithm. (14 pts) Note: for Thompson sampling, assume a prior distribution of Beta(1,1) (uniform on [0,1]) for the parameters of each arm. This allows you to update the posterior distribution as follows: if your prior is Beta(a,b), then the posterior distribution is Beta(a+1,b) for an observed success and Beta(a,b+1) for an observed failure.
3. Using the parameters learned from the previous question, test the three multi-armed bandit strategies on the first test dataset, Test1.csv, which represents simulated successes or failures from 100 adults aged 25-35, drawn from a population similar to the training data. (That is, for each test subject, each strategy should only access the one result in that row of the data corresponding to the prototype selected for that subject.) Report the sum of rewards that you get from the three strategies. (8 pts)
4. Now instead consider the second dataset, Test2.csv, which is constructed by taking the same 100 simulated test scores from adults aged 25-35 and appending an additional 100 simulated test scores, this time from adults aged 55-75. Using the same parameters as in Question 3, test the three strategies on this new dataset. Report the sum of rewards that you get from the three strategies. How does your answer differ from the results on the first dataset? Explain these differences. (10 pts)
5. Which drug prototype would you recommend for further development and eventual release to market? (5 pts)
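The Beta-posterior update described in the note to Question 2 can be sketched as follows. This is an illustrative implementation under the stated Beta(1,1) prior, not the course's reference code; the toy data and function name are assumptions for demonstration.

```python
import random

def thompson_sampling(rewards, seed=0):
    """Thompson sampling with an independent Beta(1,1) prior per arm.

    For each patient, sample p_j ~ Beta(a_j, b_j) for every arm j, pull
    the arm with the largest sample, and update only that arm's posterior:
    observed success -> a += 1, observed failure -> b += 1.
    """
    rng = random.Random(seed)
    n_arms = len(rewards[0])
    a = [1] * n_arms   # Beta parameter a: 1 + successes so far
    b = [1] * n_arms   # Beta parameter b: 1 + failures so far
    total = 0
    for row in rewards:
        samples = [rng.betavariate(a[j], b[j]) for j in range(n_arms)]
        arm = max(range(n_arms), key=lambda j: samples[j])
        r = row[arm]                 # observe only the chosen arm
        if r == 1:
            a[arm] += 1              # posterior Beta(a+1, b) on success
        else:
            b[arm] += 1              # posterior Beta(a, b+1) on failure
        total += r
    return total, a, b

# Toy data (NOT the assignment's data): arm 2 is best, with p ≈ 0.8.
rng = random.Random(7)
data = [[int(rng.random() < p) for p in (0.3, 0.5, 0.8)] for _ in range(250)]
total, a, b = thompson_sampling(data)
print(total)
```

After the run, the posterior means (a[j] / (a[j] + b[j])) estimate each arm's success probability, which is one natural basis for the recommendation asked for in Question 5.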
