MSBA7021 Prescriptive Analytics
Assignment 1
1. La Quinta Motor Inns is a mid-sized hotel chain headquartered in San Antonio, Texas. They are looking to expand to more locations, and know that selecting good sites is crucial to a hotel chain’s success. Of the four major marketing considerations (price, product, promotion, and location), location has been shown to be one of the most important considerations for multi-site firms.
Hotel chain owners who can pick good sites quickly have a distinct competitive advantage, since they are competing against other chains for the same sites. La Quinta used data on 57 existing inn locations to build a linear regression model to predict Profitability, computed as the operating margin, or earnings before interest and taxes divided by total revenue. They tried many independent variables, such as Number of Hotel Rooms in the Vicinity and Age of the Inn. All independent variables were normalized to have mean zero and standard deviation one.
The final regression model is given by:
Profitability = 39.05 − 5.41 × State Population per Inn + 5.86 × Price of the Inn
− 3.09 × Square Root of the Median Income of the Area
+ 1.75 × College Students in the Area
In this problem, we will use this regression model together with integer optimization to select the most profitable sites for La Quinta.
(a) Let us start by understanding the regression equation. (i) According to the regression equation given above, which variables positively affect Profitability? Which variables negatively affect Profitability? Does this intuitively make sense to you? (ii) La Quinta routinely uses this regression equation to predict Profitability, and to screen potential real estate acquisitions. Suppose that La Quinta is looking to expand their locations in California, and has collected data for 16 different potential sites. This data is available in the file SelectingHotels.csv. For each hotel, we have the location, the price, and the value of each of the independent variables used in the regression equation. Using the regression equation and the four normalized independent variables, what is the predicted Profitability of hotel 1? (iii) Now use the regression equation and the data to compute the predicted Profitability for all hotels. Which hotel has the highest predicted Profitability? How about the lowest?
(b) La Quinta has a budget of $10 million to spend on hotels. Suppose we used a “greedy” approach where we selected the most profitable hotels until we ran out of budget. So we would start by buying the hotel we predict to be the most profitable, and then if we had enough budget left, we would buy the hotel we predict to be the second most profitable, and so on.
(i) Describe what we would do with this approach. Which hotels would we purchase? (ii) What would our total predicted Profitability be? (iii) If we are trying to maximize our total predicted Profitability, is this a good approach? How about if we were trying to maximize the average predicted Profitability of the hotels we select?
How about if we had a budget of $20 million instead of $10 million?
(c) Now, let us build an optimization model to select hotels. We are trying to decide whether or not a hotel should be purchased, while making sure we spend no more than our budget of $10 million. Our objective is to maximize the total predicted Profitability. (i) Write out the optimization problem we are trying to solve. Make sure to describe the decision variables, the objective, and the constraints. (ii) Now solve your optimization model. What is the optimal solution? Report the values of the decision variables and the optimal objective function value. (iii) Does the optimal solution make sense intuitively? How does it compare to the greedy solution?
(d) La Quinta thinks that buying too many hotels in one city is probably not a good idea, and would prefer to diversify across as many cities as possible. Add constraint(s) to your model to limit the number of hotels purchased in any city to at most 2. (i) What are the constraints that you need to add to the model? Intuitively, do you expect the new optimal objective function value to be larger, smaller, or the same as before? (ii) Solve the new model. What is the optimal solution? Give the values of the decision variables and the optimal objective function value. How does this compare to the previous solution?
(e) In no more than one paragraph, describe how you would present your results to La Quinta. Do you have any recommendations for them to improve the regression model? How about to improve the optimization model?
2. The Salanter Akiba Riverside (SAR) Academy is a coeducational, private Modern Orthodox Jewish day school located in New York City. Every summer, the SAR Academy must create class assignments for their elementary school students.
Each grade of 80 to 100 students must be divided into different classes. Requests for assignments are made by parents, teachers, and school therapists. These requests include paris of students that should be placed together, pairs of students that should not be placed together, pairs of students that should not be placed together, and requests for students to be placed in classes that better suit their unique or special academic needs. These requests often conflict with each other, and it falls on the administration to prioritize which requests should be fulfilled over others.
Here, we will solve a simplified version of the problem faced by the SAR Academy with 100 students and four classes. The parents or guardians of each of the 100 students are asked to submit preferences for class 1 to class 4. These preferences often depend on the teaching style of the teachers, the teachers older siblings have had in the past, and characteristics of the class. The parents give a ranking of 1 to the class they prefer (their first choice), and a ranking of 2 to their second choice, 3 to their third choice and 4 to their fourth choice. In addition, each student’s parents are asked to request four students. In some cases, the data set shows fewer requests, which is usually due to administrators eliminating some parents’ requests before we got the data. Some parents will request more than four students, which is accepted by the school, since that makes it easier to guarantee at least one request. Some parents will make no requests, which gives even more leeway in forming classes.
This data, as well as the gender of each of the students, is given in the spreadsheet ClassAssignment.csv. In the following questions, students are referred by the StudentNumber. (a) The problem faced by the SAR Academy is to decide which students should be assigned to which classes, to satisfy as many of the parent preferences as possible. Each student must be assigned to exactly one class, and the size of each class should be between 23 and 27.
(i) Formulate this problem as an integer optimization problem. Describe the decision variables, objective function, and constraints. (ii) Solve this optimization problem. What is the optimal solution? Give the values of the decision variables and the optimal objective function value. (iii) How many students will be assigned to their first choice class, according to the parent preferences? What are the number of students who receive the second, third and fourth choice class? (b) After looking at the optimal solution, the SAR Academy decided that they would like to adjust the formulation to better balance the boys-to-girls ratio in the classes. They would like to limit that ratio in each class to be no less than 0.6 but no more than 0.8. (i) What constraint(s) do you need to add to your model to incorporate this adjustment? (ii) Add the necessary constraints and re-solve your model. What is the optimal solution now? How does it compare to the previous solution? (iii) How many students will be assigned to their first choice class, according to the parent preferences? What are the number of students who receive the second, third and fourth choice class? (iv) How many boys and girls are there in each class?
(c) Now, we will add some logical constraints to capture additional preferences of parents, teachers, and school therapists. The following sub-questions are only based on (a). Please add the corresponding constraints and solve the model, and comment on how they are satisfied.
(i) Students 15 and 19, 60 and 68 are two pairs of twins, and the school has a policy that twins must be placed in different classes. What constraint(s) needs to be added to the model to implement this policy? (ii) Students 4, 8, 12, 25, 30, 36, 41, 48, 55, 59, 64, 69, 73, 89, 90 and 96 are all from the same neighborhood. They school would like to put at least 3 students from this neighborhood in each class. What constraint(s) needs to be added to the model to implement this policy? (iii) The school therapists strong recommend that students 1, 4 and 95 are placed in the same class, that student 3 is placed in class 1, that student 40 are placed in class 2, and that students 80 and 99 are placed in class 3. What constraint(s) needs to be added to the model to implement this policy?
(d) As for friend requests, 9 students made no requests for friends, 5 made only one request, 10 made two, 16 made three, 52 made 4 and 8 made 5 requests. The school want to guarantee that each student will be placed with at least one child from a list of up to five children provided by that student.
(i) What constraint(s) do you need to add to your model to incorporate this adjustment? (ii) Add the necessary constraints and re-solve your model. What is the optimal solution now? How does it compare to the previous solution? (iii) How many students will be assigned to their first choice class, according to the parent preferences? What are the number of students who receive the second, third and fourth choice class? (iv) How many students are assigned to the same class with more than one friend? And with how many friends are there?
(e) Add all the constraints involved in (a) to (d), including the constraints related to the class size, the boys-to-girls ratio, twin students, neighborhood considerations, recommendations given by therapists and friend requests from students, and re-solve the model. (i) What is the optimal solution now? How does it compare to the previous solution? (ii) How many students will be assigned to their first choice class, according to the parent preferences? What are the number of students who receive the second, third and fourth choice class?
3. In the second lecture note on “Portfolio Optimization”, we have implicitly assumed that the assets available are all risky; that is, they each have a positive standard deviation. A risk-free asset has a return that is deterministic (that is, known with certainty) and therefore has a zero standard deviation. In other words, a risk-free asset is a pure interest-bearing instrument; its inclusion in a portfolio corresponds to lending or borrowing cash at the risk-free rate. Lending (such as the purchase of a bond) corresponds to the risk-free asset having a positive weight, whereas borrowing corresponds to its having a negative weight.
The inclusion of a risk-free asset in the list of possible assets is necessary to obtain realism. Investors invariably have the opportunity to borrow or lend. It turns out that inclusion of a risk-free asset greatly simplifies the shape of the efficient frontier.
Suppose that in addition to the twenty stocks in the lecture notes, there is a risk-free asset with daily risk-free rate 6 × 10−5 . Using the same data as in StockPrices.csv, answer the following questions. (a) Consider the Markowitz model. Suppose that short selling the stocks is allowed, and that both borrowing and lending the risk-free asset are allowed. Plot the efficient frontier and compare it with the efficient frontier when the risk-free asset is not available. (b) Consider the mean-absolute deviation framework. Suppose that short selling the stocks is not allowed, and only lending the risk-free asset is allowed. Plot the efficient frontier and compare it with the efficient frontier when the risk-free asset is not available.