Operation Management Case/ Statistics


Instructions for Lab 8
EBTM 350, Fall 2022, Professor: Xiaorui Zhu, Ph.D.
Purpose: I hope that by doing this lab, you will get a taste of how to prepare your final project, as well
as (1) practice your R coding; (2) get an idea of the entire process (Exploratory Data Analysis,
regression modeling, data mining, interpreting results, etc.); and (3) realize that data
understanding and result interpretation are not trivial tasks, and they are extremely important in
business projects. Submit a Word file with all the following sections and results well organized.
Requirements:
1. (5 points) Write an interesting story in the first section (Background or Introduction). In our class case, we
want to forecast the median home value in the Boston area (or the wine-tasting preference), but I will
refrain from giving you a specific context. Instead, think about the situations in which you might need to
forecast the median home price for your business or clients, and write down your story.
a. For example, you can pretend you are a real estate buyer's agent. Your task is to recommend a
reasonable offer to your client (the buyer) for the house they are interested in.
So, you want to predict the median house price in a certain area from its attributes so that your client can
use it as a reference when making the offer.
b. Another simple example: From an analyst’s perspective, I want to understand how the predictors are
associated with the median home price. Which ones are the major contributors? Then, when I want to buy a
house, I can focus on these important attributes because they strongly affect the market value of the
house.
c. Your own story.
2. (10 points) Exploratory Analyses & Visualization.
a. (2 pts) Explore the data (use plots, summary statistics, etc.) and provide some findings and your
interpretation.
b. (2 pts) Explore the response variable: median house value (use visualization tools, check outliers, etc.).
c. (2 pts) Explore some of the predictors that you believe are important (use visualization tools, check
outliers, etc.).
d. (4 pts) Explore the associations between predictors and your response variable visually and
quantitatively.
3. (20 points) Modeling and Interpretation of your results.
a. (6 pts) Use the multiple linear regression models we just learned. You may need to explore models with
different predictors to find the best one.
b. (6 pts) Use at least one variable selection method we learned this week (best subset, forward,
backward, or stepwise selection) to find the best model.
c. (8 pts) Interpret the fitted model from multiple perspectives: identify the significant predictors using
t-tests (2 pts), interpret the coefficient estimates (4 pts), and check whether the whole model is significant
(F-test) (2 pts).
4. (5 points) Summary or Conclusion:
a. (2 pts) Compare all the models you obtained (use the information criteria AIC/BIC and the goodness-of-fit
measure R² to support your conclusion).
b. (2 pts) Recommend the best model.
c. (1 pt) Conclude your findings: which predictors are useful, how well your model performs, etc.
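
The workflow in requirements 2 to 4 could be sketched in R roughly as follows. This is only an illustration under stated assumptions, not a model answer: it presumes the class data is the `Boston` data frame from the `MASS` package with response `medv`, and the hand-picked predictors `rm` and `lstat` are an arbitrary starting point rather than a recommendation. Best subset selection is left out because it needs an add-on package.

```r
# Assumption: the class case uses the Boston data from the MASS package,
# where `medv` is the median home value (in $1000s).
library(MASS)

# --- Requirement 2: exploratory analysis ---
str(Boston)                      # variable types and sample size
summary(Boston$medv)             # summary statistics of the response

# Potential outliers in the response: points beyond the boxplot whiskers
medv_outliers <- boxplot.stats(Boston$medv)$out

# Quantitative association: correlation of each predictor with medv
cors <- cor(Boston)[, "medv"]
cors_sorted <- sort(cors[names(cors) != "medv"])
print(round(cors_sorted, 2))

# Visual checks (only drawn in an interactive session)
if (interactive()) {
  hist(Boston$medv, main = "Median home value", xlab = "medv ($1000s)")
  boxplot(Boston$medv, horizontal = TRUE, xlab = "medv ($1000s)")
  plot(medv ~ lstat, data = Boston)
}

# --- Requirement 3: modeling ---
fit_small <- lm(medv ~ rm + lstat, data = Boston)  # hand-picked candidate (assumption)
fit_full  <- lm(medv ~ ., data = Boston)           # all predictors
fit_null  <- lm(medv ~ 1, data = Boston)           # intercept only

# Variable selection by AIC with step()
fit_back <- step(fit_full, direction = "backward", trace = 0)
fit_fwd  <- step(fit_null,
                 scope = list(lower = ~1, upper = formula(fit_full)),
                 direction = "forward", trace = 0)

# t-tests: one row per coefficient (estimate, std. error, t value, p-value).
# Each estimate is the expected change in medv for a one-unit increase in
# that predictor, holding the other predictors fixed.
fit_sum     <- summary(fit_back)
coef_table  <- coef(fit_sum)
significant <- rownames(coef_table)[coef_table[, "Pr(>|t|)"] < 0.05]

# Overall F-test: p-value computed from the stored F statistic
fstat    <- fit_sum$fstatistic
f_pvalue <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
               lower.tail = FALSE)

# --- Requirement 4: model comparison ---
models <- list(small = fit_small, full = fit_full,
               backward = fit_back, forward = fit_fwd)
comparison <- data.frame(
  AIC    = sapply(models, AIC),
  BIC    = sapply(models, BIC),
  adj_R2 = sapply(models, function(m) summary(m)$adj.r.squared)
)
print(comparison)
```

Lower AIC and BIC and a higher adjusted R² favor a model; adjusted R² is used in this sketch instead of plain R² because plain R² never decreases when predictors are added. For stepwise selection in both directions, `direction = "both"` can be passed to `step()`.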