Scenario:
Pastas R Us, Inc. is a fast-casual restaurant chain specializing in noodle-based dishes, soups, and salads. Since its inception, the business development team has favored opening new restaurants in areas (within a 3-mile radius) that satisfy the following demographic conditions:
- Median age between 25 and 45 years
- Median household income above the national average
- At least 15% of the adult population college educated
Last year, the marketing department rolled out a Loyalty Card strategy to increase sales. Under this program, customers present their Loyalty Card when paying for their orders and receive some free food after making 10 purchases.
The company has collected data from its 74 restaurants to track important variables such as average sales per customer, year-on-year sales growth, sales per sq. ft., and Loyalty Card usage as a percentage of sales, among others. A key metric of financial performance in the restaurant industry is annual sales per sq. ft. For example, if a 1,200 sq. ft. restaurant recorded $2 million in sales last year, then it sold approximately $1,667 per sq. ft.
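The sales-per-square-foot arithmetic in the example above can be checked with a short Python sketch. The function name `sales_per_sqft` is a hypothetical helper introduced only for illustration; the assignment itself is to be completed in Microsoft Excel.

```python
def sales_per_sqft(annual_sales, sqft):
    """Annual sales divided by floor area (hypothetical helper)."""
    return annual_sales / sqft

# Figures taken from the scenario: $2 million in sales, 1,200 sq. ft.
example = sales_per_sqft(2_000_000, 1200)
print(f"${example:,.0f} per sq. ft.")
```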
Executive management wants to know whether the current expansion criteria can be improved. They want to evaluate the effectiveness of the Loyalty Card marketing strategy and identify feasible, actionable opportunities for improvement. As a member of the analytics department, you’ve been assigned the responsibility of conducting a thorough statistical analysis of the company’s available database to answer executive management’s questions.
Assignment:
Review the scenario.
Conduct the following descriptive statistics analyses with Microsoft Excel. Answer the questions below in your Microsoft Excel sheet or in a separate Microsoft Word document:
- Insert a new column in the database that corresponds to “Annual Sales.” Annual Sales is the product of a restaurant’s “SqFt” and “Sales/SqFt” values.
- Calculate the mean, standard deviation, skewness, five-number summary, and interquartile range (IQR) for each of the variables.
- Create a box plot for the “Annual Sales” variable. Does it look symmetric? Would you prefer the IQR instead of the standard deviation to describe this variable’s dispersion? Why?
- Create a histogram for the “Sales/SqFt” variable. Is the distribution symmetric? If not, what is the skew? Are there any outliers? If so, which one(s)? What is the “SqFt” area of the outlier(s)? Are the outlier(s) smaller or larger than the average restaurant in the database? What can you conclude from this observation?
- Which measure of central tendency is most appropriate to describe “Sales/SqFt”? Why?
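As an optional cross-check of the Excel results, the calculations listed above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not part of the required deliverable. The five `sqft` and `sales_per_sqft` values are made up purely for illustration and are not taken from the company database. The skewness formula is the adjusted Fisher-Pearson form documented for Excel's SKEW, the quartiles use the inclusive method intended to match QUARTILE.INC, and the 1.5 × IQR fences are a common convention for flagging outliers, which the assignment does not prescribe.

```python
import statistics

def describe(values):
    """Mean, sample standard deviation, skewness, five-number summary, IQR, outliers."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    # Adjusted Fisher-Pearson skewness (the form documented for Excel's SKEW).
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / stdev) ** 3 for x in values)
    # Inclusive quartiles (assumed to correspond to Excel's QUARTILE.INC).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: flag values beyond 1.5 * IQR from the quartiles as outliers.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": mean, "stdev": stdev, "skew": skew,
        "min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values),
        "iqr": iqr,
        "outliers": [x for x in values if x < lower or x > upper],
    }

# Hypothetical values for illustration only; NOT the company's data.
sqft = [1200, 1500, 1800, 2000, 2500]
sales_per_sqft = [900, 400, 450, 420, 380]

# New "Annual Sales" column: SqFt multiplied by Sales/SqFt, row by row.
annual_sales = [area * rate for area, rate in zip(sqft, sales_per_sqft)]

summary = describe(sales_per_sqft)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In a sample like this one, a single very large value pulls the mean above the median and inflates the standard deviation, while the median and IQR barely move. That contrast is the reasoning the questions about dispersion and central tendency are asking you to articulate for the real data.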