Paper Assignment of Applied Probability and Statistics in Data Analytics

 

Description of the Assignment:

Refer to what your instructor wants you to do and/or to submit, as detailed below in the Deliverables section. Include the real-world (authentic) situation, type of task to be accomplished, and the sorts of higher-order thinking (analysis, reasoning, critical thinking, etc.) that were requested and/or shared throughout your course thus far.

Deliverables 

Data Visualization and Statistical Modeling Exercise

Overview

In this assignment, you will use business knowledge, statistical modeling methods, and data visualization skills to analyze the sales data from an online store. You can use any statistical tool you like, e.g. Excel, Python, or R.

Dataset

The dataset records the store's sales revenue and its marketing spending on each marketing channel from January 2013 to December 2014. The dataset has 105 rows and 16 columns.

Explanation of some columns:

Holiday: 1 means holiday, 0 means non-holiday.

PROMOTION: the scale of promotion

SP: the ad spending (cost) in each marketing channel, e.g. TV, email, paid search, online display, product search.

IMP: impressions, i.e. the total number of times an ad is viewed by a visitor or displayed on a web page, in each marketing channel.

Assignment

1. Open the Excel file and check the correlation between each ‘IMP’ (impressions) and ‘SP’ column.

2. Create a visualization to show the distribution of ‘Sales’. What is the trend of sales over the months?

3. Create a visualization of each ‘IMP’ and ‘SP’ column. What is the trend of each ‘IMP’ and ‘SP’?

4. Check the segmentation of ‘AVERAGE_PRICE’ and ‘MEDIA_SPEND_of_competitor’ for ‘Holiday’. Does ‘MEDIA_SPEND_of_competitor’ have an impact on ‘AVERAGE_PRICE’? (hint: you can use Excel PivotTable to solve this question)

5. Check the segmentation of ‘Sales’ on each ‘PROMOTION’. Does ‘PROMOTION’ boost ‘Sales’?

6. Fit a linear regression model and check the coefficients for ‘IMP’ and ‘SP’. Summarize the output of the linear regression model, e.g. p-values and R-squared.

7. Optional question: Analyze the contribution of each marketing channel to sales and calculate the ROI (Return on Investment) of each marketing channel. (hint: research the marketing mix modeling method)

Length: 5 pages plus 1 cover page.