Paper Assignment of Applied Probability and Statistics in Data Analytics

 

Individual Written Assignment 2

Description of the Assignment:

Refer to what your instructor wants you to do and/or to submit, as detailed below in the Deliverables section. Include the real-world (authentic) situation, type of task to be accomplished, and the sorts of higher-order thinking (analysis, reasoning, critical thinking, etc.) that were requested and/or shared throughout your course thus far.


Deliverables 

Data Visualization and Statistical Modeling Exercise

Overview

In this assignment, you will use business knowledge, statistical modeling methods, and data visualization skills to analyze the sales data from an online store. You can use any statistical tool you like, e.g. Excel, Python, or R.

Dataset

The dataset records the store's sales revenue and its marketing spending on each marketing channel from January 2013 to December 2014. The dataset has 105 rows and 16 columns.

Explanation of some columns:

Holiday: 1 means holiday, 0 means non-holiday.

PROMOTION: the scale of promotion

SP: the ad spending (cost) in each marketing channel, e.g. TV, email, paid search, online display, product search.

IMP: impressions, i.e. the total number of times an ad is viewed by a visitor or displayed on a web page in each marketing channel.

Assignment

1. Open the Excel file and check the correlation between each ‘IMP’ (impressions) column and the corresponding ‘SP’ column.

2. Create a visualization to see the distribution of ‘Sales’. What is the trend of sales over the months?

3. Create a visualization of each ‘IMP’ and ‘SP’ pair. What is the trend of each ‘IMP’ and ‘SP’?

4. Check the segmentation of ‘AVERAGE_PRICE’ and ‘MEDIA_SPEND_of_competitor’ by ‘Holiday’. Does ‘MEDIA_SPEND_of_competitor’ have an impact on ‘AVERAGE_PRICE’? (hint: you can use an Excel PivotTable to solve this question)

5. Check the segmentation of ‘Sales’ by each ‘PROMOTION’ level. Does ‘PROMOTION’ boost ‘Sales’?

6. Fit a linear regression model to check the coefficients for ‘IMP’ and ‘SP’. Summarize the output of the linear regression model, e.g. p-values and R-squared.

7. Optional question: Analyze the contribution of each marketing channel to sales and calculate the ROI (Return on Investment) of each marketing channel. (hint: research marketing mix modeling methods)
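The calculations behind questions 1, 5, and 6 can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not a model answer: the rows are synthetic, and the column names ‘TV_SP’ and ‘TV_IMP’ and the helper functions are hypothetical, so check the actual headers in the Excel file before adapting it.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Synthetic rows for illustration only; NOT values from the assignment dataset.
# "TV_SP" and "TV_IMP" are hypothetical column names for one channel.
rows = [
    {"Holiday": 0, "PROMOTION": 0, "TV_SP": 100.0, "TV_IMP": 1000.0, "Sales": 520.0},
    {"Holiday": 0, "PROMOTION": 1, "TV_SP": 150.0, "TV_IMP": 1600.0, "Sales": 700.0},
    {"Holiday": 1, "PROMOTION": 2, "TV_SP": 200.0, "TV_IMP": 1900.0, "Sales": 910.0},
    {"Holiday": 0, "PROMOTION": 0, "TV_SP": 120.0, "TV_IMP": 1300.0, "Sales": 560.0},
    {"Holiday": 1, "PROMOTION": 1, "TV_SP": 180.0, "TV_IMP": 1750.0, "Sales": 800.0},
    {"Holiday": 0, "PROMOTION": 2, "TV_SP": 220.0, "TV_IMP": 2300.0, "Sales": 980.0},
]


def column(data, name):
    """Extract one column from a list of row dictionaries."""
    return [row[name] for row in data]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_mean(data, key, value):
    """Mean of `value` within each level of `key` (a one-way pivot table)."""
    groups = defaultdict(list)
    for row in data:
        groups[row[key]].append(row[value])
    return {level: mean(values) for level, values in sorted(groups.items())}


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns slope, intercept, R-squared."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Question 1: correlation between spending and impressions for one channel.
r = pearson(column(rows, "TV_SP"), column(rows, "TV_IMP"))

# Question 5: average sales at each promotion level.
sales_by_promo = group_mean(rows, "PROMOTION", "Sales")

# Question 6 (single-predictor version): regress sales on spending.
slope, intercept, r2 = simple_ols(column(rows, "TV_SP"), column(rows, "Sales"))

print("correlation:", round(r, 3))
print("mean sales by promotion:", sales_by_promo)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3), "R-squared:", round(r2, 3))
```

This sketch fits one predictor at a time and does not compute p-values. A regression with several ‘IMP’ and ‘SP’ predictors, together with p-values, is more practical in Excel's regression tool, R, or a Python statistics package.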

CLOs 1,2,3,5,6

Paper Writing Requirements

Please follow the guidelines shared below

Paper Requirements

Most classes have a two-paper requirement per course. Please be sure to address the following requirements when completing your papers:

  •  The cover page and reference page/s are not included in the above-stated page requirement. These should be in addition to page requirements.
  •  Papers need to be formatted in proper APA 7th Edition style.
  •  Each paper requires at least three outside peer-reviewed sources for your references (unless stated otherwise in the guidance above).

o   Acceptable/credible sources include: academic journals and books, industry journals, and the class textbook. To include additional types of sources, please review the Guidelines for finding and utilizing required references for your paper, shared below.

  •  Using your textbook is highly recommended to demonstrate that you have read the required material and/or are connecting new thoughts to the course text/learnings.