MAT125 Project 2 Regression Analysis
You may collect your own data, find a data set online (ESPN, etc.), search https://dasl.datadescription.com/ (type in "regression"), or use the data set in the folder for Project #2. The data set must have two quantitative variables and at least 25 data values. If you use the data set in the Project #2 folder, you must use all 50 values for each variable.
Make sure you know which variable is the explanatory variable and which variable is the response variable before you start the analysis. The printouts to hand in are:
1. The data set itself.
Data> Display Data
Select the data values
2. Scatterplot
Graph> Scatterplot> Simple
Select the response variable for the y-variable
Select the explanatory variable for the x-variable
3. Fitted straight line
Stat> Regression> Fitted line plot
Select the response variable (y)
Select the explanatory (predictor) variable (x)
4. Residual plot of residuals versus fits.
Stat > Regression > Regression > Fit Regression Model
Enter the response (dependent) variable, y, under Responses and the explanatory (independent) variable, x, under Continuous Predictors.
In the dialog box, click Graphs, then check the box for Residuals versus fits.
You need the graph as well as the Regression Analysis in the session window.
5. Regression Analysis Statistics along with statistics for each individual observation
Stat> Regression> Regression > Fit Regression Model
In the dialog box, click Results. Under Fits and diagnostics, change the drop-down menu to For all observations.
You only need the Regression Analysis in the session window.
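As a cross-check on what the fitted line plot and the residuals versus fits plot display, the sketch below computes the least squares line, the fits, and the residuals by hand in Python. It does not replace the Minitab printouts. The function names (fit_line, fits_and_residuals) and the five data values are made up for illustration and are not part of the project data.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fits_and_residuals(x, y, intercept, slope):
    """Return the fitted values and the residuals (observed minus fitted)."""
    fits = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fits)]
    return fits, residuals


# Made-up values for illustration only; substitute your own two columns.
x = [1, 2, 3, 4, 5]  # explanatory variable
y = [2, 4, 5, 4, 5]  # response variable

intercept, slope = fit_line(x, y)
fits, residuals = fits_and_residuals(x, y, intercept, slope)

print(f"y-hat = {intercept:.3f} + {slope:.3f} x")
for fit, res in zip(fits, residuals):
    print(f"fit = {fit:.3f}, residual = {res:.3f}")
```

Plotting the residuals against the fits gives the same picture as the residuals versus fits graph: a patternless scatter around zero is consistent with a linear relationship.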
From the printouts, answer the following questions on a separate piece of paper:
1. A description of your data set, i.e., where you got it, how you collected it, and why you think there might be a linear relationship between the two variables. Include the URL.
2. What is your explanatory variable and what is your response variable?
3. What is the least squares regression equation?
4. Give a predicted value of y for a value of the explanatory variable which is not an actual observation in your data set.
5. What is the coefficient of determination, r²? Explain the meaning of r² in relation to your analysis.
6. What is the correlation coefficient, r? Interpret r.
7. Based on the residual plot, does there seem to be a linear relationship between the two variables you are analyzing? Explain.
8. For your tenth observation, what is the residual?
9. Which observations are outliers and/or influential observations?
If there are none, state that there are no outliers and/or influential observations.
For each outlier, give the value of the explanatory variable, the value of the observed response variable and the standardized residual score.
For each influential observation, give the value of the explanatory variable and the observed response variable.
(Note: You need to look back at your data to find the explanatory variable for the observation that is an outlier or influential observation.)
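The quantities asked for in questions 3 through 9 can be sketched in Python as follows, as a way to check your reading of the Minitab output. This is a minimal illustration under stated assumptions: the data values are made up, the names (regression_summary, predict, flag_unusual) are hypothetical, and the cutoffs used for flagging (absolute standardized residual above 2, leverage above 3p/n with p = 2 coefficients) are common rules of thumb that may not match exactly how Minitab marks unusual observations.

```python
import math
from statistics import mean


def regression_summary(x, y):
    """Simple linear regression statistics for each observation and overall."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each observation in simple linear regression.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    std_residuals = [e / (s * math.sqrt(1 - h))
                     for e, h in zip(residuals, leverage)]
    return {
        "intercept": intercept, "slope": slope,
        "r": r, "r_squared": r ** 2,
        "residuals": residuals,
        "std_residuals": std_residuals,
        "leverage": leverage,
    }


def predict(summary, x_new):
    """Predicted response at x_new from the least squares equation."""
    return summary["intercept"] + summary["slope"] * x_new


def flag_unusual(summary, resid_cutoff=2.0):
    """Return 1-based observation numbers with large residuals or high leverage.

    Assumed cutoffs: |standardized residual| > resid_cutoff, leverage > 3 * 2 / n.
    """
    n = len(summary["leverage"])
    leverage_cutoff = 3 * 2 / n
    large_resid = [i + 1 for i, r in enumerate(summary["std_residuals"])
                   if abs(r) > resid_cutoff]
    high_leverage = [i + 1 for i, h in enumerate(summary["leverage"])
                     if h > leverage_cutoff]
    return large_resid, high_leverage


# Made-up values for illustration only; substitute your own two columns.
x = [1, 2, 3, 4, 5]  # explanatory variable
y = [2, 4, 5, 4, 5]  # response variable

summary = regression_summary(x, y)
y_hat = predict(summary, 2.5)  # 2.5 is not one of the observed x values
large_resid, high_leverage = flag_unusual(summary)

print(f"r = {summary['r']:.3f}, r-squared = {summary['r_squared']:.3f}")
print(f"predicted y at x = 2.5: {y_hat:.3f}")
print("large residuals:", large_resid, "high leverage:", high_leverage)
```

With a data set of at least 10 values, summary["residuals"][9] is the residual for the tenth observation, since Python lists are indexed from 0.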