Mathematics – Statistics Data Assignment


1. Abstract: A brief statement (a few sentences at most) summarizing the purpose of the report, the results, and what they mean in substantive rather than statistical terms. Be brief and to the point, and stimulate your reader’s interest. You essentially have 15 seconds to let the reader know what’s in the report and whether they should read it.

2. Introduction: Give background and motivate the question to be investigated. Introduce the data, perhaps with visualization or descriptive statistics, but only if you think it significantly adds to the narrative. A reader could jump from here to Results. While a formal research study has explicit hypotheses, you likely won’t have any, but at least try to suggest which variables, and in what form, you expect to be important predictors.

3. Methodology: Describe the approach used to analyze the problem, keeping in mind that your reader likely never took or doesn’t remember ST 625. What was your approach? Why is your approach appropriate? What are the assumptions? Are there any concerns about these assumptions?

4. Results: Present the recommendation simply and clearly. Use graphical displays, tables, and discussion as you see fit. You are providing an answer to the question outlined in the introduction.

5. Conclusion: Briefly summarize everything. Does the model make sense? Do the predictors seem reasonable? What does it all mean? Suggestions for further analysis or other data might be appropriate.

6. Appendix: Screenshots of recommended R model

Using the range of models, tools, and techniques studied in ST 625, build and present two models: one for predicting casual bike rentals using the independent variables, and another for predicting registered bike rentals. Your ultimate goal is to explain the factors that influence bike demand, so your model and variable selection should be based on both context and statistics. Model interpretation will be a very important part of your report! What influences demand for bikes, how, and, to the extent plausible, why? At a minimum, you should perform linear regression using all the available independent variables, consider some more complex models (higher-order terms, interaction terms, dummy variables), and then perform variable selection and compare models. And to be very clear, casual should not be a variable used to predict registered, nor vice versa.
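As a rough illustration of the workflow described above (fit a baseline linear regression, add a more complex term, then compare models), here is a minimal sketch. It is written in Python with only the standard library so that it runs anywhere; the report itself calls for R output. Everything specific in it is an assumption made for illustration: the data are simulated, the predictor names `temp` and `workingday` are hypothetical stand-ins for the real independent variables, and adjusted R² is only one of several reasonable comparison criteria.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

def adjusted_r2(X, y, beta):
    """Adjusted R-squared; p counts every column of X, including the intercept."""
    n, p = len(X), len(X[0])
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (sse / (n - p)) / (sst / (n - 1))

# Simulated data (NOT the assignment data): the effect of temperature on
# casual rentals is made weaker on working days, i.e. a true interaction.
random.seed(0)
rows = []
for _ in range(200):
    temp = random.random()              # hypothetical normalized temperature
    workingday = random.randint(0, 1)   # hypothetical dummy variable (0/1)
    casual = (200 + 1500 * temp - 100 * workingday
              - 900 * temp * workingday + random.gauss(0, 50))
    rows.append((temp, workingday, casual))

y = [c for _, _, c in rows]
X_main = [[1.0, t, w] for t, w, _ in rows]           # main effects only
X_inter = [[1.0, t, w, t * w] for t, w, _ in rows]   # adds temp x workingday

beta_main = fit_ols(X_main, y)
beta_inter = fit_ols(X_inter, y)
adj_main = adjusted_r2(X_main, y, beta_main)
adj_inter = adjusted_r2(X_inter, y, beta_inter)

print("main effects:    ", [round(b, 1) for b in beta_main], round(adj_main, 4))
print("with interaction:", [round(b, 1) for b in beta_inter], round(adj_inter, 4))
```

In this simulated setting the interaction model should score the higher adjusted R², and its last coefficient estimates how much the temperature slope changes on working days. The same steps would be repeated separately with registered rentals as the response, and a statistical improvement should still be checked against context before a term is kept.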