Question for Zeek Only

Complete the following steps:

  1. Open the following training dataset ( ) in Excel or another spreadsheet program and review its attributes (dependent and independent variables).
  2. Open the following testing dataset ( ) in the same program and review its attributes (dependent and independent variables).
  3. Import both of them into the RapidMiner repository as I did in my video lecture (Predictive Modeling – Linear Regression).
  4. Add them to a new process and rename them as “Training dataset” and “Scoring dataset” so you can tell them apart.
  5. Use a Set Role operator to designate the MPG attribute as the label for the training data.
  6. Add a Linear Regression operator and apply your model to your scoring dataset.
  7. Run your model.
  8. In the Results perspective, examine your attribute coefficients and the predictions for the cars in your scoring dataset.
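
For anyone who wants to see what steps 5 through 8 compute, here is a minimal Python sketch of an ordinary least-squares fit. It is not the RapidMiner process itself. The attribute names (weight, horsepower) and every number are made up for illustration and do not come from the assignment datasets, and the helper names `fit_linear_regression` and `predict` are hypothetical. RapidMiner's operator may also apply feature selection, which this sketch does not attempt.

```python
# Toy training data (hypothetical): each row is [weight in 1000 lb, horsepower in 100 hp].
train_rows = [[2.0, 1.0], [3.0, 1.5], [4.0, 2.0], [2.5, 0.9], [3.5, 2.2]]
# Label column (MPG), generated here from mpg = 50 - 6*weight - 4*horsepower.
train_mpg = [34.0, 26.0, 18.0, 31.4, 20.2]


def fit_linear_regression(rows, labels):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of 1s for the intercept
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, labels))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("attributes are collinear; cannot solve")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # Returns [intercept, coefficient_1, coefficient_2, ...]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefs, row):
    """Apply the fitted model to one scoring row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


coefs = fit_linear_regression(train_rows, train_mpg)
scoring_rows = [[3.0, 1.2]]  # hypothetical car to score
predictions = [predict(coefs, row) for row in scoring_rows]
print("coefficients:", [round(c, 3) for c in coefs])
print("predicted MPG:", [round(p, 2) for p in predictions])
```

The list returned by the fit plays the same role as the attribute coefficients shown in the Results perspective: an intercept followed by one weight per attribute.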

Report your results:

  1. Which attributes have the most significant predictive power?
  2. Were any attributes dropped from the dataset as non-predictors? If so, which ones and why do you think they were not useful predictors?
  3. Compare the predicted MPG values to the actual MPG values in the scoring dataset. How close are the predictions?
  4. On average, how far off are your model’s predictions?
  5. What other attributes do you think would help your model better predict fuel efficiency?
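
One common way to answer question 4 is the mean absolute error: the average of the absolute differences between actual and predicted MPG. The sketch below assumes that interpretation of "how far off"; the values are made up for illustration and are not taken from the scoring dataset, and `mean_absolute_error` is a hypothetical helper name.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values; replace with the actual and predicted MPG columns.
actual_mpg = [30.0, 22.0, 18.0]
predicted_mpg = [28.5, 23.0, 18.5]

mae = mean_absolute_error(actual_mpg, predicted_mpg)
print(f"Average miss: {mae:.2f} MPG")  # prints "Average miss: 1.00 MPG"
```

The same calculation can be done in Excel by adding a column with the absolute difference for each car and averaging that column.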

Complete this in a Word document as well.

1st assign

2nd assign