Chapter 10 Data Mining
Instructions: Please submit your work in a single Excel file, with one tab/worksheet for each problem.
Cluster Analysis
- (25 points) Apply single linkage cluster analysis to Berkeley, Cal Tech, UCLA, and UNC in the Excel file Colleges and Universities Cluster Analysis Worksheet and draw a dendrogram illustrating the clustering process.
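As a way to check hand calculations, the single linkage procedure can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `single_linkage`, the use of Euclidean distance, and the four toy points are assumptions made for the example and are not taken from the worksheet. The order and distances of the merges it returns are what a dendrogram would display.

```python
import math

def single_linkage(points):
    """Return the merge history as a list of (cluster_a, cluster_b, distance)."""
    # Start with every observation in its own cluster (clusters hold labels).
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the distance between
                # the closest pair of members (Euclidean here, by assumption).
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical, made-up 2-D points; NOT values from the worksheet.
toy = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0), "D": (5.0, 3.0)}
merges = single_linkage(toy)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

With the toy points, A and B merge first, then C and D, and the two pairs join last. For the actual problem, the four schools' (possibly standardized) values from the worksheet would replace the toy dictionary.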
Classification
- In the Excel file Credit Risk Data, classify the following record:
- (25 points) Using the k-NN algorithm for k = 1 to 5.
- (25 points) Using discriminant analysis.
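For the k-NN part, the idea can be sketched as follows: find the k training records closest to the new record and assign the majority class among them. The sketch below is a hedged illustration only. The function name `knn_classify`, the Euclidean distance, the two numeric features, and the "Low"/"High" labels are all assumptions for the example; the real Credit Risk Data would first need its categorical fields encoded and its numeric fields put on a comparable scale.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """train is a list of (features, label); return the majority label
    among the k training records nearest to query."""
    # Sort training records by Euclidean distance to the query, keep k nearest.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, made-up training records; NOT values from Credit Risk Data.
train = [
    ((1.0, 1.0), "Low"), ((1.5, 2.0), "Low"), ((2.0, 1.0), "Low"),
    ((6.0, 5.0), "High"), ((7.0, 7.0), "High"), ((6.5, 6.0), "High"),
]
query = (2.5, 2.0)  # hypothetical record to classify

results = {k: knn_classify(train, query, k) for k in range(1, 6)}
for k, label in results.items():
    print("k =", k, "->", label)
```

Running the classification for each k from 1 to 5, as the problem asks, shows whether the predicted class is stable or changes as more neighbors are included.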
Association
- (25 points) The Excel file Automobile Options provides data on options ordered together for a particular model of automobile. Consider the following rules:
- Rule 1: If Fastest Engine, then Traction Control.
- Rule 2: If Faster Engine and 16-inch Wheels, then 3 Year Warranty.
Compute the support, confidence, and lift for each of these rules.
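The three measures follow the usual definitions: support is the fraction of all transactions that contain both the antecedent and the consequent, confidence is that count divided by the number of transactions containing the antecedent, and lift is confidence divided by the fraction of transactions containing the consequent. A minimal sketch, assuming each order is represented as a set of option names; the function name `rule_metrics` and the five toy orders with items "A", "B", "C" are hypothetical and are not drawn from the Automobile Options file.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Count transactions containing the antecedent, the consequent, and both.
    n_a = sum(antecedent <= t for t in transactions)
    n_c = sum(consequent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / n
    confidence = n_both / n_a          # undefined if the antecedent never occurs
    lift = confidence / (n_c / n)      # undefined if the consequent never occurs
    return support, confidence, lift

# Hypothetical, made-up orders; NOT values from Automobile Options.
orders = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B"}, {"C"}]
support, confidence, lift = rule_metrics(orders, {"A"}, {"B"})
print(round(support, 3), round(confidence, 3), round(lift, 3))
```

A lift above 1 suggests the antecedent makes the consequent more likely than its baseline frequency. A rule with a two-item antecedent, such as Rule 2, is handled the same way by passing both items in the antecedent set.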