Data Analysis – PCA


See the attached files.

Option 1:

Using the file Longitudinal Survey, subset the data to select only those individuals who lived in an urban area. Reduce the dimensionality of the data by converting numerical variables such as age, height, weight, number of years of education, number of siblings, family size, number of weeks employed, self-esteem scale, and income into a smaller set of principal components that retain at least 90% of the information (total variance) in the original data. After showing your work, summarize your findings in a paragraph of no more than five sentences. The summary should make clear how PCA improved your ability to interpret the data. Note: standardize the data before running PCA, because the variables are measured on different scales.

Show your work and summarize the findings.
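
The standardize-then-PCA step for Option 1 could be sketched roughly as follows. This is a minimal illustration using NumPy only, not a prescribed solution: the function name pca_retain, the boolean mask urban, and the synthetic data are all assumptions standing in for the real Longitudinal Survey file, whose column names are not given here.

```python
import numpy as np

def pca_retain(X, threshold=0.90):
    """Standardize the columns of X, run PCA, and keep the fewest
    components whose cumulative explained variance reaches `threshold`."""
    X = np.asarray(X, dtype=float)
    # z-score each column so variables on different scales are comparable
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the component loadings
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # variance share per component
    cumulative = np.cumsum(explained)
    # first position where the cumulative share reaches the threshold
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(s))
    scores = Z @ Vt[:k].T                    # observations in PC space
    return scores, Vt[:k], explained, k

# Synthetic stand-in for the survey (hypothetical data, 9 numeric variables)
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, 9)) + 0.3 * rng.normal(size=(n, 9))

# Hypothetical urban indicator; in practice this comes from the data file
urban = rng.random(n) < 0.6
X_urban = X[urban]

scores, components, explained, k = pca_retain(X_urban)
print("components kept:", k)
print("cumulative variance:", np.round(np.cumsum(explained)[:k], 3))
```

The rows of components hold the loadings of each retained principal component on the original variables, which is what the written summary would normally interpret.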

Option 2: 

Using the file House Price, choose one of the college towns in the data set. Reduce the dimensionality of the data by converting numerical variables such as number of bedrooms, number of bathrooms, home square footage, lot square footage, and age of the house into a smaller set of principal components that retain at least 90% of the information (total variance) in the original data. Then use the principal components as predictor variables to build two new models for predicting the sale prices of houses. Summarize each model in two or three sentences.

Show your work and develop the models of home prices.
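
One possible shape for the Option 2 workflow is sketched below, again with NumPy only. The assignment does not say which two models to build, so the pair used here (ordinary least squares on sale price and on log sale price) is purely an assumption, as are the helper names pca_scores and fit_ols and the synthetic housing data that stands in for the House Price file.

```python
import numpy as np

def pca_scores(X, threshold=0.90):
    """Standardize X and return scores on the fewest principal components
    whose cumulative explained variance reaches `threshold`."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(s))
    return Z @ Vt[:k].T

def fit_ols(features, y):
    """Least-squares fit with an intercept; returns coefficients and R^2."""
    A = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    r2 = 1.0 - np.sum(residual**2) / np.sum((y - y.mean())**2)
    return coef, r2

# Synthetic stand-in for one college town (hypothetical data)
rng = np.random.default_rng(1)
n = 150
sqft = rng.uniform(1000, 3500, n)
beds = np.round(sqft / 700 + rng.normal(0, 0.5, n))
baths = np.round(sqft / 1000 + rng.normal(0, 0.4, n))
lot = rng.uniform(4000, 12000, n)
age = rng.uniform(0, 60, n)
X = np.column_stack([beds, baths, sqft, lot, age])
price = 50000 + 120 * sqft + 3 * lot - 500 * age + rng.normal(0, 15000, n)

scores = pca_scores(X)

# Model 1 (assumed): linear regression of price on the component scores
coef_lin, r2_lin = fit_ols(scores, price)
# Model 2 (assumed): linear regression of log(price) on the same scores
coef_log, r2_log = fit_ols(scores, np.log(price))

print("components used:", scores.shape[1])
print("R^2 price model:", round(r2_lin, 3))
print("R^2 log-price model:", round(r2_log, 3))
```

The two R^2 values are measured on different response scales (dollars versus log dollars), so they are not directly comparable; a held-out test set with predictions converted back to dollars would be a fairer basis for comparing the models.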