Chapter 5 Principal Components Analysis (PCA)
2021/1/21 cxt
Presentation Outline
❖ What is PCA?
❖ Geometrical approach to PCA
❖ Analytical approach to PCA
❖ Properties of PCA
❖ How to determine the number of PCs?
❖ How to interpret the PCs?
❖ Use of PC scores
5.1 Reasons for using principal components analysis
Too many variables
[Figure: many overlapping health-related variables, e.g. systolic blood pressure, cholesterol, diet, exercise]
Stone used 1929-1938 data for the USA, comprising 17 variables that describe income and expenditure. Applying principal component analysis, he obtained three new variables F1, F2 and F3:
– F1: total income
– F2: rate of increase of total income
– F3: economic increase or decrease
These new variables can be represented by three variables (I, ΔI, t) that can be measured directly.
Correlations between the components F1, F2, F3 and the measured variables I, ΔI, t:

        F1      F2      F3      I       ΔI      t
F1      1
F2      0       1
F3      0       0       1
I       0.995  -0.041   0.057   1
ΔI     -0.056   0.948  -0.124  -0.102   1
t      -0.369  -0.282  -0.836  -0.414  -0.112   1
Solutions
❖ Eliminate some redundant variables.
– May lose important information that was uniquely reflected in the eliminated variables.
❖ Create composite scores from variables (sum or average).
– Variability among the variables is lost.
– Multiple scale scores may still be collinear.
❖ Create weighted linear combinations of variables while retaining most of the variability in the data.
– Fewer variables; little or no lost variation.
– No collinear scales.
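The contrast between a plain composite score and a weighted linear combination can be illustrated with a small sketch. The data below are simulated purely for illustration (they are an assumption, not taken from the slides), and the equal-weight vector is scaled to unit length so that it is comparable with the unit-length weights of the first principal component.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: three correlated measurements with unequal spread
base = rng.normal(size=400)
X = np.column_stack([
    2.0 * base + rng.normal(scale=0.5, size=400),
    1.0 * base + rng.normal(scale=0.5, size=400),
    0.3 * base + rng.normal(scale=1.0, size=400),
])

Xc = X - X.mean(axis=0)            # center each variable
S = np.cov(Xc, rowvar=False)       # 3 x 3 sample covariance matrix
total_var = np.trace(S)            # total variance in the data

# Composite score with equal weights (a scaled average), unit-length weights
w_equal = np.ones(3) / np.sqrt(3)
var_equal = w_equal @ S @ w_equal

# First principal component: eigenvector of S with the largest eigenvalue
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
w_pc1 = eigvecs[:, -1]
var_pc1 = eigvals[-1]

print(f"equal weights: {var_equal / total_var:.1%} of total variance")
print(f"first PC:      {var_pc1 / total_var:.1%} of total variance")
```

Among all unit-length weight vectors, the first principal component has the largest variance, so it can never retain less variation than the equal-weight composite.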
An Easy Choice
To retain most of the information in the data while reducing the number of variables you must deal with, try principal components analysis.
❖ Most of the variability in the original data can be retained,
but...
❖ Components may not be directly interpretable.
What is PCA?
❖ PCA is a technique for forming new variables that are linear composites of the original variables. The new variables are called principal components (PRINs).
❖ The maximum number of PRINs that can be formed is equal to the number of original variables. Usually the first few PRINs represent most of the information in the original variables and can replace them, thereby achieving data reduction, which is the main objective of PCA.
❖ The PRINs are uncorrelated among themselves and can be used in regression.
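A minimal sketch of these three points, assuming the components are obtained from the eigendecomposition of the sample covariance matrix (one common way to compute them). The helper name `principal_components` and the simulated data are hypothetical, introduced only for illustration.

```python
import numpy as np

def principal_components(X):
    """Hypothetical helper: return (variances, weights, scores) for data X.

    X has one row per observation and one column per original variable.
    """
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]        # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                    # column j holds the j-th component
    return eigvals, eigvecs, scores

# Simulated data (an assumption): four variables driven by two hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mix = np.array([[1.0, 0.8, 0.1, 0.0],
                [0.0, 0.3, 1.0, 0.9]])
X = latent @ mix + 0.1 * rng.normal(size=(200, 4))

variances, weights, scores = principal_components(X)

print(scores.shape)                                    # one column per original variable
print(np.round(np.corrcoef(scores, rowvar=False), 3))  # correlations between components
print(np.round(variances / variances.sum(), 3))        # share of variance per component
```

Each column of `weights` holds the coefficients of one linear composite, so there are as many components as original variables, and the correlation matrix of the scores is the identity up to rounding.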
Principal Components Analysis (PCA) is a dimension reduction method that
❖ creates variables called principal components
❖ creates as many components as there are input variables.
Principal components
❖ are weighted linear combinations of the input variables
❖ are orthogonal to and uncorrelated with the other components
❖ are generated so that the first component accounts for the most variation in the x's, followed by the second component, and so on.
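The ordering and orthogonality properties can be checked numerically. The sketch below uses the singular value decomposition of the centered data, which is an alternative route to the same components; the simulated data and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 5

# Hypothetical data: five sources with different spreads, mixed so that
# the observed variables are correlated
Z = rng.normal(size=(n, p)) * np.array([3.0, 2.0, 1.0, 0.5, 0.2])
A = rng.normal(size=(p, p))
X = Z @ A

Xc = X - X.mean(axis=0)                            # center each variable
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are the weight vectors

variances = s**2 / (n - 1)            # variance of each component, largest first
proportion = variances / variances.sum()
cumulative = np.cumsum(proportion)

print(np.round(proportion, 3))        # share of total variation per component
print(np.round(cumulative, 3))        # running total, reaching 1 at the last component
print(np.round(Vt @ Vt.T, 3))         # weight vectors are mutually orthogonal
```

The cumulative proportions show how many components are needed to retain a given share of the variation in the input variables.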
Translating and rotating the coordinate axes
[Figure: scatter plot of the data on the original axes x1 and x2, with the new axes F1 and F2 obtained by translating and rotating the original axes]
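For two variables, the translation and rotation can be sketched directly. The example assumes a simulated point cloud (not the data in the figure): it moves the origin to the mean, then looks for the rotation angle at which the variance along the new first axis F1 is largest. The helper `rotate` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 2-D cloud: x2 tends to rise with x1
x1 = rng.normal(loc=5.0, scale=2.0, size=500)
x2 = 0.6 * x1 + rng.normal(loc=1.0, scale=0.8, size=500)
X = np.column_stack([x1, x2])

Xc = X - X.mean(axis=0)   # translation: move the origin to the centroid

def rotate(points, theta):
    """Coordinates of the points on axes rotated counter-clockwise by theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, s],      # direction of the new first axis (F1)
                  [-s, c]])    # direction of the new second axis (F2)
    return points @ R.T

# Scan rotation angles and record the variance along the first new axis
thetas = np.linspace(0.0, np.pi, 1801)
var_f1 = np.array([rotate(Xc, t)[:, 0].var(ddof=1) for t in thetas])
theta_scan = thetas[np.argmax(var_f1)]

# Closed-form angle of the major axis from the 2 x 2 covariance matrix
S = np.cov(Xc, rowvar=False)
theta_star = 0.5 * np.arctan2(2.0 * S[0, 1], S[0, 0] - S[1, 1])

F = rotate(Xc, theta_star)   # columns are the scores on F1 and F2
print(np.degrees(theta_scan), np.degrees(theta_star))
print(np.round(np.cov(F, rowvar=False), 4))
```

At the chosen angle the covariance between F1 and F2 vanishes, and the variance along F1 equals the largest eigenvalue of the covariance matrix.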