WIREs Computational Statistics          Principal component analysis

TABLE 6 | An (Artificial) Example of PCA using a Centered and Normalized Matrix

          Hedonic  For Meat  For Dessert  Price  Sugar  Alcohol  Acidity
Wine 1         14         7            8      7      7       13        7
Wine 2         10         7            6      4      3       14        7
Wine 3          8         5            5     10      5       12        5
Wine 4          2         4            7     16      7       11        3
Wine 5          6         2            4     13      3       10        3

Five wines are described by seven variables [data from Ref 44].

factors (whose eigenvalues are clearly larger than one) and the noise (represented by factors with eigenvalues clearly smaller than one), then the rotation is likely to provide a solution that is more reliable than the original solution. However, if this model does not accurately represent the data, then rotation will make the solution less replicable and potentially harder to interpret because the mathematical properties of PCA have been lost.

EXAMPLES

Correlation PCA

Suppose that we have five wines described by the average ratings of a set of experts on their hedonic dimension, how much the wine goes with dessert, and how much the wine goes with meat. Each wine is also described by its price, its sugar and alcohol content, and its acidity. The data [from Refs 40,44] are given in Table 6.

A PCA of this table extracts four factors (with eigenvalues of 4.76, 1.81, 0.35, and 0.07, respectively). Only two components have an eigenvalue larger than 1 and, together, these two components account for 94% of the inertia. The factor scores for the first two components are given in Table 7 and the corresponding map is displayed in Figure 5.

TABLE 7 | PCA Wine Characteristics. Factor scores, contributions of the observations to the components, and squared cosines of the observations on principal components 1 and 2.

             F1      F2   ctr1   ctr2   cos²₁   cos²₂
Wine 1    −1.17   −0.55     29     17      77      17
Wine 2    −1.04    0.61     23     21      69      24
Wine 3     0.08    0.19      0      2       7      34
Wine 4     0.89   −0.86     17     41      50      46
Wine 5     1.23    0.61     32     20      78      19

The positive important contributions are italicized, and the negative important contributions are represented in bold. For convenience, squared cosines and contributions have been multiplied by 100 and rounded.

FIGURE 5 | PCA wine characteristics. Factor scores of the observations plotted on the first two components. λ1 = 4.76, τ1 = 68%; λ2 = 1.81, τ2 = 26%.

We can see from Figure 5 that the first component separates Wines 1 and 2 from Wines 4 and 5, while the second component separates Wines 2 and 5 from Wines 1 and 4. The examination of the values of the contributions and cosines, shown in Table 7, complements and refines this interpretation because the contributions suggest that Component 1 essentially contrasts Wines 1 and 2 with Wine 5 and that Component 2 essentially contrasts Wines 2 and 5 with Wine 4. The cosines show that Component 1 contributes highly to Wines 1 and 5, while Component 2 contributes most to Wine 4.

To find the variables that account for these differences, we examine the loadings of the variables on the first two components (see Table 8) and the circle of correlations (see Figure 6 and Table 9). From these, we see that the first component contrasts price with the wine's hedonic qualities, its acidity, its amount of alcohol, and how well it goes with meat (i.e., the wine tasters preferred inexpensive wines). The second component contrasts the wine's hedonic qualities, acidity, and alcohol content with its sugar content and how well it goes with dessert. From this, it appears that the first component represents characteristics that are inversely correlated with a wine's price while the second component represents the wine's sweetness.

To strengthen the interpretation, we can apply a varimax rotation, which gives a clockwise rotation

Volume 2, July/August 2010 © 2010 John Wiley & Sons, Inc. 443
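The correlation PCA of Table 6 described above can be sketched numerically as follows. This is a minimal illustration, not the authors' code. It assumes that "centered and normalized" means each column is centered and scaled to unit norm, which makes the cross-product matrix a correlation matrix and is consistent with the eigenvalues summing to 7, the number of variables. The variable names (`Z`, `F`, `ctr`, `cos2`) are chosen here for illustration. The sign of each component is arbitrary in an SVD, so the sketch orients every component to give Wine 1 a negative score, purely to ease comparison with Table 7.

```python
import numpy as np

# Table 6: rows are Wines 1-5; columns are Hedonic, For Meat, For Dessert,
# Price, Sugar, Alcohol, Acidity.
X = np.array([
    [14, 7, 8,  7, 7, 13, 7],
    [10, 7, 6,  4, 3, 14, 7],
    [ 8, 5, 5, 10, 5, 12, 5],
    [ 2, 4, 7, 16, 7, 11, 3],
    [ 6, 2, 4, 13, 3, 10, 3],
], dtype=float)

# Assumption: center each column and scale it to unit norm,
# so that Z.T @ Z is the correlation matrix of the variables.
Z = X - X.mean(axis=0)
Z /= np.linalg.norm(Z, axis=0)

# Singular value decomposition Z = P @ diag(delta) @ Qt.
P, delta, Qt = np.linalg.svd(Z, full_matrices=False)

# Centering five rows leaves at most four non-null components.
keep = delta > 1e-8
P, delta, Qt = P[:, keep], delta[keep], Qt[keep]

eigenvalues = delta ** 2                    # inertia of each component
explained = eigenvalues / eigenvalues.sum() # proportion of inertia

# Factor scores of the observations (one column per component).
F = P * delta

# Illustrative sign convention: make Wine 1's score negative on every component.
F *= -np.sign(F[0])

# Contribution of each observation to each component: f^2 / eigenvalue.
ctr = F ** 2 / eigenvalues

# Squared cosine: f^2 / squared distance of the observation to the origin.
cos2 = F ** 2 / (F ** 2).sum(axis=1, keepdims=True)

print("eigenvalues:", np.round(eigenvalues, 2))
print("percent of inertia:", np.round(100 * explained))
print("F1, F2:\n", np.round(F[:, :2], 2))
print("ctr1, ctr2 (x100):\n", np.round(100 * ctr[:, :2]))
print("cos2 on 1 and 2 (x100):\n", np.round(100 * cos2[:, :2]))
```

The printed values can be compared with the eigenvalues quoted in the text and with Table 7. Small discrepancies of one unit in the last digit are possible because the published table was rounded.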