in turn and contrasting the resulting residual sum of squares with that for the full model produces so-called Type-III tests; adding terms to the model sequentially produces so-called Type-I tests; and testing each term after all terms in the model with the exception of those to which it is marginal produces so-called Type-II tests. Closely analogous multivariate analysis-of-variance (MANOVA) tables can be formed similarly by taking differences in error sum of squares and products matrices.
In some contexts, for example when the response variables represent repeated measures of the same variable over time, it is also of interest to entertain a design and hypotheses on the response (see, e.g., O'Brien and Kaiser, 1985). Such tests can be formulated by extending the linear hypothesis in Equation 2 to $H_0\colon \mathbf{L}\mathbf{B}\mathbf{P} = \mathbf{0}$, where the $m \times k$ matrix $\mathbf{P}$ provides contrasts in the responses.

3 Data Ellipses and Ellipsoids

The data ellipse, described by Dempster (1969) and Monette (1990), is a device for visualizing the relationship between two variables, $Y_1$ and $Y_2$. Let $D^2_M(\mathbf{y}) = (\mathbf{y} - \bar{\mathbf{y}})^T \mathbf{S}^{-1} (\mathbf{y} - \bar{\mathbf{y}})$ represent the squared Mahalanobis distance of the point $\mathbf{y} = (y_1, y_2)^T$ from the centroid of the data $\bar{\mathbf{y}} = (\bar{Y}_1, \bar{Y}_2)^T$. The data ellipse $\mathcal{E}_c$ of size $c$ is the set of all points $\mathbf{y}$ with $D^2_M(\mathbf{y})$ less than or equal to $c^2$:

$$\mathcal{E}_c(\mathbf{y}; \mathbf{S}, \bar{\mathbf{y}}) \equiv \left\{ \mathbf{y} : (\mathbf{y} - \bar{\mathbf{y}})^T \mathbf{S}^{-1} (\mathbf{y} - \bar{\mathbf{y}}) \le c^2 \right\} \qquad (3)$$

Here, $\mathbf{S}$ is the sample covariance matrix,

$$\mathbf{S} = \frac{\sum_{i=1}^{n} (\mathbf{y}_i - \bar{\mathbf{y}})(\mathbf{y}_i - \bar{\mathbf{y}})^T}{n - 1}$$

Selecting $c = 1$ produces the "standard" data ellipse, as illustrated in Figure 1: The perpendicular "shadows" of the ellipse on the axes mark off twice the standard deviation of each variable; the regression line for $Y_2$ on $Y_1$ intersects the points of vertical tangency on the boundary of the ellipse; and the correlation between the two variables is proportional to the length of the line from the bottom of the ellipse to the point of vertical tangency at the right. Many other properties of correlation and regression can be visualized using the data ellipse (see, e.g., Monette, 1990).

These properties of the data ellipse hold regardless of the joint distribution of the variables, but if the variables are bivariate normal, then the data ellipse represents a contour of constant density in their joint distribution. In this case, $D^2_M(\mathbf{y})$ has a large-sample $\chi^2$ distribution with 2 degrees of freedom, and so, for example, taking $c^2 = \chi^2_2(0.95) = 5.99 \approx 6$ encloses approximately 95 percent of the data.
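The definitions above can be checked numerically. The following is a minimal sketch in plain Python (standard library only), not code from the car package: the five data points and all function names are hypothetical and chosen only for illustration. It computes the 2 × 2 sample covariance matrix, the squared Mahalanobis distance, and the 95 percent radius, using the fact that the $\chi^2$ distribution with 2 degrees of freedom has the closed-form quantile $-2\ln(1 - p)$.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_cov_2d(y1, y2):
    """Entries (s11, s12, s22) of the 2x2 sample covariance matrix S."""
    n = len(y1)
    m1, m2 = mean(y1), mean(y2)
    s11 = sum((a - m1) ** 2 for a in y1) / (n - 1)
    s22 = sum((b - m2) ** 2 for b in y2) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / (n - 1)
    return s11, s12, s22


def mahalanobis_sq(point, centroid, cov):
    """Squared Mahalanobis distance (y - ybar)^T S^{-1} (y - ybar)."""
    s11, s12, s22 = cov
    d1 = point[0] - centroid[0]
    d2 = point[1] - centroid[1]
    det = s11 * s22 - s12 ** 2
    # For a 2x2 matrix, S^{-1} = (1 / det) * [[s22, -s12], [-s12, s11]].
    return (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det


def chisq2_quantile(p):
    """Quantile of chi-square with 2 df, whose CDF is 1 - exp(-x / 2)."""
    return -2.0 * math.log(1.0 - p)


def in_data_ellipse(point, centroid, cov, c2):
    """True if the point lies in the data ellipse of squared size c2."""
    return mahalanobis_sq(point, centroid, cov) <= c2


# Hypothetical illustrative data, not taken from the paper.
y1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [2.0, 1.0, 4.0, 3.0, 5.0]

centroid = (mean(y1), mean(y2))
cov = sample_cov_2d(y1, y2)
s11, s12, s22 = cov

# Large-sample 95 percent radius: c^2 = chi-square_2(0.95), about 5.99.
c2_95 = chisq2_quantile(0.95)

# Point where the least-squares line for Y2 on Y1 (slope s12 / s11)
# reaches y1 = ybar1 + s1; it should lie on the standard (c = 1) ellipse.
s1 = math.sqrt(s11)
slope = s12 / s11
tangent_point = (centroid[0] + s1, centroid[1] + slope * s1)

print(round(c2_95, 2))
print(mahalanobis_sq(tangent_point, centroid, cov))
print([in_data_ellipse(p, centroid, cov, c2_95) for p in zip(y1, y2)])
```

The second printed value should be 1 up to rounding error, which illustrates the statement that the regression line for $Y_2$ on $Y_1$ meets the boundary of the standard ellipse at the point of vertical tangency.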
Alternatively, in small samples, we can take

$$c^2 = \frac{2(n - 1)}{n - 2} F_{2,\,n-2} \approx 2 F_{2,\,n-2}$$

but this typically makes little difference visually.

The generalization of the data ellipse to more than two variables is immediate: Applying Equation 3 to $\mathbf{y} = (y_1, y_2, y_3)^T$, for example, produces a data ellipsoid in three dimensions. For $m$ multivariate-normal variables, selecting $c^2 = \chi^2_m(1 - \alpha)$ encloses approximately $100(1 - \alpha)$ percent of the data. Again, for greater precision, we can use

$$c^2 = \frac{m(n - 1)}{n - m} F_{m,\,n-m} \approx m F_{m,\,n-m}$$

4 Implementation of Tests for Multivariate Linear Models in the car Package

Tests for multivariate linear models are implemented in the car package as S3 methods for the generic linear.hypothesis and Anova functions, with Manova provided as a synonym for the latter. The Anova function computes partial (so-called "Types II and III") hypothesis tests, as opposed to the anova function in the stats package, which computes sequential ("Type-I") tests; these tests coincide in one-way and balanced designs. Several examples of the use of these functions are given in this section.