
APPLIED MULTIVARIATE STATISTICAL ANALYSIS
SIXTH EDITION
RICHARD A. JOHNSON
DEAN W. WICHERN

Applied Multivariate Statistical Analysis

SIXTH EDITION
Applied Multivariate Statistical Analysis
RICHARD A. JOHNSON
University of Wisconsin-Madison
DEAN W. WICHERN
Texas A&M University
Pearson Prentice Hall
Upper Saddle River, New Jersey 07458

Library of Congress Cataloging-in-Publication Data
Johnson, Richard A.
Statistical analysis/Richard A. Johnson.-6th ed.
Dean W. Wichern
p. cm.
Includes index.
ISBN 0-13-187715-1
1. Statistical Analysis
CIP Data Available

Executive Acquisitions Editor: Petra Recter
Vice President and Editorial Director, Mathematics: Christine Hoag
Project Manager: Michael Bell
Production Editor: Debbie Ryan
Senior Managing Editor: Linda Mihatov Behrens
Manufacturing Buyer: Maura Zaldivar
Associate Director of Operations: Alexis Heydt-Long
Marketing Manager: Wayne Parkins
Marketing Assistant: Jennifer de Leeuwerk
Editorial Assistant/Print Supplements Editor: Joanne Wendelken
Art Director: Jayne Conte
Director of Creative Service: Paul Belfanti
Cover Designer: Bruce Kenselaar
Art Studio: Laserswords

© 2007 Pearson Education, Inc.
Pearson Prentice Hall
Pearson Education, Inc.
Upper Saddle River, NJ 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Pearson Prentice Hall™ is a trademark of Pearson Education, Inc.

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

ISBN-13: 978-0-13-187715-3
ISBN-10: 0-13-187715-1

Pearson Education LTD., London
Pearson Education Australia PTY, Limited, Sydney
Pearson Education Singapore, Pte. Ltd
Pearson Education North Asia Ltd, Hong Kong
Pearson Education Canada, Ltd., Toronto
Pearson Educación de Mexico, S.A. de C.V.
Pearson Education-Japan, Tokyo
Pearson Education Malaysia, Pte. Ltd

To the memory of my mother and my father.
R. A. J.

To Dorothy, Michael, and Andrew.
D. W. W.

Contents

PREFACE xv

1 ASPECTS OF MULTIVARIATE ANALYSIS 1
1.1 Introduction 1
1.2 Applications of Multivariate Techniques 3
1.3 The Organization of Data 5
    Arrays, 5
    Descriptive Statistics, 6
    Graphical Techniques, 11
1.4 Data Displays and Pictorial Representations 19
    Linking Multiple Two-Dimensional Scatter Plots, 20
    Graphs of Growth Curves, 24
    Stars, 26
    Chernoff Faces, 27
1.5 Distance 30
1.6 Final Comments 37
Exercises 37
References 47

2 MATRIX ALGEBRA AND RANDOM VECTORS 49
2.1 Introduction 49
2.2 Some Basics of Matrix and Vector Algebra 49
    Vectors, 49
    Matrices, 54
2.3 Positive Definite Matrices 60
2.4 A Square-Root Matrix 65
2.5 Random Vectors and Matrices 66
2.6 Mean Vectors and Covariance Matrices 68
    Partitioning the Covariance Matrix, 73
    The Mean Vector and Covariance Matrix for Linear Combinations of Random Variables, 75
    Partitioning the Sample Mean Vector and Covariance Matrix, 77
2.7 Matrix Inequalities and Maximization 78

Supplement 2A: Vectors and Matrices: Basic Concepts 82
    Vectors, 82
    Matrices, 87
Exercises 103
References 110

3 SAMPLE GEOMETRY AND RANDOM SAMPLING 111
3.1 Introduction 111
3.2 The Geometry of the Sample 111
3.3 Random Samples and the Expected Values of the Sample Mean and Covariance Matrix 119
3.4 Generalized Variance 123
    Situations in which the Generalized Sample Variance Is Zero, 129
    Generalized Variance Determined by $|R|$ and Its Geometrical Interpretation, 134
    Another Generalization of Variance, 137
3.5 Sample Mean, Covariance, and Correlation As Matrix Operations 137
3.6 Sample Values of Linear Combinations of Variables 140
Exercises 144
References 148

4 THE MULTIVARIATE NORMAL DISTRIBUTION 149
4.1 Introduction 149
4.2 The Multivariate Normal Density and Its Properties 149
    Additional Properties of the Multivariate Normal Distribution, 156
4.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation 168
    The Multivariate Normal Likelihood, 168
    Maximum Likelihood Estimation of $\mu$ and $\Sigma$, 170
    Sufficient Statistics, 173
4.4 The Sampling Distribution of $\bar{X}$ and $S$ 173
    Properties of the Wishart Distribution, 174
4.5 Large-Sample Behavior of $\bar{X}$ and $S$ 175
4.6 Assessing the Assumption of Normality 177
    Evaluating the Normality of the Univariate Marginal Distributions, 177
    Evaluating Bivariate Normality, 182
4.7 Detecting Outliers and Cleaning Data 187
    Steps for Detecting Outliers, 189
4.8 Transformations to Near Normality 192
    Transforming Multivariate Observations, 195
Exercises 200
References 208

5 INFERENCES ABOUT A MEAN VECTOR 210
5.1 Introduction 210
5.2 The Plausibility of $\mu_0$ as a Value for a Normal Population Mean 210
5.3 Hotelling's $T^2$ and Likelihood Ratio Tests 216
    General Likelihood Ratio Method, 219
5.4 Confidence Regions and Simultaneous Comparisons of Component Means 220
    Simultaneous Confidence Statements, 223
    A Comparison of Simultaneous Confidence Intervals with One-at-a-Time Intervals, 229
    The Bonferroni Method of Multiple Comparisons, 232
5.5 Large Sample Inferences about a Population Mean Vector 234
5.6 Multivariate Quality Control Charts 239
    Charts for Monitoring a Sample of Individual Multivariate Observations for Stability, 241
    Control Regions for Future Individual Observations, 247
    Control Ellipse for Future Observations, 248
    $T^2$-Chart for Future Observations, 248
    Control Charts Based on Subsample Means, 249
    Control Regions for Future Subsample Observations, 251
5.7 Inferences about Mean Vectors when Some Observations Are Missing 251
5.8 Difficulties Due to Time Dependence in Multivariate Observations 256
Supplement 5A: Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids 258
Exercises 261
References 272

6 COMPARISONS OF SEVERAL MULTIVARIATE MEANS 273
6.1 Introduction 273
6.2 Paired Comparisons and a Repeated Measures Design 273
    Paired Comparisons, 273
    A Repeated Measures Design for Comparing Treatments, 279
6.3 Comparing Mean Vectors from Two Populations 284
    Assumptions Concerning the Structure of the Data, 284
    Further Assumptions When $n_1$ and $n_2$ Are Small, 285
    Simultaneous Confidence Intervals, 288
    The Two-Sample Situation When $\Sigma_1 \neq \Sigma_2$, 291
    An Approximation to the Distribution of $T^2$ for Normal Populations When Sample Sizes Are Not Large, 294
6.4 Comparing Several Multivariate Population Means (One-Way Manova) 296
    Assumptions about the Structure of the Data for One-Way MANOVA, 296

    A Summary of Univariate ANOVA, 297
    Multivariate Analysis of Variance (MANOVA), 301
6.5 Simultaneous Confidence Intervals for Treatment Effects 308
6.6 Testing for Equality of Covariance Matrices 310
6.7 Two-Way Multivariate Analysis of Variance 312
    Univariate Two-Way Fixed-Effects Model with Interaction, 312
    Multivariate Two-Way Fixed-Effects Model with Interaction, 315
6.8 Profile Analysis 323
6.9 Repeated Measures Designs and Growth Curves 328
6.10 Perspectives and a Strategy for Analyzing Multivariate Models 332
Exercises 337
References 358

7 MULTIVARIATE LINEAR REGRESSION MODELS 360
7.1 Introduction 360
7.2 The Classical Linear Regression Model 360
7.3 Least Squares Estimation 364
    Sum-of-Squares Decomposition, 366
    Geometry of Least Squares, 367
    Sampling Properties of Classical Least Squares Estimators, 369
7.4 Inferences About the Regression Model 370
    Inferences Concerning the Regression Parameters, 370
    Likelihood Ratio Tests for the Regression Parameters, 374
7.5 Inferences from the Estimated Regression Function 378
    Estimating the Regression Function at $z_0$, 378
    Forecasting a New Observation at $z_0$, 379
7.6 Model Checking and Other Aspects of Regression 381
    Does the Model Fit?, 381
    Leverage and Influence, 384
    Additional Problems in Linear Regression, 384
7.7 Multivariate Multiple Regression 387
    Likelihood Ratio Tests for Regression Parameters, 395
    Other Multivariate Test Statistics, 398
    Predictions from Multivariate Multiple Regressions, 399
7.8 The Concept of Linear Regression 401
    Prediction of Several Variables, 406
    Partial Correlation Coefficient, 409
7.9 Comparing the Two Formulations of the Regression Model 410
    Mean Corrected Form of the Regression Model, 410
    Relating the Formulations, 412
7.10 Multiple Regression Models with Time Dependent Errors 413
Supplement 7A: The Distribution of the Likelihood Ratio for the Multivariate Multiple Regression Model 418
Exercises 420
References 428

8 PRINCIPAL COMPONENTS 430
8.1 Introduction 430
8.2 Population Principal Components 430
    Principal Components Obtained from Standardized Variables, 436
    Principal Components for Covariance Matrices with Special Structures, 439
8.3 Summarizing Sample Variation by Principal Components 441
    The Number of Principal Components, 444
    Interpretation of the Sample Principal Components, 448
    Standardizing the Sample Principal Components, 449
8.4 Graphing the Principal Components 454
8.5 Large Sample Inferences 456
    Large Sample Properties of $\hat{\lambda}_i$ and $\hat{e}_i$, 456
    Testing for the Equal Correlation Structure, 457
8.6 Monitoring Quality with Principal Components 459
    Checking a Given Set of Measurements for Stability, 459
    Controlling Future Values, 463
Supplement 8A: The Geometry of the Sample Principal Component Approximation 466
    The p-Dimensional Geometrical Interpretation, 468
    The n-Dimensional Geometrical Interpretation, 469
Exercises 470
References 480

9 FACTOR ANALYSIS AND INFERENCE FOR STRUCTURED COVARIANCE MATRICES 481
9.1 Introduction 481
9.2 The Orthogonal Factor Model 482
9.3 Methods of Estimation 488
    The Principal Component (and Principal Factor) Method, 488
    A Modified Approach - the Principal Factor Solution, 494
    The Maximum Likelihood Method, 495
    A Large Sample Test for the Number of Common Factors, 501
9.4 Factor Rotation 504
    Oblique Rotations, 512
9.5 Factor Scores 513
    The Weighted Least Squares Method, 514
    The Regression Method, 516
9.6 Perspectives and a Strategy for Factor Analysis 519
Supplement 9A: Some Computational Details for Maximum Likelihood Estimation 527
    Recommended Computational Scheme, 528
    Maximum Likelihood Estimators of $\rho = L_z L_z' + \Psi_z$, 529
Exercises 530
References 538