A Tutorial on Principal Component Analysis

Jonathon Shlens∗
Center for Neural Science, New York University, New York City, NY 10003-6603
and Systems Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037
(Dated: April 22, 2009; Version 3.01)

Principal component analysis (PCA) is a mainstay of modern data analysis - a black box that is widely used but (sometimes) poorly understood. The goal of this paper is to dispel the magic behind this black box. This manuscript focuses on building a solid intuition for how and why principal component analysis works. It crystallizes this knowledge by deriving the mathematics behind PCA from simple intuitions.
This tutorial does not shy away from explaining the ideas informally, nor does it shy away from the mathematics. The hope is that by addressing both aspects, readers of all levels will be able to gain a better understanding of PCA as well as the when, the how and the why of applying this technique.

I. INTRODUCTION

Principal component analysis (PCA) is a standard tool in modern data analysis - in diverse fields from neuroscience to computer graphics - because it is a simple, non-parametric method for extracting relevant information from confusing data sets. With minimal effort PCA provides a roadmap for how to reduce a complex data set to a lower dimension to reveal the sometimes hidden, simplified structures that often underlie it.

The goal of this tutorial is to provide both an intuitive feel for PCA, and a thorough discussion of this topic. We will begin with a simple example and provide an intuitive explanation of the goal of PCA. We will continue by adding mathematical rigor to place it within the framework of linear algebra to provide an explicit solution. We will see how and why PCA is intimately related to the mathematical technique of singular value decomposition (SVD). This understanding will lead us to a prescription for how to apply PCA in the real world and an appreciation for the underlying assumptions. My hope is that a thorough understanding of PCA provides a foundation for approaching the fields of machine learning and dimensional reduction.

The discussion and explanations in this paper are informal in the spirit of a tutorial. The goal of this paper is to educate. Occasionally, rigorous mathematical proofs are necessary although relegated to the Appendix. Although not as vital to the tutorial, the proofs are presented for the adventurous reader who desires a more complete understanding of the math. My only assumption is that the reader has a working knowledge of linear algebra.
My goal is to provide a thorough discussion by largely building on ideas from linear algebra and avoiding challenging topics in statistics and optimization theory (but see Discussion). Please feel free to contact me with any suggestions, corrections or comments.

∗Electronic address: shlens@salk.edu

II. MOTIVATION: A TOY EXAMPLE

Here is the perspective: we are an experimenter. We are trying to understand some phenomenon by measuring various quantities (e.g. spectra, voltages, velocities, etc.) in our system. Unfortunately, we cannot figure out what is happening because the data appears clouded, unclear and even redundant. This is not a trivial problem, but rather a fundamental obstacle in empirical science. Examples abound from complex systems such as neuroscience, web indexing, meteorology and oceanography - the number of variables to measure can be unwieldy and at times even deceptive, because the underlying relationships can often be quite simple.

Take for example a simple toy problem from physics diagrammed in Figure 1. Pretend we are studying the motion of the physicist's ideal spring. This system consists of a ball of mass m attached to a massless, frictionless spring. The ball is released a small distance away from equilibrium (i.e. the spring is stretched). Because the spring is ideal, it oscillates indefinitely along the x-axis about its equilibrium at a set frequency.

This is a standard problem in physics in which the motion along the x direction is solved by an explicit function of time. In other words, the underlying dynamics can be expressed as a function of a single variable x.

However, being ignorant experimenters we do not know any of this. We do not know which, let alone how many, axes and dimensions are important to measure. Thus, we decide to measure the ball's position in a three-dimensional space (since we live in a three-dimensional world). Specifically, we place three movie cameras around our system of interest.
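As an aside, the "explicit function of time" mentioned above can be written out in a minimal worked sketch. It rests on assumptions the text does not state: the spring obeys Hooke's law with a spring constant k, the ball is released from rest at a displacement A, and x is measured from equilibrium. The symbols k, A and ω are introduced here purely for illustration.

```latex
% Assumed (not in the original text): Hooke's law with spring constant k,
% release from rest at displacement A, x measured from equilibrium.
m\,\ddot{x}(t) = -k\,x(t), \qquad x(0) = A, \quad \dot{x}(0) = 0
\quad\Longrightarrow\quad
x(t) = A\cos(\omega t), \qquad \omega = \sqrt{k/m}.
```

Under these assumptions the motion is described completely by the single coordinate x, oscillating at the fixed angular frequency ω, which corresponds to the "set frequency" of the ideal spring.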
At 120 Hz each movie camera records an image indicating a two-dimensional position of the ball (a projection). Unfortunately, because of our ignorance, we do not even know what the real x, y and z axes are, so we choose three camera positions $\vec{a}$, $\vec{b}$ and $\vec{c}$ at some arbitrary angles with respect to the system. The angles between our measurements might not even be 90°! Now, we