正在加载图片...
THE S25.000.000.000 EIGENVECTOR 3 2 FIG.2.2.A web of five pages,consisting of two disconnected "subwebs"W1 (pages 1 and 2)and W2 (pages 3, 4,5). page j links to at least page k!).We will assume that a link from a page to itself will not be counted. In this "democracy of the web"you don't get to vote for yourself! Let's apply this approach to the four-page web of Figure 2.1.For page 1 we have x1=z3/1+ x4/2,since pages 3 and 4 are backlinks for page 1 and page 3 contains only one link,while page 4 contains two links (splitting its vote in half).Similarly,x2 =x1/3,3 =1/3+x2/2+x4/2,and x4=z1/3+72/2.These linear equations can be written Ax =x,where x=[r1 z2 3 r4]T and 001 0 (2.2) This transforms the web ranking problem into the "standard"problem of finding an eigenvector for a square matrix!(Recall that the eigenvalues A and eigenvectors x of a matrix A satisfy the equation Ax =Ax,x0 by definition.)We thus seek an eigenvector x with eigenvalue 1 for the matrix A.We will refer to A as the "link matrix"for the given web. It turns out that the link matrix A in equation(2.2)does indeed have eigenvectors with eigen- value 1,namely,all multiples of the vector [12 4 9 6]T (recall that any non-zero multiple of an eigenvector is again an eigenvector).Let's agree to scale these "importance score eigenvectors" so that the components sum to1.in this case we obtain1=号≈0.387,x2=≈0.129, x3-是≈0.290,andx4=7≈0.l94.Note that this ranking differs from that generated by simply counting backlinks.It might seem surprising that page 3,linked to by all other pages,is not the most important.To understand this,note that page 3 links only to page 1 and so casts its entire vote for page 1.This,with the vote of page 2,results in page 1 getting the highest importance score. More generally,the matrix A for any web must have 1 as an eigenvalue if the web in question has no dangling nodes (pages with no outgoing links).To see this,first note that for a general web of n pages formula(2.1)gives rise to a matrix A with Aij =1/nj if page j links to page i,Aij=0 otherwise.The jth column of A then contains nj non-zero entries,each equal to 1/nj,and the column thus sums to 1.This motivates the following definition,used in the study of Markov chains: DEFINITION 2.1.A square matrix is called a column-stochastic matrit if all of its entries are nonnegative and the entries in each column sum to one. The matrix A for a web with no dangling nodes is column-stochastic.We now prove PROPOSITION 1.Every column-stochastic matriz has 1 as an eigenvalue.Proof.Let A be an n x n column-stochastic matrix and let e denote an n dimensional column vector with all entries equal to 1.Recall that A and its transpose AT have the same eigenvalues.Since A is column-stochastic it is easy to see that ATe=e,so that 1 is an eigenvalue for AT and hence for A. In what follows we use V(A)to denote the eigenspace for eigenvalue 1 of a column-stochastic matrix A. 2.2.Shortcomings.Several difficulties arise with using formula(2.1)to rank websites.In this section we discuss two issues:webs with non-unique rankings and webs with dangling nodes. 2.2.1.Non-Unique Rankings.For our rankings it is desirable that the dimension of Vi(A) equal one,so that there is a unique eigenvector x with >iri=1 that we can use for importance scores.This is true in the web of Figure 2.1 and more generally is always true for the special case ofTHE $25,000,000,000 EIGENVECTOR 3 5 2 4 1 3 Fig. 2.2. A web of five pages, consisting of two disconnected “subwebs” W1 (pages 1 and 2) and W2 (pages 3, 4, 5). page j links to at least page k!). We will assume that a link from a page to itself will not be counted. In this “democracy of the web” you don’t get to vote for yourself! Let’s apply this approach to the four-page web of Figure 2.1. For page 1 we have x1 = x3/1 + x4/2, since pages 3 and 4 are backlinks for page 1 and page 3 contains only one link, while page 4 contains two links (splitting its vote in half). Similarly, x2 = x1/3, x3 = x1/3 + x2/2 + x4/2, and x4 = x1/3 + x2/2. These linear equations can be written Ax = x, where x = [x1 x2 x3 x4] T and A =     0 0 1 1 2 1 3 0 0 0 1 3 1 2 0 1 2 1 3 1 2 0 0     . (2.2) This transforms the web ranking problem into the “standard” problem of finding an eigenvector for a square matrix! (Recall that the eigenvalues λ and eigenvectors x of a matrix A satisfy the equation Ax = λx, x 6= 0 by definition.) We thus seek an eigenvector x with eigenvalue 1 for the matrix A. We will refer to A as the “link matrix” for the given web. It turns out that the link matrix A in equation (2.2) does indeed have eigenvectors with eigen￾value 1, namely, all multiples of the vector [12 4 9 6]T (recall that any non-zero multiple of an eigenvector is again an eigenvector). Let’s agree to scale these “importance score eigenvectors” so that the components sum to 1. In this case we obtain x1 = 12 31 ≈ 0.387, x2 = 4 31 ≈ 0.129, x3 = 9 31 ≈ 0.290, and x4 = 6 31 ≈ 0.194. Note that this ranking differs from that generated by simply counting backlinks. It might seem surprising that page 3, linked to by all other pages, is not the most important. To understand this, note that page 3 links only to page 1 and so casts its entire vote for page 1. This, with the vote of page 2, results in page 1 getting the highest importance score. More generally, the matrix A for any web must have 1 as an eigenvalue if the web in question has no dangling nodes (pages with no outgoing links). To see this, first note that for a general web of n pages formula (2.1) gives rise to a matrix A with Aij = 1/nj if page j links to page i, Aij = 0 otherwise. The jth column of A then contains nj non-zero entries, each equal to 1/nj , and the column thus sums to 1. This motivates the following definition, used in the study of Markov chains: Definition 2.1. A square matrix is called a column-stochastic matrix if all of its entries are nonnegative and the entries in each column sum to one. The matrix A for a web with no dangling nodes is column-stochastic. We now prove Proposition 1. Every column-stochastic matrix has 1 as an eigenvalue. Proof. Let A be an n×n column-stochastic matrix and let e denote an n dimensional column vector with all entries equal to 1. Recall that A and its transpose AT have the same eigenvalues. Since A is column-stochastic it is easy to see that AT e = e, so that 1 is an eigenvalue for AT and hence for A. In what follows we use V1(A) to denote the eigenspace for eigenvalue 1 of a column-stochastic matrix A. 2.2. Shortcomings. Several difficulties arise with using formula (2.1) to rank websites. In this section we discuss two issues: webs with non-unique rankings and webs with dangling nodes. 2.2.1. Non-Unique Rankings. For our rankings it is desirable that the dimension of V1(A) equal one, so that there is a unique eigenvector x with P i xi = 1 that we can use for importance scores. This is true in the web of Figure 2.1 and more generally is always true for the special case of
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有