one type of information into the other. Experimental results in [Zhu et al., 2007] show that LCMF can outperform other state-of-the-art methods.

However, the links in different applications might capture different semantics. As a result, the model assumption adopted by LCMF might not be satisfied for some applications, which would cause LCMF to fail in such cases. Let us take a synthetic link structure from [Zhu et al., 2007] as an example to illustrate this point. This example is shown in Figure 1(a), in which there are eight nodes, corresponding to eight entities, and eight directed links. After performing link MF based on Zhu et al.'s method, the entities will be automatically grouped into five clusters, each corresponding to one ellipse in Figure 1(a). The learned latent factors of the entities [Zhu et al., 2007] are shown in Figure 1(b), in which the latent factors for V1 to V8 are listed from the first row to the last row in order. We can see that the entities in each cluster, say V2 and V3, have the same latent factor representation. From this example, it is not difficult to reveal the model assumption behind LCMF: if two entities link to or are linked by one common entity, the two entities will have similar latent factor representations. This is a very promising property if the application scenario indeed satisfies this assumption.
One such application is the web page classification task on WebKB [Craven et al., 1998]. For example, the homepages of faculty members always link to, or are linked by, the homepage of their department, but the homepages of two faculty members seldom link to each other. The advantage of this property has been verified by the promising accuracy of LCMF on the WebKB data set [Zhu et al., 2007].

Figure 1: A synthetic example of link structure from [Zhu et al., 2007] for illustration. (a) Synthetic link structure over the entities V1 to V8. (b) Result of link MF: the learned latent factors of V1 to V8, one row per entity:

    V1: -.8 -.5  .3 -.1 -.0
    V2: -.0  .4  .6 -.1 -.4
    V3: -.0  .4  .6 -.1 -.4
    V4:  .3 -.2  .3 -.4  .3
    V5:  .3 -.2  .3 -.4  .3
    V6: -.4  .5  .0 -.2  .6
    V7: -.4  .5  .0 -.2  .6
    V8: -.1  .1 -.4 -.8 -.4

However, one problem with LCMF is that the factor representations of two linked entities are not guaranteed to be similar. For example, the latent factor representation of V8 is relatively far away from that of V6 or V7 after LCMF learning. This property is undesirable for many other applications. For example, to classify research papers according to their topics, the citation (link) information is usually very important. More specifically, two papers with a citation relation between them are most likely about the same topic, and consequently, the learned latent factors of two linked papers should be similar. Unfortunately, this cannot be handled well by LCMF, which is also verified by the relatively poor performance of LCMF on the Cora data set [McCallum et al., 2000] for paper classification.

From the analysis above, we can see that there exist at least two types of links with different semantics. In this paper, we call the first type of links, such as those in the WebKB data set, Type I links, and the other type of links, such as those in the Cora data set, Type II links. As discussed above, LCMF can handle applications with Type I links well, but not those with Type II links. It is this limitation that has motivated us to pursue the research reported in this paper.

We propose in this paper a novel MF method. The basic idea is to make the latent factors of two entities as close as possible if there exists a link between them. More specifically, we utilize relation information to regularize the content MF procedure, resulting in a principled framework which seamlessly integrates content and relation information. We refer to our method as relation regularized matrix factorization, abbreviated as RRMF in the sequel for convenience. To learn the parameters of RRMF, we propose a linear-time algorithm, which makes RRMF suitable for large-scale problems. Furthermore, the learning procedure of RRMF can be proved to be convergent. Experimental results on a data set with Type II links demonstrate that RRMF can dramatically outperform LCMF and other state-of-the-art methods.

Although RRMF is motivated by the need to model Type II links, it can also be used for applications with Type I links by adding a simple step to preprocess the link structure of the data. This will be discussed in detail in the following sections.

2 Relation Regularized Matrix Factorization

2.1 Notations

We use boldface uppercase letters, such as K, to denote matrices, and boldface lowercase letters, such as z, to denote vectors. The ith row and the jth column of a matrix K are denoted as K_{i*} and K_{*j}, respectively, and K_{ij} denotes the element at the ith row and jth column of K. z_i denotes the ith element of z. X denotes the content matrix of size n × m, with n being the number of entities and m the number of features. A is the adjacency matrix of the n entities. D denotes the number of latent factors. I denotes the identity matrix, whose dimensionality depends on the context. For a matrix K, K ⪰ 0 means that K is positive semi-definite (psd) and K ≻ 0 means that K is positive definite (pd).
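To make the notation concrete, consider the following toy setup in Python with NumPy; the sizes, values, and variable names here are illustrative choices of ours rather than anything prescribed by the model:

    import numpy as np

    rng = np.random.default_rng(0)

    n, m, D = 8, 100, 5                # n entities, m features, D latent factors (toy sizes)
    X = rng.random((n, m))             # content matrix X (n x m): one row of feature values per entity
    A = np.zeros((n, n))               # adjacency matrix A (n x n): A[i, j] = 1 if entity i links to entity j
    A[1, 2] = A[3, 4] = A[5, 6] = 1.0  # a few arbitrary directed links
    U = rng.standard_normal((n, D))    # latent representations of the entities (rows U_{i*})
    V = rng.standard_normal((m, D))    # latent representations of the features (rows V_{j*})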
2.2 Model Formulation

Like LSI [Deerwester et al., 1990], we adopt an MF method to approximate the content matrix:

    \min_{\mathbf{U},\mathbf{V}} \; \frac{1}{2}\|\mathbf{X}-\mathbf{U}\mathbf{V}^T\|^2 + \frac{\alpha}{2}\left(\|\mathbf{U}\|^2+\|\mathbf{V}\|^2\right), \qquad (1)

where we use an n × D matrix U and an m × D matrix V to denote the latent D-dimensional representations of all the documents (or entities in the more general case) and words (or features in the more general case), respectively, with U_{i*} for document i and V_{j*} for word j. Here α is a hyperparameter that controls the weight of the regularization terms.
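As a concrete illustration of Eq. (1), the following minimal sketch reuses the toy matrices above to evaluate the objective and perform plain gradient-descent updates. We assume the norms in Eq. (1) are Frobenius norms; the function names are ours, and this naive optimizer is only for illustration, not the linear-time learning algorithm proposed for RRMF:

    def content_mf_objective(X, U, V, alpha):
        # Eq. (1): 0.5 * ||X - U V^T||_F^2 + (alpha / 2) * (||U||_F^2 + ||V||_F^2)
        residual = X - U @ V.T
        return 0.5 * np.sum(residual ** 2) + 0.5 * alpha * (np.sum(U ** 2) + np.sum(V ** 2))

    def gradient_step(X, U, V, alpha, lr=1e-3):
        # Gradients of Eq. (1): dJ/dU = (U V^T - X) V + alpha * U,
        #                       dJ/dV = (U V^T - X)^T U + alpha * V
        R = U @ V.T - X
        return U - lr * (R @ V + alpha * U), V - lr * (R.T @ U + alpha * V)

    alpha = 0.1
    for _ in range(200):
        U, V = gradient_step(X, U, V, alpha)
    print(content_mf_objective(X, U, V, alpha))  # objective value after 200 updates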
To integrate the relation (link) information into the MF procedure, we use the relations among entities to regularize the latent factors. The basic idea of our method is to make the latent representations of two entities as close as possible if there exists a relation between them. We can achieve this goal by