"segment 1"   "segment 2"   "matrix 1"    "matrix 2"    "line 1"     "line 2"    "power 1"   "power 2"
imag          speaker       robust        manufactur    constraint   alpha       POWER       load
SEGMENT       speech        MATRIX        cell          LINE         redshift    spectrum    memori
texture       recogni       eigenvalu     part          match        LINE        omega       vlsi
color         signal        uncertainti   MATRIX        locat        galaxi      mpc         POWER
tissue        train         plane         cellular      imag         quasar      hsup        systolic
brain         hmm           linear        famili        geometr      absorp      larg        input
slice         source        condition     design        impos        high        redshift    complex
cluster       speakerind.   perturb       machinepart   segment      ssup        galaxi      arrai
mri           SEGMENT       root          format        fundament    densiti     standard    present
volume        sound         suci          group         recogn       veloc       model       implement

Figure 3: Eight selected factors from a 128-factor decomposition. The displayed word stems are the 10 most probable words in the class-conditional distribution P(w|z), from top to bottom in descending order.

Document 1, P{z_k | d_1, w_j = 'segment'} = (0.951, 0.0001, ...), P{w_j = 'segment' | d_1} = 0.06:
SEGMENT medic imag challeng problem field imag analysi diagnost base proper SEGMENT digit imag SEGMENT medic imag need applic involv estim boundari object classif tissu abnorm shape analysi contour detec textur SEGMENT despit exist techniqu SEGMENT specif medic imag remain crucial problem [...]

Document 2, P{z_k | d_2, w_j = 'segment'} = (0.025, 0.867, ...), P{w_j = 'segment' | d_2} = 0.010:
consid signal origin sequenc sourc specif problem SEGMENT signal relat SEGMENT sourc address issu wide applic field report describ resolu method ergod hidden markov model hmm hmm state correspond signal sourc signal sourc sequenc determin decod procedur viterbi algorithm forward algorithm observ sequenc baumwelch train estim hmm paramet train materi applic multipl signal sourc identif problem experi perform unknown speaker identif [...]

Figure 4: Abstracts of two exemplary documents from the CLUSTER collection along with latent class posterior probabilities P{z | d, w = 'segment'} and word probabilities P{w = 'segment' | d}.

(iv) 'Power' is used in the context of radiating objects in astronomy, but also in electrical engineering.

Figure 4 shows the abstracts of two exemplary documents which have been pre-processed by a standard stop-word list and a stemmer. The posterior probabilities for the classes, given the different occurrences of 'segment', indicate how likely it is for each of the factors in the first pair of Figure 3 to have generated this observation. We have also displayed the estimates of the conditional word probabilities P{w = 'segment' | d_{1,2}}. One can see that the correct meaning of the word 'segment' is identified in both cases. This implies that although 'segment' occurs frequently in both documents, the overlap in the factored representation is low, since 'segment' is identified as a polysemous word (relative to the chosen resolution level) which, depending on the context, is explained by different factors.
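For concreteness, the numbers displayed in Figure 4 are instances of Bayes' rule applied to the aspect-model parameters: the factor posterior for a single word occurrence is P{z | d, w} proportional to P(z|d) P(w|z), so the same word form is attributed to different factors depending on the mixing proportions of the document in which it occurs. The following is a minimal Python sketch of this computation; the two-factor parameters are invented purely for illustration and are not the fitted 128-factor model.

    import numpy as np

    # Hypothetical two-factor aspect model over a tiny stem vocabulary.
    # These numbers are made up for illustration; they are not the fitted
    # 128-factor decomposition of the CLUSTER collection.
    vocab = ["segment", "imag", "tissue", "signal", "hmm", "sourc"]
    P_w_given_z = np.array([
        [0.30, 0.30, 0.25, 0.05, 0.05, 0.05],   # factor 1: image segmentation
        [0.25, 0.05, 0.05, 0.25, 0.20, 0.20],   # factor 2: signal segmentation
    ])
    P_z_given_d = np.array([
        [0.95, 0.05],   # document 1: medical-imaging abstract, leans on factor 1
        [0.03, 0.97],   # document 2: HMM/signal abstract, leans on factor 2
    ])

    def posterior_z(d, word):
        """P{z | d, w}: factor posterior for one word occurrence (Bayes' rule)."""
        j = vocab.index(word)
        unnorm = P_z_given_d[d] * P_w_given_z[:, j]
        return unnorm / unnorm.sum()

    def word_prob(d, word):
        """P{w | d}: convex combination of the factors' word probabilities."""
        j = vocab.index(word)
        return float(P_z_given_d[d] @ P_w_given_z[:, j])

    for d in (0, 1):
        print(f"doc {d + 1}:",
              "P(z|d,'segment') =", posterior_z(d, "segment").round(3),
              " P('segment'|d) =", round(word_prob(d, "segment"), 3))

On these toy numbers the posterior for 'segment' concentrates on the imaging factor in the first document and on the signal factor in the second, mirroring the qualitative pattern of Figure 4, even though 'segment' is probable under both factors.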
3.5 Aspects versus Clusters

It is worth comparing the aspect model with statistical clustering models (cf. also [7]). In clustering models for documents, one typically associates a latent class variable with each document in the collection. Most closely related to our approach is the distributional clustering model [10, 7], which can be thought of as an unsupervised version of a naive Bayes' classifier. It can be shown that the conditional word probability of a probabilistic clustering model is given by

    P(w|d) = \sum_{z \in Z} P{c(d) = z} P(w|z) ,   (7)

where P{c(d) = z} is the posterior probability of document d having latent class z. It is a simple implication of Bayes' rule that these posterior probabilities will concentrate their probability mass on a certain value z with an increasing number of observations (i.e., with the length of the document). This means that although (1) and (7) are algebraically equivalent, they are conceptually very different and in fact yield different results. The aspect model assumes that document-specific distributions are a convex combination of aspects, while the clustering model assumes there is just one cluster-specific distribution which is inherited by all documents in the cluster.¹ Thus in clustering models the class-conditionals P(w|z) have to capture the complete vocabulary of the documents in a cluster, whereas the aspect model can attribute different parts of a document's vocabulary to different factors.

¹ In the distributional clustering model it is only the posterior uncertainty of the cluster assignments that induces some averaging over the class-conditional word distributions P(w|z).
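The contrast between the aspect model's convex combination P(w|d) = \sum_z P(z|d) P(w|z) and the clustering model's (7) can be made concrete with a small numerical sketch (Python, with hypothetical two-class parameters chosen only for illustration): as the number of observed tokens grows, the cluster-assignment posterior P{c(d) = z} hardens onto a single class, so (7) collapses onto one class-conditional, whereas the aspect model retains whatever document-specific mixing proportions were estimated.

    import numpy as np

    # Hypothetical two-class parameters over a 3-word vocabulary (illustration only).
    P_w_given_z = np.array([
        [0.7, 0.2, 0.1],   # class-conditional word distribution, class 1
        [0.1, 0.2, 0.7],   # class-conditional word distribution, class 2
    ])
    prior_z = np.array([0.5, 0.5])

    def cluster_posterior(counts):
        """P{c(d)=z}: posterior over the single latent class of document d."""
        log_post = np.log(prior_z) + counts @ np.log(P_w_given_z).T
        log_post -= log_post.max()              # stabilise before exponentiating
        post = np.exp(log_post)
        return post / post.sum()

    def p_w_given_d_cluster(counts):
        """Eq. (7): class-conditionals averaged by the assignment posterior."""
        return cluster_posterior(counts) @ P_w_given_z

    def p_w_given_d_aspect(mix):
        """Aspect model: convex combination with document-specific weights P(z|d)."""
        return mix @ P_w_given_z

    # Same word proportions, increasing document length: the cluster posterior
    # hardens, and with it P(w|d) under the clustering model.
    for n in (4, 40, 400):
        counts = np.array([0.5, 0.1, 0.4]) * n
        print(n, cluster_posterior(counts).round(3), p_w_given_d_cluster(counts).round(3))

    # The aspect model keeps its mixed representation regardless of length.
    print("aspect:", p_w_given_d_aspect(np.array([0.5, 0.5])).round(3))

For a document whose word proportions are genuinely mixed, the clustering model is eventually forced to explain all of it with a single P(w|z); the aspect model keeps the 50/50 blend regardless of document length.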