and sentiment models according to experimental results. No more than 15 ads were retrieved and inserted into a pool for each triggering page. Since it is difficult to invite the original blog authors to participate in an ad Click-Through-Rate (CTR) experiment, 14 volunteers participated in our CTR experiment. All the advertisements in each pool were manually clicked by the volunteers. We published each triggering page and its ads from the corresponding pool on a testing platform. To provide a fair testing environment, we ignored the effect of ad position order and randomly placed the relevant ads on a given triggering page. After reading the content of a given post, each participant was regarded as the blog publisher and clicked the ads according to personal interests. To compare with Google AdSense, we measured only the CTR for this experiment. The measure is the fraction of retrieved ads that are clicked.
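The CTR measure described above (the fraction of retrieved ads that are clicked) can be sketched as follows. This is only an illustration under stated assumptions: the function name `click_through_rate`, the ad identifiers, and the example pool are hypothetical and do not come from the experiment.

```python
from typing import Iterable


def click_through_rate(retrieved_ads: Iterable[str], clicked_ads: Iterable[str]) -> float:
    """Return the fraction of retrieved ads that were clicked.

    Clicks on ads outside the retrieved pool are ignored, and an empty
    pool yields 0.0 rather than a division by zero.
    """
    retrieved = set(retrieved_ads)
    if not retrieved:
        return 0.0
    clicked = set(clicked_ads) & retrieved
    return len(clicked) / len(retrieved)


# Hypothetical pool of 10 ads for one triggering page (the paper caps a pool at 15).
pool = [f"ad{i}" for i in range(1, 11)]
# Hypothetical clicks made by one volunteer acting as the blog publisher.
clicks = ["ad2", "ad5", "ad7", "ad9"]

ctr = click_through_rate(pool, clicks)
print(f"CTR = {ctr:.0%}")
```

Averaging this per-page value over all triggering pages and volunteers would be one plausible way to arrive at a single percentage per method; the page does not say how the aggregation was actually done.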
To investigate the effectiveness of our proposed method on a different dataset, we further selected a positive sentiment dataset (100 documents) and an intentional dataset (50 documents) from our triggering pages. Table 7 shows the results for three page-ad matching methods across various types of datasets. In the case of the positive sentiment dataset, there are no significant differences among the three page-ad matching approaches. Our BCCA framework and PCA respectively produce 52% and 53% in terms of precision, while Google AdSense achieves about 47% accuracy. For the intention dataset, our results show that the proposed BCCA approach yields better performance (58%) than the other approaches (32% for the PCA approach and 42% for Google AdSense). Regarding all triggering pages, our results show that the proposed BCCA approach yields better performance (57%) than the other approaches (42% for the PCA approach and 40% for Google AdSense). According to Table 7, these results lead us to the conclusion that our BCCA framework can place ads that are related to the personal interest content of triggering pages.

Although the results generated by our proposed method are better than Google's, in this paper we did not emphasize this conclusion, for two reasons. One reason is that Google AdSense needs to select the recommended ads out of an ad pool that is vastly larger than the one used by us. Another plausible reason has to do with ad categories. That is, Google AdSense considers more ad categories than the categories we adopted.

To investigate the generalization of our proposed framework, the goal of our next experiment was to explore our ad assignment strategies (i.e., intention recognition, sentiment detection and term expansion) as applied to different information retrieval models.
We compared the language model (LM) with two other well-known IR algorithms: Okapi BM25 (Robertson, Walker, Jones, Hancock-Beaulieu, & Gatford, 1994) and tf*idf (Salton & Buckley, 1988). We similarly selected the top-five ranked ads provided by these IR algorithms. We used all triggering pages as our dataset. We thereby ensured that no more than 15 ads would be retrieved and inserted into a pool for each triggering page. All the advertisements in each pool were manually judged by experts. The experts mainly evaluated each page-ad pair according to two principles, correlation and intention. The correlation principle is related to whether an ad is positively related to the blogger's interests as revealed by a given blog page. The intention principle is based on whether experts have any intention to click an ad. An ad judged as gold-standard has to comply with both the correlation and intention principles. The experts were divided into two groups to judge independently for the gold standard; the average pairwise agreement measure and the Kappa coefficient value between the two teams reached 0.93 and 0.85, respectively. The average number of relevant advertisements was 8 per triggering page.

The results of our proposed page-ad matching on different IR approaches are shown in Table 8. As can be seen in Table 8, there are no significant differences in terms of accuracy among the three IR approaches. For all triggering pages, using the LM approach yields better performance (64%) than the other approaches (62% for Okapi BM25 and 60% for tf*idf). These results lead us to conclude that our ad assignment strategies actually can be employed

Table 7
Accuracy of page-ad matching.

Dataset               Method           CTR (%)
Positive dataset      BCCA             52
                      PCA              53
                      Google AdSense   47
Negative dataset      BCCA             65
                      PCA              30
                      Google AdSense   33
Intention dataset     BCCA             58
                      PCA              32
                      Google AdSense   42
All triggering pages  BCCA             57
                      PCA              42
                      Google AdSense   40

Table 6
System results for triggering pages.

Feature set            Task            Class       Precision (%)  Recall (%)  F-measure (%)
Unigram                Identification  Objective   81.1           88.9        84.8
                                       Subjective  75.8           62.8        67.7
                       Classification  Negative    80.6           46.6        59.1
                                       Positive    70.0           65.0        67.4
Chi-square             Identification  Objective   80.5           95.0        87.1
                                       Subjective  86.6           58.8        69.9
                       Classification  Negative    63.8           52.6        57.6
                                       Positive    96.0           61.3        74.8
Opinion-bearing words  Identification  Objective   73.7           88.3        80.3
                                       Subjective  67.2           43.3        52.7
                       Classification  Negative    46.1           42.0        44.0
                                       Positive    89.1           44.0        58.9
Baseline               Identification  Objective   78.5           64.5        70.8
                                       Subjective  50.1           67.7        58.2
                       Classification  Negative    56.1           53.1        54.6
                                       Positive    56.2           71.1        68.2

T.-K. Fan, C.-H. Chang / Expert Systems with Applications 38 (2011) 1777–1788 1785
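As a reading aid for the F-measure column of Table 6, the sketch below assumes the usual balanced F-measure, i.e. the harmonic mean of precision and recall. The page itself does not state the formula, so this is an assumption, and the function name `f_measure` is illustrative. Under that assumption the Unigram/Objective and Unigram/Negative rows are reproduced (84.8 and 59.1), while some other rows, such as Unigram/Subjective, come out slightly different from the printed values, so the authors may have computed or averaged the column differently.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall.

    Inputs and output share the same scale (percentages here).
    Returns 0.0 when both inputs are zero to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) copied from two rows of Table 6.
unigram_objective = f_measure(81.1, 88.9)  # Unigram, Identification, Objective
unigram_negative = f_measure(80.6, 46.6)   # Unigram, Classification, Negative

print(round(unigram_objective, 1), round(unigram_negative, 1))
```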