Expert Systems with Applications 28 (2005) 381–393
www.elsevier.com/locate/eswa

Development of a recommender system based on navigational and behavioral patterns of customers in e-commerce sites

Yong Soo Kim, Bong-Jin Yum*, Junehwa Song, Su Myeon Kim

Korea Advanced Institute of Science and Technology, 373-1 Gusung-Dong, Yusung-Gu, Daejon 305-701, South Korea

* Corresponding author. Address: Department of Industrial Engineering, Korea Advanced Institute of Science and Technology, 373-1 Gusung-Dong, Yusung-Gu, Daejon 305-701, South Korea. Tel.: +82 42869 3116; fax: +82 42869 3110.
E-mail addresses: yskim95@kaist.ac.kr (Y.S. Kim), bjyum@kaist.ac.kr (B.-J. Yum), junesong@kaist.ac.kr (J. Song), sumyeon@kaist.ac.kr (S.M. Kim)

0957-4174/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2004.10.017

Abstract

In this article, a novel CF (collaborative filtering)-based recommender system is developed for e-commerce sites. Unlike the conventional approach in which only binary purchase data are used, the proposed approach analyzes the data captured from the navigational and behavioral patterns of customers, estimates the preference levels of a customer for the products which are clicked but not purchased, and conducts CF using these preference levels to make recommendations. This also contrasts with the existing works on clickstream data analysis, in which the navigational and behavioral patterns of customers are analyzed only for simple relationships with the target variable. The effectiveness of the proposed approach is assessed using an experimental e-commerce site. It is found, among other things, that the proposed approach outperforms the conventional approach in almost all cases considered. The proposed approach is versatile and can be applied to a variety of e-commerce sites as long as the navigational and behavioral patterns of customers can be captured.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Recommender system; Collaborative filtering; E-commerce; Preference level

1. Introduction

Personalized services for individual customers are now popular in e-commerce sites. Properly designed and well-executed personalized services enable e-commerce companies to capture the unique needs and preferences of individual customers, help them build customer loyalty, and thereby strengthen their competitiveness in the marketplace.

A recommender system is a typical software solution used in e-commerce for personalized services (Berson, Smith, & Thearing, 2000; Lawrence, Almasi, Korlyar, Viveros, & Duri, 2001; Sarwar, Karypis, Konstan, & Riedl, 2000; Yuan & Chang, 2001). It helps customers find the products they would like to purchase by providing recommendations based on their preferences, and is particularly useful in e-commerce sites that offer millions of products for sale.

There are two paradigms for recommender systems, namely, collaborative filtering (CF) and content-based filtering (CBF). CF recommends products based on the similarity of the preferences of a group of customers known as neighbors (Hill, Stead, Rosenstein, & Furnas, 1995; Resnick, Iacovou, Suchak, Bergstrom, & Riedle, 1994; Shardanand & Maes, 1995). On the other hand, CBF recommends products to a customer based on the products’ similarity to the customer’s past or historical preferences (Basu, Hirsh, & Cohen, 1998; Krulwich & Burkey, 1996; Lang, 1995). Therefore, CBF may not be suitable for recommending such products as music, art, movie, audio, photograph, video, etc., which are frequently sold in e-commerce sites, since these products may not be easily analyzed for relevant attributive information (Balabanovic & Shoham, 1997; Shardanand & Maes, 1995). For this reason, CF is adopted in the present study, which deals with recommendations in e-commerce sites.

Conventional CF is known to work well for the case where customers show their preferences for specific products in an explicit manner (e.g. rating movies). However, CF usually does not work well with binary data
(e.g. ‘purchase’ or ‘no purchase’ data), which are typical of e-commerce data (Hayes, Cunningham, & Smyth, 2001). To overcome this problem, recent studies proposed methods that relate the customers’ navigational and behavioral patterns with their preferences (Claypool, Le, Wased, & Brown, 2001; Kelly & Belkin, 2001; Lee, Podlaeck, Schonberg, & Hoch, 2001; Lee, Podlaeck, Schonberg, Hoch, & Gomory, 2000; Morita & Shinoda, 1994; Nichols, 1997; Rafter & Smyth, 2001). Instead of explicitly acquiring the customers’ ratings for specific products, these ‘implicit ratings’ methods passively monitor the navigational and behavioral patterns of customers (Nichols, 1997) and derive their preference levels (i.e. implicit ratings) by analyzing the clickstream data which represent the navigational and behavioral patterns of the customers (Claypool et al., 2001; Kelly & Belkin, 2001; Rafter & Smyth, 2001). In addition, several authors presented detailed case studies of the clickstream data analysis from various e-commerce sites (Lee et al., 2000, 2001). In their studies, customers’ shopping patterns (e.g. product impression, click-through, basket placement, and purchase) are analyzed, and the so-called micro-conversion rate for each adjacent pair of parameters is computed to assess the effectiveness of web merchandising. For a review and classification of various implicit measures of customer interests, the reader is referred to Kelly and Teevan (2003) or Oard and Kim (2001).

The existing works on implicit ratings mainly consider a simple correlation between a behavioral or a navigational parameter (e.g. length of reading time, number of visits, bookmarking variable, etc.) and the target variable (e.g. purchase/no-purchase variable). They are limited in predicting the target variable in that the observed implicit parameters are not considered simultaneously.

In this article, we extend the existing methods of implicit ratings and further develop a recommender system. The system provides a framework to analyze the inter-relationship between different behavioral and/or navigational parameters and to numerically determine customers’ preference levels from their behavioral and navigational patterns. Moreover, it can quantitatively predict the target variable from those parameters.

The proposed method consists of the following four phases. First, the data related to a customer’s purchase, navigational, and behavioral patterns are collected. Second, the customer’s preference for a certain product is numerically determined. If the product is purchased, the corresponding preference level is set to 1. If the product is clicked but not purchased, then the preference level is determined by estimating the probability of reaching the point of purchase using the data gathered in the first phase. This process is carried out using decision tree (DT) analysis, logistic regression (LR) analysis, or an artificial neural network (ANN). Third, CF is performed using the preference levels calculated in the second phase as the input values, and the preference levels of a customer for the products not clicked are predicted. Finally, a Top-N list of products is generated as a recommendation to the customer.

To illustrate and assess the effectiveness of the proposed approach, an empirical study was conducted by constructing an experimental e-commerce site for compact disc (CD) albums. It was opened to the students of Korea Advanced Institute of Science and Technology (KAIST) for a period of 50 days. Then, the relative performance (i.e. prediction accuracy) of the proposed recommender system is compared with that of the conventional system in which only the binary purchase data are used. The results from this experimental study clearly show that the proposed method using the preference data is superior to the conventional method using only the binary purchase data.

In the performance study, we use the F1 value (Sarwar et al., 2000) as the metric and come up with some additional findings: (i) the constrained Pearson correlation coefficient (CPC) as a similarity measure performs consistently better than the Pearson correlation coefficient and/or the Jaccard coefficient for both approaches; (ii) if CPC is used, then the proposed approach outperforms the conventional approach in almost all cases considered; and (iii) the proposed approach performs best when LR is used for predicting the preference levels, CPC is used as a similarity measure, and the size of recommendation is ‘small’.

The rest of this article is organized as follows. In Section 2, details of the proposed method are presented, and the results of the experimental study are described in Section 3. Finally, Section 4 presents the conclusion and future research directions.

2. Proposed recommender system

2.1. Captured data from e-commerce sites

The proposed recommender system is developed based on the customers’ navigational and behavioral patterns in e-commerce sites. Navigational patterns include browsing, searching, product click, basket placement, and actual purchase, while behavioral patterns consist of the click ratio for a certain type of product, length of reading time spent on a specific product, number of visits to a specific product, printing, and bookmarking. Although the proposed system is developed using an experimental e-commerce site as an example, it can be applied to a variety of e-commerce sites as long as the above navigational and behavioral patterns can be captured.

The product taxonomy in an e-commerce site generally has a hierarchical structure. For instance, Fig. 1 shows such a hierarchical structure for the experimental e-commerce site used in the present study. More specifically, there are seven genres at Level 1, and each genre has 3–8 different types of CDs at Level 2. Finally, each type at Level 2 has about 20–1000 different CDs.
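The click ratio named among the behavioral patterns above is tied to this two-level taxonomy. As a minimal sketch, assuming the definition given in Table 1 (the number of a customer's clicked products that fall in the same category as product k, divided by the total number of products that customer clicked), the two ratios could be computed as follows. The function name `click_ratios`, the genre and type assignments, and the type name "Symphony" are illustrative assumptions, not part of the original study.

```python
from collections import Counter


def click_ratios(clicked, genre_of, type_of):
    """Return {product: (level1_ratio, level2_ratio)} for one customer.

    clicked  : list of distinct products clicked by the customer
    genre_of : dict mapping product -> Level 1 category (genre)
    type_of  : dict mapping product -> Level 2 category (specific type)
    """
    total = len(clicked)
    genre_counts = Counter(genre_of[k] for k in clicked)
    type_counts = Counter(type_of[k] for k in clicked)
    return {
        k: (genre_counts[genre_of[k]] / total,
            type_counts[type_of[k]] / total)
        for k in clicked
    }


# Hypothetical customer who clicked three CDs (assignments are made up).
genre_of = {"A": "Rock", "B": "Rock", "C": "Classic"}
type_of = {"A": "Hard rock", "B": "Modern rock", "C": "Symphony"}
ratios = click_ratios(["A", "B", "C"], genre_of, type_of)

for product, (level1, level2) in ratios.items():
    print(product, round(level1, 2), round(level2, 2))
```

Under these assumed category assignments, two of the three clicks fall in the same genre, so the Level 1 ratio of those two products is 2/3, while every Level 2 ratio is 1/3 because each product belongs to a different specific type.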
Fig. 2 illustrates possible actions and steps that customers can take in an e-commerce site, ranging from the point of logging in to the web site to the point of actual purchase of a product. It also indicates the possible data that can be gathered from these actions.

After logging in to the web site, a customer can either browse through the site just to check whether there are interesting products or intentionally search for a specific product to purchase. When the customer clicks a product, he or she will be provided with specific information. Then, the customer can either print or bookmark the page as a reference for a future purchase or compare the details of the product with other available goods. Other important information that can be obtained from the customer’s actions within the site includes: (i) the time it takes for the customer to read about a specific product (length of reading time); (ii) the number of visits to a specific product (number of visits); and (iii) the category to which the product belongs. A product that is frequently viewed and read can be presumed to be popular. Furthermore, products in a certain category with a high click ratio can also be considered popular. For instance, if the click ratio for the Classic CDs is higher than that for the Rock CDs at Level 1 in Fig. 1, this could mean that the customer enjoys classic music more than rock.

Table 1 shows the parameters which describe the behavioral and navigational patterns of a customer in the experimental e-commerce site. Then, for each customer who visits the site and clicks at least one product, the corresponding parameter values are captured and summarized as shown in Table 2. In Table 2, a ‘case’ corresponds to a product clicked. Note that several cases may exist for a customer. Hereafter, the term ‘customer’ is used to represent a customer who visits the site and clicks at least one product.

Fig. 1. Product taxonomy of experimental CD e-commerce site.

Fig. 2. Possible actions that can be taken by customers in e-commerce sites and possible data that can be obtained from such actions.

Table 1
Data collected from the experimental e-commerce site

Parameters                            Descriptions
Click type                            Binary variable: searching = 1; browsing = 0
Number of visits                      Discrete variable
Length of reading time                Continuous variable (s)
Print status                          Binary variable: print = 1; no print = 0
Bookmarking status                    Binary variable: bookmarking = 1; no bookmarking = 0
Level 1 click ratio (genre)           Continuous variable defined for each product k clicked by customer i. Let j be the category (at Level 1) to which product k belongs. Then, Level 1 click ratio for product k = (Total number of products clicked by customer i that belong to category j at Level 1)/(Total number of products clicked by customer i)
Level 2 click ratio (specific type)   Continuous variable defined for each product k clicked by customer i. Let j be the category (at Level 2) to which product k belongs. Then, Level 2 click ratio for product k = (Total number of products clicked by customer i that belong to category j at Level 2)/(Total number of products clicked by customer i)
Basket placement status               Binary variable: basket placement = 1; no basket placement = 0
Purchase status                       Binary variable: purchase = 1; no purchase = 0

2.2. Proposed methodology

The proposed methodology consists of the following four phases:

Phase I     All the data related to the purchase, navigational, and behavioral patterns are gathered as shown in Tables 1 and 2. Descriptive statistics are also calculated and analyzed.
Phase II    For each customer, the preference level of a product which is clicked but not purchased is estimated (the preference level of a purchased product is set to 1).
Phase III   CF is performed using the preference levels in Phase II as input values, and the preference levels of a customer for the products not clicked are predicted.
Phase IV After making a Top-N list, recommendations are made to each customer. In Phase II, the preference level of a product which is clicked but not purchased is estimated according to the following three steps. (1) Estimation of the probability of purchase after basket placement (p): p Z Total number of cases in which product is purchased Total number of cases in which product is placed in basket (2) Estimation of the probability of basket placement for a product which is clicked but not placed in the basket (b):In the case where a clicked product is not placed in the basket, the probability that the product would be purchased is difficult to estimate by simply using the parameters in Table 1. In this case, the probability that the product would be placed in the basket after being clicked (b) is first estimated. This is done using DT analysis, ANN, or LR analysis. In these analyses, basket placement status is considered as the target variable, while all the other variables, excluding the purchase status, are regarded as input variables. In DT analysis, the probabilities of reaching basket placement are estimated by following the paths of the constructed tree. In ANN or LR analysis, the probabilities of reaching basket placement are determined as the predicted values. (3) Determination of the preference level of a product which is clicked but not purchased for each customer:The preference level of a product which is placed in the basket but not purchased is set to p. On the other hand, the preference level of the product which is clicked but not placed in the basket is set to (b!p). In Phase III, CF is conducted using the preference levels determined in Phase II as input values. In a conventional recommender system, only the purchase status is used for CF. In other words, only 0’s (no purchase) and 1’s (purchase) are used as input data (refer to Fig. 3(a)). 
In the proposed approach, however, the probability of reaching the point of purchase is estimated for each product clicked by a customer. Therefore, values between 0 and 1 are used as input data for the proposed CF (refer to Fig. 3(b)). In Fig. 3, blank cells indicate that the corresponding products are not clicked.

Fig. 3. 'Customer–product preference level matrix' for CF: conventional vs. proposed recommender systems. (a) Conventional recommender system; (b) proposed recommender system.

3. Experimental evaluation

3.1. Data sets

The experimental e-commerce site was opened to the students of KAIST for a period of about 50 days. Among the 2465 albums that were actually clicked by the customers (i.e. among the 2465 cases observed), 338 albums were purchased. An example data set is shown in Table 2.

Table 2
Structure of collected data (example)

| Case | Customer | CD | Click type | Length of reading time | No. of visits | Level 1 ratio | Level 2 ratio | Basket placement | Purchase |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | A | 1 | 49 | 2 | 0.67 | 0.33 | 1 | 1 |
| 2 | 1 | B | 1 | 15 | 1 | 0.67 | 0.33 | 1 | 0 |
| 3 | 1 | C | 0 | 4 | 1 | 0.33 | 0.33 | 0 | 0 |
| 4 | 2 | A | 0 | 6 | 1 | 0.75 | 0.50 | 0 | 0 |
| 5 | 2 | C | 0 | 8 | 1 | 0.75 | 0.50 | 0 | 0 |
| 6 | 2 | D | 1 | 12 | 1 | 0.25 | 0.25 | 1 | 1 |
| 7 | 2 | E | 0 | 6 | 1 | 0.25 | 0.25 | 0 | 0 |
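The three-step rule of Phase II can be illustrated on the Table 2 example. The following is a minimal sketch, not the authors' implementation: the names `Case`, `purchase_given_basket`, `preference_level`, and `preference_matrix` are hypothetical, and the constant 0.25 is only a placeholder for the basket-placement probability b that would come from the DT, ANN, or LR model.

```python
from dataclasses import dataclass

@dataclass
class Case:
    customer: int
    cd: str
    basket: int    # 1 if the product was placed in the basket
    purchase: int  # 1 if the product was purchased

# Customer, CD, basket placement, and purchase columns of the Table 2 example.
CASES = [
    Case(1, "A", 1, 1), Case(1, "B", 1, 0), Case(1, "C", 0, 0),
    Case(2, "A", 0, 0), Case(2, "C", 0, 0), Case(2, "D", 1, 1),
    Case(2, "E", 0, 0),
]

def purchase_given_basket(cases):
    """Step 1: p = (# cases purchased) / (# cases placed in the basket)."""
    placed = sum(c.basket for c in cases)
    bought = sum(c.purchase for c in cases)
    return bought / placed

def preference_level(case, p, b):
    """Step 3: 1 if purchased, p if placed in the basket only, b * p if only clicked."""
    if case.purchase:
        return 1.0
    if case.basket:
        return p
    return b * p

def preference_matrix(cases, p, basket_prob):
    """Sparse customer-product matrix; products never clicked stay absent (blank)."""
    return {(c.customer, c.cd): preference_level(c, p, basket_prob(c)) for c in cases}

p = purchase_given_basket(CASES)  # 2 purchases out of 3 basket placements here
# Placeholder for Step 2: a constant b instead of a fitted DT/ANN/LR prediction.
matrix = preference_matrix(CASES, p, lambda case: 0.25)
```

On the seven example rows p is 2/3, whereas the full data set gives the value reported in Section 3.2. Keeping unclicked products out of the dictionary mirrors the blank cells of Fig. 3.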
Since there were very few cases of printing or bookmarking, these parameters were excluded in the subsequent analyses.

3.2. Descriptive statistics: Phase I

The influence of the navigational and behavioral patterns of customers on the product purchase is first analyzed. That is, the relationship between the purchase status and each of the other parameters is evaluated as shown in Tables 3–8.

The probability of purchase after basket placement (i.e. p) is calculated as 0.82 (= 338/412) (see Table 3). This probability is relatively high, which confirms the results of the previous studies (Lee et al., 2000, 2001). As described in Phase II of the proposed approach, the preference level of a product which is placed in the basket but not purchased is set to 0.82.

Table 3
Basket placement vs. purchase status

| | Basket placement | No basket placement | Total |
|---|---|---|---|
| Purchase | 338 | 0 | 338 |
| No purchase | 74 | 2053 | 2127 |
| Total | 412 | 2053 | 2465 |

Table 4 shows the relationship between the product click type and product purchase status. When a product is clicked after being searched by a customer, the probability of its being purchased is estimated as 0.316 (= 218/689). However, when a product is clicked after browsing through the site, it is only 0.068 (= 120/1776). Based on these results, we may conclude that the products clicked after searching have higher preference levels than the ones clicked after browsing.

Table 4
Click type vs. purchase status

| | Product clicked through searching | Product clicked through browsing | Total |
|---|---|---|---|
| Purchase | 218 | 120 | 338 |
| No purchase | 471 | 1656 | 2127 |
| Total | 689 | 1776 | 2465 |

Table 5 presents the relationship between the number of visits and product purchase status. The probability of purchasing a product after the first click is 0.076 (= 136/1800). After the second click, it becomes 0.295 (= 113/383). For the case where the web page for a certain CD is clicked more than twice, the probability of purchase turns out to be 0.316 (= 89/282). These results also confirm our intuition that the more frequently a product is visited, the higher the probability of its being purchased.

Table 5
Number of visits vs. purchase status

| | 1 visit | 2 visits | 3 or more visits | Total |
|---|---|---|---|---|
| Purchase | 136 | 113 | 89 | 338 |
| No purchase | 1664 | 270 | 193 | 2127 |
| Total | 1800 | 383 | 282 | 2465 |

Table 6 compares the average reading times of the purchased and not purchased products. If a product is visited more than once, the reading times for all visits are summed up for the product, and therefore, the total length of reading time for a product increases as more visits are made. This was done to verify the hypothesis that a customer would take his or her time to carefully read the detailed description of a product before purchasing. A t-test with unequal variances is performed since the hypothesis of equal variances in samples 'purchase' and 'no purchase' is rejected at the 5% significance level. The result of the t-test shows that the difference between the average reading times of the purchased and not purchased products is statistically significant at the 5% significance level (see the value of the least significance probability, Pr > |t|), from which we may infer that a longer reading time may indicate a higher probability of purchase.

Table 6
Length of reading time: results of t-test (significance level = 0.05)

| | N | Mean | Std dev. | Std err. | Pr > \|t\| |
|---|---|---|---|---|---|
| Purchase | 338 | 61.35 | 144.59 | 7.86 | <0.0001 |
| No purchase | 2127 | 27.31 | 67.54 | 1.46 | |

Table 7 shows the hypothesis test results on the difference between the average Level 1 (genre) click ratios for the purchased and not purchased CDs. Similarly, Table 8 shows the hypothesis test results for the Level 2 (specific type) click ratios. In the case of Level 1 click ratios, a t-test with equal variances is used since the hypothesis of equal variances is not rejected at the 5% significance level. However, in the case of Level 2 click ratios, a t-test with unequal variances is used since the hypothesis of equal variances is rejected at the 5% significance level.

Table 7
Level 1 (genre) click ratio: result of t-test (significance level = 0.05)

| | N | Mean | Std dev. | Std err. | Pr > \|t\| |
|---|---|---|---|---|---|
| Purchase | 338 | 0.5750 | 0.3011 | 0.0164 | 0.0303 |
| No purchase | 2127 | 0.5378 | 0.2917 | 0.0063 | |

Table 8
Level 2 (specific type) click ratio: result of t-test (significance level = 0.05)

| | N | Mean | Std dev. | Std err. | Pr > \|t\| |
|---|---|---|---|---|---|
| Purchase | 338 | 0.3666 | 0.2953 | 0.0161 | <0.0001 |
| No purchase | 2127 | 0.2790 | 0.2421 | 0.0052 | |

The results in Tables 7 and 8 indicate that the means of the Level 1 and Level 2 click ratios for the purchased and not purchased products are statistically different at the 5% significance level. It is also noticed from the least
significance probabilities that the mean difference in the Level 2 click ratios is statistically more significant than that in the Level 1 click ratios. These test results suggest that customers tend to click those products that belong to their favorite genres (Level 1) and specific types (Level 2) more often and make purchases among them.

3.3. Determination of preference levels: Phase II

In Phase II, the preference levels of products that are not purchased even after being clicked are calculated based on all cases defined in Section 2.1 (also see Table 2). As shown in Fig. 3(b), a 'Customer–Product Preference Level Matrix' is constructed in order to conduct CF. In the matrix, the preference levels of the purchased products are set to 1, and the preference levels of the products that are not purchased even after basket placement are set to 0.82 (see Section 3.2). To determine the preference levels of the products which are clicked at least once but not placed in the basket, the probability of reaching basket placement is first predicted using DT, ANN, or LR.

3.3.1. Decision tree analysis

For the intended DT analysis, the CART procedure in SPSS Answer Tree (Software/SPSS AnswerTree) is used. The basket placement status is considered as the target variable and the click type, number of visits, length of reading time, Level 1 click ratio, and Level 2 click ratio as input variables. In the CART procedure, the maximum allowable depth is set to 5, and the pruning rule is set to 'minimum risk'. The resulting decision tree is shown in Fig. 4.

Fig. 4. Constructed decision tree: probability of reaching basket placement (1 = basket placement, 0 = no basket placement).

From Fig. 4, the probabilities of basket placement through different paths can be calculated. For instance, if a product is clicked through searching, the number of visits to the product is 2 or more, and the Level 2 click ratio is greater than or equal to 0.0467, then the probability that the particular product would be placed in the basket is 0.653. This figure is then multiplied by p (i.e. the probability of purchase given basket placement) to obtain 0.535 (= 0.653 × 0.82), which is regarded as the preference level of the product under the above-mentioned input variable pattern. The preference levels of the products for the other seven paths in Fig. 4 can be determined in a similar manner.

A closer look at Fig. 4 reveals that the click type is one of the most significant variables that affect the chance of basket placement. As shown in Table 4, 'searching' has a higher probability of purchase (or equivalently, basket placement) than browsing. Furthermore, such variables as the number of visits, length of reading time, and Level 2 click ratio are also important classifiers for basket placement.

3.3.2. ANN analysis

The ANN employed in the present investigation is a multilayer feedforward network trained by the backpropagation algorithm. The number of hidden layers is set to either 1 or 2, and, for each hidden layer, the number of hidden neurons is varied from 3 to 12 to identify the best ANN structure. The learning rate and momentum are set to 0.1 and 0.9, respectively. It is known that a low learning rate ensures a continuous descent on the error surface and a high momentum is able to speed up the training process (Sarle, 1994; Yeh, Hamey, & Westcott, 1998), and the above values are typically used in ANN training (Ting, Yunus, & Salleh, 2002).

As in the DT analysis, the basket placement status is considered as the target variable and the click type, number of visits, length of reading time, Level 1 click ratio, and Level 2 click ratio as the five input variables. The whole data set is randomly divided into two sets: the training set consists of 70% of the data, while the rest is assigned to the test set. In addition, training of an ANN is stopped using early stopping, which is also called the optimal stopping rule
(Sarle, 1995). Under this rule, the error in the test set is also computed at the same time as the ANN is being trained, and the training is stopped once the error in the test set increases.

The results of a series of computational experiments using 'SAS Enterprise Miner' (Software/SAS Enterprise Miner) are shown in Tables 9 and 10. The misclassification error rate attains its minimum when a two-hidden-layer ANN is used with the numbers of neurons in the first and second hidden layers being 8 and 6, respectively. This structure is adopted in the present study to predict the probability of basket placement for a product under the navigational and behavioral patterns described by the input variables. All products that are clicked but not placed in the basket have predicted values ranging from 0 to 1. These predicted values are regarded as the probabilities of basket placement. The preference level of a product is then determined as the predicted value multiplied by p (i.e. the probability of purchase given basket placement).

3.3.3. Logistic regression analysis

LR analysis is performed using 'SAS Enterprise Miner'. The basket placement status is considered as a binary response variable while the click type, number of visits, length of reading time, Level 1 click ratio, and Level 2 click ratio are regarded as the predictors. In addition, the stepwise procedure is used for variable selection, and the selected variables are click type, number of visits, and Level 2 click ratio (see Table 11).

Predicted values of the response variable are regarded as the probabilities of basket placement for all products which are clicked but not placed in the basket. The preference level of a product is then determined as the predicted value multiplied by p (i.e. the probability of purchase given basket placement).

3.4. CF and results of recommendation: Phases III and IV

3.4.1. Procedures for performance evaluation

The following procedure is used for the performance evaluation of the proposed as well as the conventional methods.

(1) In the Customer–Product Preference Level Matrix, 5 or 10% of the cells with a value of 1 are randomly selected and regarded as blank cells (refer to Fig. 3(a) and (b)).

(2) The preference levels for these blank cells are estimated by the proposed and conventional methods using CF.

(3) A Top-N list is generated for each customer who has blank cells in Step 1. N is varied from 5 to 30 in increments of 5 in this study.

(4) The hidden products (i.e. the cells considered as blanks in Step 1) for each customer are checked to see if they are included in the list.

In Step 3, the Top-N list includes the N products with the highest preference levels. Then, the performance of the proposed or conventional recommender system is evaluated by determining how effective the Top-N list is in finding the hidden products. A similar evaluation procedure was used in the previous studies (e.g. see Sarwar et al., 2000).

3.4.2. Collaborative filtering

For the proposed as well as the conventional approach, CF is performed using the Customer–Product Preference Level Matrix in which 5 or 10% of actual purchase records are intentionally hidden. Note, however, that the Customer–Product Preference Level Matrix for the conventional approach consists of binary purchase data, while that for the proposed approach consists of preference levels.
Table 9
Misclassification error rates for single-hidden-layer ANNs

| Number of hidden neurons | Misclassification error rate |
|---|---|
| 3 | 0.123 |
| 4 | 0.118 |
| 5 | 0.124 |
| 6 | 0.116 |
| 7 | 0.143 |
| 8 | 0.126 |
| 9 | 0.116 |
| 10 | 0.112 |
| 11 | 0.120 |
| 12 | 0.128 |

Table 10
Misclassification error rates for two-hidden-layer ANNs (rows: number of neurons in the second hidden layer; columns: number of neurons in the first hidden layer)

| Second hidden layer | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 0.123 | 0.115 | 0.120 | 0.129 | 0.124 | 0.134 | 0.118 | 0.116 | 0.120 | 0.126 |
| 4 | 0.138 | 0.135 | 0.116 | 0.112 | 0.120 | 0.108 | 0.118 | 0.116 | 0.118 | 0.124 |
| 5 | 0.112 | 0.127 | 0.115 | 0.124 | 0.119 | 0.118 | 0.142 | 0.127 | 0.118 | 0.115 |
| 6 | 0.122 | 0.131 | 0.124 | 0.126 | 0.116 | 0.107 | 0.120 | 0.130 | 0.128 | 0.116 |
| 7 | 0.122 | 0.140 | 0.124 | 0.112 | 0.112 | 0.120 | 0.124 | 0.127 | 0.132 | 0.119 |
| 8 | 0.115 | 0.128 | 0.130 | 0.116 | 0.118 | 0.126 | 0.130 | 0.130 | 0.122 | 0.118 |
| 9 | 0.114 | 0.109 | 0.124 | 0.122 | 0.120 | 0.123 | 0.140 | 0.119 | 0.124 | 0.116 |
| 10 | 0.112 | 0.118 | 0.118 | 0.114 | 0.115 | 0.124 | 0.127 | 0.123 | 0.109 | 0.120 |
| 11 | 0.126 | 0.131 | 0.114 | 0.120 | 0.118 | 0.134 | 0.120 | 0.124 | 0.112 | 0.128 |
| 12 | 0.120 | 0.135 | 0.130 | 0.115 | 0.127 | 0.128 | 0.132 | 0.123 | 0.111 | 0.118 |
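The four-step evaluation procedure of Section 3.4.1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (`hide_purchases`, `top_n`, `hit_ratio`) are hypothetical, the matrix is assumed to be a dictionary keyed by (customer, product), and the hit ratio used here (share of hidden products recovered by the Top-N lists) is one plausible way to score Step 4, not necessarily the measure reported by the authors.

```python
import random

def hide_purchases(matrix, fraction, seed=0):
    """Step 1: treat a random fraction of the cells with value 1 as blank."""
    rng = random.Random(seed)
    purchased = sorted(cell for cell, level in matrix.items() if level == 1.0)
    n_hidden = max(1, round(fraction * len(purchased)))
    hidden = set(rng.sample(purchased, n_hidden))
    visible = {cell: level for cell, level in matrix.items() if cell not in hidden}
    return visible, hidden

def top_n(predicted, customer, n):
    """Step 3: the n products with the highest predicted preference levels."""
    scored = [(level, product)
              for (cust, product), level in predicted.items() if cust == customer]
    scored.sort(key=lambda t: (-t[0], t[1]))  # highest level first, ties by name
    return [product for _, product in scored[:n]]

def hit_ratio(predicted, hidden, n):
    """Step 4: share of hidden products found in their customer's Top-N list."""
    hits = sum(1 for cust, product in hidden if product in top_n(predicted, cust, n))
    return hits / len(hidden)
```

Step 2 (estimating the blank cells by CF) is deliberately left out: `predicted` stands for whatever the proposed or conventional CF method returns for the products a customer has not clicked. Running `hit_ratio` for N = 5, 10, ..., 30 and for hidden fractions 0.05 and 0.10 would reproduce the structure of the experiment described above.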
CF involves the formation of a neighborhood and making predictions. The main goal of neighborhood formation is to find similar customers for each customer. The proximity measures frequently used to determine the neighborhood of a customer are as follows.

(1) Pearson correlation coefficient (PC) (Resnick et al., 1994)

$$PC_{ab} = \frac{\sum_j (r_{aj} - \bar{r}_a)(r_{bj} - \bar{r}_b)}{\sqrt{\sum_j (r_{aj} - \bar{r}_a)^2 \sum_j (r_{bj} - \bar{r}_b)^2}}$$

$PC_{ab}$ measures the similarity of two customers a and b, and is based on their preferences for the commonly clicked products. $r_{aj}$ and $r_{bj}$, respectively, represent the preference levels of customers a and b for the commonly clicked product j. In addition, $\bar{r}_a$ and $\bar{r}_b$, respectively, denote the average values of customer a's and b's preference levels for all commonly clicked products.

(2) Constrained Pearson correlation coefficient (CPC) (Shardanand & Maes, 1995)

$$CPC_{ab} = \frac{\sum_j (r_{aj} - v)(r_{bj} - v)}{\sqrt{\sum_j (r_{aj} - v)^2 \sum_j (r_{bj} - v)^2}}$$

CPC is similar to PC except that CPC is based on v, which is the midpoint of the scale (Shardanand & Maes, 1995). That is, PC only measures the linear tendency, while CPC measures not only the linear tendency, but also the location of the preference levels of two customers with respect to a reference value v. In the present study, the preference level ranges from 0 to 1, and v becomes 0.5 for CPC. However, the preference levels predicted using DT, LR, or ANN for the experimental data are not well scattered around 0.5, and therefore, the sample median or mean of the preference levels for all customers and products is instead used for v.
(3) Cosine vector

$$\cos(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\| \times \|\vec{b}\|}$$

In the case of the cosine vector, the proximity between two customer preference level vectors $\vec{a}$ and $\vec{b}$ is measured by computing the cosine of the angle between the two vectors. Previous studies (e.g. Breese, Heckerman, & Kadie, 1998) have empirically shown that PC is superior to the cosine vector. Therefore, the latter is not considered in the present investigation.

(4) Jaccard coefficient for binary data (JC) (Hand, Mannila, & Smyth, 2001)

$$JC_{ab} = \frac{n_{11}}{n_{11} + n_{10} + n_{01}}$$

JC can be used to measure the similarity of two customers a and b when their preferences are represented by a binary-valued variable. $n_{11}$, $n_{10}$ and $n_{01}$, respectively, denote the 'total number of products which are commonly purchased by a and b', 'total number of commonly clicked products which are purchased by a but not b', and 'total number of commonly clicked products which are purchased by b but not a'.

To measure the proximity in conducting CF, the following four measures are considered for comparison in this article: (i) Pearson correlation (PC); (ii) constrained Pearson correlation with median as the reference value (CPC_m); (iii) constrained Pearson correlation with average as the reference value (CPC_a); and (iv) Jaccard coefficient (JC). JC is used only for binary purchase data in the present study. For CPC, the sample median (m) and mean (a) of the preference levels for all customers and products are used (see Table 12).

After computing proximity measures for all pairs of customers, a neighborhood of size k is formed for a particular customer by selecting the k nearest customers based on the values of a proximity measure. To assess the effect of the size of the neighborhood on the prediction accuracy, k is set to 3, 5, and 10 in the present study. After the neighborhood formation process, the preference levels of each customer for the products not clicked are predicted.
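The Jaccard coefficient and the selection of the k nearest customers described above can be sketched as follows. The set-based data layout, the function names, the tie-breaking rule (by customer id), and the 0.0 returned for an empty denominator are assumptions made for illustration; the paper does not specify them. The sketch also assumes that every purchased product was clicked.

```python
def jaccard(clicked_a, purchased_a, clicked_b, purchased_b):
    """JC_ab for binary purchase data; all arguments are sets of product ids."""
    common_clicked = clicked_a & clicked_b
    n11 = len(purchased_a & purchased_b)                  # purchased by both
    n10 = len((common_clicked & purchased_a) - purchased_b)  # by a but not b
    n01 = len((common_clicked & purchased_b) - purchased_a)  # by b but not a
    total = n11 + n10 + n01
    # Assumption: similarity is 0.0 when no purchases overlap the common clicks.
    return n11 / total if total else 0.0


def k_nearest(similarities, k):
    """Return the k customers with the largest similarity to a target customer.

    similarities maps customer id -> proximity value (PC, CPC, or JC).
    Ties are broken by customer id, which is an arbitrary choice made here.
    """
    ranked = sorted(similarities.items(), key=lambda item: (-item[1], item[0]))
    return [customer for customer, _ in ranked[:k]]


# Hypothetical data: products 2, 3, 4 are clicked by both customers.
jc_value = jaccard({1, 2, 3, 4}, {2, 3}, {2, 3, 4, 5}, {3, 4})  # n11=1, n10=1, n01=1
neighbors = k_nearest({"b": 0.9, "c": 0.2, "d": 0.9, "e": -0.1}, k=2)
```

In this hypothetical example the coefficient is 1/3, and the neighborhood of size 2 consists of customers "b" and "d".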
Table 11
Results of logistic regression with stepwise procedure

| Parameter | Estimate | Standard Error | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|
| Intercept | -2.6602 | 0.1280 | 431.7432 | <.0001 |
| Click Type | -0.9840 | 0.0623 | 249.4712 | <.0001 |
| Number of Visits | 0.4339 | 0.0453 | 91.5328 | <.0001 |
| Level 2 Ratio | 1.6351 | 0.2288 | 51.0577 | <.0001 |

Significance level: entering = 0.05; staying = 0.05.

Table 12
Reference values for CPC

| Reference value | Binary purchase data (conventional) | Preference data (proposed): Decision tree | Preference data (proposed): ANN | Preference data (proposed): Logistic regression |
|---|---|---|---|---|
| M (median) | 0.0000 | 0.0780 | 0.0614 | 0.0874 |
| A (average) | 0.1371 | 0.1671 | 0.2514 | 0.1671 |

Let j be a product not clicked by customer a, and define $N_{aj}$ as the set of customers who are in the neighborhood of customer a and have clicked product j. As mentioned earlier, for a customer in $N_{aj}$, the preference level for product j is either 1 if purchased or is determined using DT, LR, or ANN otherwise. When PC is used in the neighborhood formation process, customer a's preference level for product j is predicted as

$$P_{aj} = \bar{r}_a + \frac{\sum_{i \in N_{aj}} PC_{ai}\,(r_{ij} - \bar{r}_i)}{\sum_{i \in N_{aj}} PC_{ai}},$$
where $\bar{r}_a$ and $\bar{r}_i$ denote the average values of customer a's and i's preference levels, respectively, for all clicked products, and $r_{ij}$ is customer i's preference level for product j (Resnick et al., 1994). When CPC is used, on the other hand, the weighted average of the preference levels for all customers in $N_{aj}$ is computed for $P_{aj}$ as follows (Shardanand & Maes, 1995):

$$P_{aj} = \frac{\sum_{i \in N_{aj}} CPC_{ai}\, r_{ij}}{\sum_{i \in N_{aj}} CPC_{ai}}.$$

Finally, when JC is used, the preference level is predicted as

$$P_{aj} = \frac{\sum_{i \in N_{aj}} JC_{ai}\, r_{ij}}{\sum_{i \in N_{aj}} JC_{ai}}.$$

Once the $P_{aj}$'s are predicted, the products corresponding to the N highest $P_{aj}$'s are recommended to customer a. In the present study, N is varied from 5 to 30 in increments of 5 to assess its effect on the prediction accuracy.

3.4.3. Evaluation metrics

To evaluate the performance of a recommender system, the so-called 'recall' and 'precision' are frequently used in the field of information retrieval (Sarwar et al., 2000). They are respectively defined as

$$\mathrm{recall} = \frac{\sum_{i \in A} |H_i \cap \mathrm{Top\_N}_i|}{\sum_{i \in A} |H_i|}, \qquad \mathrm{precision} = \frac{\sum_{i \in A} |H_i \cap \mathrm{Top\_N}_i|}{N \cdot |A|}$$

where
$H_i$: hidden products of customer i;
$N$: total number of recommended products for each customer;
$\mathrm{Top\_N}_i$: Top-N list for customer i;
$A$: set of customers who have one or more hidden products.

These two measures, however, tend to be inversely related. For instance, as N increases, recall also increases but precision generally decreases. Therefore, a combined measure F1 is defined as the harmonic mean of recall and precision as follows (Sarwar et al., 2000):

$$F1 = \frac{2 \times \mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}}.$$

A higher value of F1 indicates a better performance of a recommender system. In the present study, the performance of the proposed or conventional method is evaluated based on F1.

3.4.4. Performance of proposed and conventional recommender systems

Computational results (i.e. F1 values) are summarized in Table 13 with respect to the approach taken (conventional vs. proposed), preference prediction method for the proposed approach, similarity measure, number of recommended products (N), percentage of actually purchased products hidden, and neighborhood size (k). From Table 13, we observe the following.

Similarity measure CPC_a or CPC_m performs consistently better than PC and/or JC for both approaches. In addition, if we consider CPC_a or CPC_m only, then the proposed approach outperforms the conventional approach in almost all cases considered. It is also observed from Table 13 that F1 values are insensitive to the neighborhood size (k) for the k values covered in the experiment.

To assess the effects of various parameters on F1 in a more succinct manner, the analysis of variance (ANOVA) technique is applied to the experimental data for each approach. The experimental setting for each approach may be regarded as a full factorial design (Montgomery, 2000). For instance, factors for the proposed approach include: preference prediction method (denoted by P with three levels of DT, ANN, and LR), similarity measure (denoted by S with three levels of PC, CPC_m and CPC_a), N (with six levels of 5, 10, ..., 30), percentage of actually purchased products hidden (denoted by H with two levels of 5 and 10%) and neighborhood size k (with three levels of 3, 5, and 10). As mentioned earlier, F1 values are insensitive to k, and therefore, k is not considered as a factor and is fixed at 3 in the subsequent analyses.

Table 14 shows the ANOVA table for the conventional approach. The three-way interaction effect (i.e. N×H×S) is assumed to be negligible. Note that the p-values (or the least significance probabilities) for main effects N, H and S as well as for interaction effect H×S are 'small', and therefore, those effects are considered statistically significant (Montgomery, 2000). This is also illustrated in Figs. 5 and 6, which, respectively, show the three main effects and interaction effect H×S.
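The prediction formulas and the recall, precision and F1 definitions given earlier can be tied together in a short Python sketch. It is a minimal illustration under stated assumptions rather than the procedure actually used in the experiment: all function names and data structures are hypothetical, ties in the Top-N list are broken by product id, and degenerate cases (zero denominators) fall back to default values chosen here.

```python
def predict_pc(mean_a, neighbors):
    """PC-based prediction: neighbors is a list of (PC_ai, r_ij, mean_i) for i in N_aj."""
    den = sum(weight for weight, _, _ in neighbors)
    if den == 0:
        return mean_a  # assumption: fall back to the customer's own average
    return mean_a + sum(w * (r - m) for w, r, m in neighbors) / den


def predict_weighted(neighbors):
    """CPC- or JC-based prediction: neighbors is a list of (weight_ai, r_ij)."""
    den = sum(weight for weight, _ in neighbors)
    return sum(w * r for w, r in neighbors) / den if den else 0.0


def top_n(predictions, n):
    """Products with the n highest predicted preference levels P_aj."""
    ranked = sorted(predictions.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:n]]


def evaluate(hidden, top_lists, n):
    """Recall, precision and F1 over the set A of customers with hidden products.

    hidden maps customer -> set of hidden products (H_i);
    top_lists maps customer -> Top-N list (Top_N_i).
    """
    customers = [c for c, products in hidden.items() if products]  # the set A
    hits = sum(len(hidden[c] & set(top_lists.get(c, []))) for c in customers)
    total_hidden = sum(len(hidden[c]) for c in customers)
    recall = hits / total_hidden if total_hidden else 0.0
    precision = hits / (n * len(customers)) if customers else 0.0
    denom = recall + precision
    f1 = 2 * recall * precision / denom if denom else 0.0
    return recall, precision, f1


# Hypothetical example: two customers, N = 2 recommendations each.
hidden = {"a": {"p1", "p2"}, "b": {"p3"}}
top_lists = {"a": ["p1", "p9"], "b": ["p7", "p8"]}
recall, precision, f1 = evaluate(hidden, top_lists, n=2)
```

In this hypothetical example one of the three hidden products is recovered, so recall is 1/3, precision is 1/4, and F1 is 2/7. Because precision divides by N·|A|, lengthening the Top-N list without adding hits lowers precision, which is the inverse relationship noted above.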
Since N is not involved in any significant interaction effect, its effect on F1 can be assessed independently of the other factors. That is, F1 achieves its highest value when N = 5, and decreases as N increases. On the other hand, the effects of H and S on F1 need to be assessed using their main and interaction plots together. In addition, the effect of S on F1 is of primary concern while that of H on F1 is of secondary interest, since S can be regarded as a design variable for the approach while H is introduced to check the performance consistency of other factors. Note from Fig. 6 that the performances of the similarity measures show different patterns depending on H. Nevertheless, CPC_a is superior to the other measures regardless of the level of H.

The ANOVA results for the proposed approach are summarized in Table 15, where the p-values indicate that all four main effects and interaction effects P×S, N×H, N×S and H×S are statistically significant (see also Fig. 7 for the main effects and Figs. 8–11 for the interaction effects). Main effect plot for P (see Fig. 7) and interaction plot for P×S (see Fig. 8) show that the performance of LR is better than
Table 13
Experiment results: F1 values for conventional and proposed approaches

Column groups: Conv. = binary purchase data (conventional); DT, ANN, LR = preference data (proposed) with decision tree, ANN, and logistic regression, respectively. H = % of actual purchases hidden; k = neighborhood size.

| N | H | k | Conv. JC | Conv. PC | Conv. CPC_m | Conv. CPC_a | DT PC | DT CPC_m | DT CPC_a | ANN PC | ANN CPC_m | ANN CPC_a | LR PC | LR CPC_m | LR CPC_a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 5 | 3 | 0.021 | 0.021 | 0.021 | 0.021 | 0.000 | 0.042 | 0.021 | 0.000 | 0.021 | 0.042 | 0.000 | 0.042 | 0.042 |
| 5 | 5 | 5 | 0.021 | 0.021 | 0.021 | 0.021 | 0.000 | 0.042 | 0.021 | 0.000 | 0.021 | 0.042 | 0.000 | 0.042 | 0.042 |
| 5 | 5 | 10 | 0.021 | 0.021 | 0.021 | 0.021 | 0.000 | 0.042 | 0.021 | 0.000 | 0.021 | 0.042 | 0.000 | 0.042 | 0.042 |
| 5 | 10 | 3 | 0.000 | 0.010 | 0.021 | 0.031 | 0.010 | 0.052 | 0.031 | 0.010 | 0.052 | 0.042 | 0.010 | 0.052 | 0.052 |
| 5 | 10 | 5 | 0.000 | 0.010 | 0.021 | 0.031 | 0.010 | 0.052 | 0.031 | 0.010 | 0.052 | 0.042 | 0.010 | 0.052 | 0.052 |
| 5 | 10 | 10 | 0.000 | 0.010 | 0.021 | 0.031 | 0.010 | 0.052 | 0.031 | 0.010 | 0.052 | 0.042 | 0.010 | 0.052 | 0.052 |
| 10 | 5 | 3 | 0.011 | 0.011 | 0.011 | 0.011 | 0.000 | 0.023 | 0.011 | 0.000 | 0.011 | 0.034 | 0.000 | 0.023 | 0.023 |
| 10 | 5 | 5 | 0.011 | 0.011 | 0.011 | 0.011 | 0.000 | 0.023 | 0.011 | 0.000 | 0.011 | 0.034 | 0.000 | 0.023 | 0.023 |
| 10 | 5 | 10 | 0.011 | 0.011 | 0.011 | 0.011 | 0.000 | 0.023 | 0.011 | 0.000 | 0.011 | 0.034 | 0.000 | 0.023 | 0.023 |
| 10 | 10 | 3 | 0.000 | 0.006 | 0.011 | 0.017 | 0.006 | 0.028 | 0.017 | 0.006 | 0.028 | 0.028 | 0.006 | 0.028 | 0.028 |
| 10 | 10 | 5 | 0.000 | 0.006 | 0.011 | 0.017 | 0.006 | 0.028 | 0.017 | 0.006 | 0.028 | 0.028 | 0.006 | 0.028 | 0.028 |
| 10 | 10 | 10 | 0.000 | 0.006 | 0.011 | 0.017 | 0.006 | 0.028 | 0.017 | 0.006 | 0.028 | 0.028 | 0.006 | 0.028 | 0.028 |
| 15 | 5 | 3 | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | 0.016 | 0.016 | 0.000 | 0.023 | 0.031 | 0.000 | 0.031 | 0.031 |
| 15 | 5 | 5 | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | 0.016 | 0.016 | 0.000 | 0.023 | 0.031 | 0.000 | 0.031 | 0.031 |
| 15 | 5 | 10 | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | 0.016 | 0.016 | 0.000 | 0.023 | 0.031 | 0.000 | 0.031 | 0.031 |
| 15 | 10 | 3 | 0.000 | 0.004 | 0.008 | 0.012 | 0.004 | 0.020 | 0.016 | 0.004 | 0.027 | 0.020 | 0.004 | 0.027 | 0.031 |
| 15 | 10 | 5 | 0.000 | 0.004 | 0.008 | 0.012 | 0.004 | 0.020 | 0.016 | 0.004 | 0.031 | 0.020 | 0.004 | 0.027 | 0.031 |
| 15 | 10 | 10 | 0.000 | 0.004 | 0.008 | 0.012 | 0.004 | 0.020 | 0.016 | 0.004 | 0.031 | 0.020 | 0.004 | 0.027 | 0.031 |
| 20 | 5 | 3 | 0.006 | 0.006 | 0.006 | 0.012 | 0.006 | 0.018 | 0.018 | 0.000 | 0.018 | 0.024 | 0.000 | 0.024 | 0.024 |
| 20 | 5 | 5 | 0.006 | 0.006 | 0.006 | 0.012 | 0.006 | 0.018 | 0.018 | 0.000 | 0.018 | 0.024 | 0.000 | 0.024 | 0.024 |
| 20 | 5 | 10 | 0.006 | 0.006 | 0.006 | 0.012 | 0.006 | 0.018 | 0.018 | 0.000 | 0.018 | 0.024 | 0.000 | 0.024 | 0.024 |
| 20 | 10 | 3 | 0.000 | 0.002 | 0.005 | 0.010 | 0.006 | 0.021 | 0.015 | 0.003 | 0.024 | 0.018 | 0.003 | 0.024 | 0.024 |
| 20 | 10 | 5 | 0.000 | 0.002 | 0.005 | 0.010 | 0.006 | 0.021 | 0.015 | 0.003 | 0.024 | 0.018 | 0.003 | 0.024 | 0.024 |
| 20 | 10 | 10 | 0.000 | 0.002 | 0.005 | 0.010 | 0.006 | 0.021 | 0.015 | 0.003 | 0.024 | 0.018 | 0.003 | 0.024 | 0.024 |
| 25 | 5 | 3 | 0.005 | 0.005 | 0.005 | 0.014 | 0.010 | 0.019 | 0.019 | 0.000 | 0.019 | 0.024 | 0.000 | 0.019 | 0.019 |
| 25 | 5 | 5 | 0.005 | 0.005 | 0.005 | 0.014 | 0.010 | 0.019 | 0.019 | 0.000 | 0.019 | 0.024 | 0.005 | 0.019 | 0.019 |
| 25 | 5 | 10 | 0.005 | 0.005 | 0.005 | 0.014 | 0.010 | 0.019 | 0.019 | 0.000 | 0.019 | 0.024 | 0.005 | 0.019 | 0.019 |
| 25 | 10 | 3 | 0.000 | 0.002 | 0.005 | 0.010 | 0.007 | 0.017 | 0.017 | 0.005 | 0.022 | 0.017 | 0.005 | 0.019 | 0.019 |
| 25 | 10 | 5 | 0.000 | 0.002 | 0.005 | 0.010 | 0.007 | 0.017 | 0.017 | 0.005 | 0.022 | 0.017 | 0.005 | 0.019 | 0.019 |
| 25 | 10 | 10 | 0.000 | 0.002 | 0.005 | 0.010 | 0.010 | 0.017 | 0.017 | 0.005 | 0.022 | 0.017 | 0.005 | 0.019 | 0.019 |
| 30 | 5 | 3 | 0.004 | 0.004 | 0.004 | 0.012 | 0.012 | 0.016 | 0.020 | 0.004 | 0.020 | 0.024 | 0.000 | 0.016 | 0.024 |
| 30 | 5 | 5 | 0.004 | 0.004 | 0.004 | 0.016 | 0.016 | 0.016 | 0.020 | 0.004 | 0.020 | 0.024 | 0.008 | 0.016 | 0.024 |
| 30 | 5 | 10 | 0.004 | 0.004 | 0.004 | 0.016 | 0.012 | 0.016 | 0.020 | 0.004 | 0.016 | 0.028 | 0.004 | 0.016 | 0.024 |
| 30 | 10 | 3 | 0.000 | 0.002 | 0.004 | 0.012 | 0.006 | 0.014 | 0.014 | 0.006 | 0.018 | 0.020 | 0.006 | 0.018 | 0.020 |
| 30 | 10 | 5 | 0.000 | 0.002 | 0.004 | 0.012 | 0.006 | 0.014 | 0.014 | 0.006 | 0.018 | 0.020 | 0.004 | 0.018 | 0.018 |
| 30 | 10 | 10 | 0.000 | 0.002 | 0.004 | 0.012 | 0.008 | 0.014 | 0.014 | 0.006 | 0.018 | 0.020 | 0.004 | 0.018 | 0.018 |