An improvement for semantics-based recommender systems grounded on attaching temporal information to ontologies and user profiles

Yolanda Blanco-Fernández, Martín López-Nores, José J. Pazos-Arias, Jorge García-Duque

ETSE Telecomunicación, Campus Universitario, Vigo. 36310, Spain

Article history: Received 15 June 2010; received in revised form 24 February 2011; accepted 25 February 2011

Keywords: Personalization; Recommender systems; Semantic reasoning; Time-aware filtering; Ontology; Consumption stereotypes

Abstract

Recommender systems in online shopping automatically select the most appropriate items for each user, thus shortening his/her product searching time in the shops and adapting the selection as his/her particular preferences evolve over time. This adaptation process typically assumes that a user's interest in a given type of product always decreases with time from the moment of the last purchase. However, the necessity of a product for a user depends on both the nature of the item itself and the personal preferences of the user, and it is even possible that his/her interest increases over time after the purchase. Some existing approaches focus only on the first factor, missing the point that the influence of time can be very different for different users. To solve this limitation, we present a filtering strategy that exploits the semantics formalized in an ontology in order to link items (and their features) to time functions. The novelty lies in the fact that the shapes of these functions are corrected by temporal curves built from the consumption stereotypes into which each user fits best. Our preliminary experiments involving real users have revealed significant improvements in recommendation precision with regard to previous time-driven filtering approaches.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Discovering products that meet the needs of the consumers is crucial in such competitive environments as online shopping.
Recommender systems assist in advertising tasks by automatically selecting the most appropriate items for each user as per his/her personal interests and preferences (Adomavicius and Tuzhilin, 2005). Research in recommender systems started back in the early 1990s, but the greatest advances have been due to the irruption of recent technologies like those of the Semantic Web (Berners-Lee et al.). It has been shown that semantics-based recommender systems can outperform previous approaches by exploiting two main elements: (i) a knowledge base (typically an ontology) that represents semantic features or attributes of the available items, and (ii) filtering strategies based on semantic reasoning techniques that discover relevant relationships between the users' preferences and the items to be recommended (see examples in Hung, 2005; Middleton et al., 2004; Yuan and Cheng, 2004; Blanco-Fernández et al., 2008; Pazos-Arias José et al., 2008; Blanco-Fernández et al., 2010).

Obviously, keeping the users' satisfaction high requires means to adapt the selection of items as their interests evolve over time. For many years, most of the existing filtering strategies regarded data collection about the users' interests as a static process, weighing equally the ratings given by the users at different times. Later, some researchers proposed time-aware approaches that made the latest observations more significant than the older ones, which amounts to assuming that a user's interest in a product always decreases from the moment of the last purchase (see examples in Maloof and Michalski, 2000; Schwab et al., 2001; Duen-Ren and Ya-Yueh, 2005; Ding and Li, 2005; Lee and Park, 2009). This may be true in certain areas of application, such as personalized programming guides that recommend TV programs to the users. Nevertheless, the interest in (or the need for) commercial products in general may actually increase or vary in diverse forms over time.
For example, if a user has just bought a dishwasher, it is foreseeable that he/she will not need another one until the average lifetime of such appliances has passed; therefore, the interest estimations should follow an increasing function, and any recommender system should prioritize other products for some time. Likewise, the interest in seasonal clothes may vary over the year, while the interest in books and music may remain constant and school equipment may have a peak at the beginning of the academic year.

Engineering Applications of Artificial Intelligence. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/engappai
0952-1976/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.engappai.2011.02.020
Work funded by the Ministerio de Educación y Ciencia (Gobierno de España) research project TIN2010-20797.
Corresponding author. Fax: +34 986812116. E-mail address: yolanda@det.uvigo.es (Y. Blanco-Fernández).
Please cite this article as: Blanco-Fernández, Y., et al., An improvement for semantics-based recommender systems grounded on attaching temporal information.... Engineering Applications of Artificial Intelligence (2011), doi:10.1016/j.engappai.2011.02.020

The main research contribution of this paper is an improvement to the current filtering strategies, aimed at increasing the
effectiveness of semantics-based recommender systems in online shopping. The basic assumption is that the influence of time can be radically different not only for different types of items, as explained above, but also for different users. For instance, whereas car tires typically have a lifetime of 6 years for average drivers, it can be expected that taxi drivers or users interested in car tuning and motor sports need more frequent replacements (say, every 6 months). Analogously, it makes sense not to recommend dolls for some time after an average user has bought one, but the same is not true for doll collectors. Briefly, our new approach makes tailor-made selections of items by exploiting the semantics formalized in an ontology to link items (and their features) to time functions, whose shapes are corrected by considering the preferences of like-minded individuals and the effects of time on their purchasing behaviors.

The paper is organized as follows. Section 2 reviews the literature on recommender systems to highlight the differences between the management of time in previous works and in our new filtering strategy. Next, Section 3 details the main parts of our personalization framework, while Section 4 focuses on the algorithmic internals of our time-aware filtering strategy. Section 5 presents the results of experiments we have carried out (with real users) to assess the personalization quality achieved by the new filtering strategy in comparison with existing approaches. Finally, Section 6 provides a summary of conclusions and the motivation of our ongoing work.

2. Related work

Research in recommender systems is very active nowadays, in an attempt to address the many new questions raised by the growing number of practical applications.
Next, we provide an overview of the milestones in the history of recommender systems, and thereafter focus on the issues of producing time-aware recommendations in online shopping, which remain practically unexplored in the literature.

2.1. Background on recommender systems

Given a set of items, the goal of a recommender system is to identify the most suitable ones according to the information stored in a user's profile by adopting diverse filtering strategies. The first strategies merely looked at demographic information (e.g. age, gender or marital status) to recommend items that had interested other users with similar data. The results so obtained tend to be imprecise and fail to reflect changes in the user's preferences over time (because personal data are often stable for long periods). This problem was addressed by content-based filtering, which looks for items similar to others that gained the user's interest in the past (Adomavicius and Tuzhilin, 2005; Dias et al., 2008). This strategy is easy to adopt, but suffers from overspecialization: the recommendations tend to be repetitive because they assume that a user will always appreciate the same kind of items. Furthermore, the limited data available about new users makes the first results highly inaccurate.

To tackle these problems, the scientific community came up with collaborative filtering, which evaluates not only the profile of the target user (the one who will receive the recommendations), but also those of users with similar interests (his/her neighbors) (Schafer et al., 2001; Montaner et al., 2003). This approach can solve the lack of diversity in the recommendations, but faces problems such as sparsity when the number of items is high (which makes it hard to find users with similar evaluations for the same items) and the treatment given to users whose preferences are dissimilar to those of the majority (the gray sheep).
There exist hybrid approaches that attempt to neutralize the weaknesses and combine the strengths of content-based and collaborative filtering, e.g. recommending items similar to the ones listed in the user's profile, but considering two items similar if the individuals who show interest in one tend to be interested in the other (Papagelis and Plexousakis, 2005; Burke, 2002; Li et al., 2005).

In both content-based filtering and collaborative filtering, the user profiles are typically initialized from stereotypes, which are mechanisms that provide general descriptions for a set of similar users (Rich, 1979). Stereotypes make it possible to build models of individual users on the basis of a small amount of information about them (e.g. age, occupation, lifestyle, etc.). As described in Montaner et al. (2003), Shani et al. (2007), Kobsa et al. (2001) and Krulwich (1997), stereotypes have also been widely adopted in diverse filtering strategies for the selection of the most appropriate recommendations for each user.

Regardless of the filtering strategy, most recommender systems have relied on syntactic matching techniques, which relate items by looking for common words in their attached metadata. Even though there exist plenty of different approaches, they all miss much knowledge during the personalization process, because they are unable to reason about the meaning of the metadata. A syntactic approach is also a source of overspecialization, because the recommendations so computed can only include items very similar to those the users already know (Adomavicius and Tuzhilin, 2005). To go one step further in personalization quality and diversity, research is now focused on applying techniques from the Semantic Web, which make it possible to gain insight into the meaning of words. The key here lies in the use of ontologies to describe and interrelate items and their attributes by means of class hierarchies and properties (Staab and Studer, 2004).
Thus, many authors have enhanced the traditional filtering strategies with semantic reasoning mechanisms, to discover the items that best match the preferences of each user by reasoning about their semantic descriptions. Hung (2005) proposed a recommender system for one-to-one marketing based on a taxonomy of products, revealing the advantages of such semantics when it came to providing instant online recommendations and identifying potential customers upon release of a new product. Middleton et al. (2004) explored a novel ontological approach to user profiling within semantics-based recommender systems, coping with the problem of recommending academic research papers; the experiments showed that profile visualization and feedback outperformed previous user modelling approaches, which led the authors to conclude that the semantics captured by their ontological approach made the profiles easier to understand. Yuan and Cheng (2004) investigated analogy structures between heterogeneous products (i.e., products with different properties) to recommend items that are disparate from others the users had purchased, using what they called an ontology-driven coupled clustering algorithm.

We have previously explored the benefits of semantics-based recommender systems in other domains. In Blanco-Fernández et al. (2008), we proposed an ontology-driven recommendation system to select the most appealing TV programs for the users. In Pazos-Arias José et al. (2008), we incorporated a similar semantics-enhanced approach into a t-learning platform to recommend personalized educational courses according to the users' preferences and previous knowledge. Finally, in Blanco-Fernández et al. (2010), we exploited the benefits of semantics-driven reasoning in a tourism recommender system.

2.2. Background on time-aware filtering approaches

For the purposes of this paper, it is important to note that all of the abovementioned approaches were time-unaware, inasmuch
as they did not include any mechanisms to take into account the influence of time on the user's interests, preferences and needs. The first attempts to consider this effect consisted of introducing gradual forgetting functions, to make the most recent observations more significant than the older ones during the computation of recommendations (Koychev, 2000; Maloof and Michalski, 2000; Schwab et al., 2001; Duen-Ren and Ya-Yueh, 2005). Specifically, when a new item was added to a user's profile, its weight was set to 1 and the values of the other items were decreased. Most often, this was done as per a constant decay rate, but some authors considered different rates for different item classes and even supplemented recency with other information. Indeed, Ding and Li (2005) presented an approach to trace changes in the purchase interests of each user and thereby compute a personalized decay factor. In Ding et al. (2006), the same authors extended their initial approach by weighting items according to their expected accuracy in predicting future preferences and by making decisions based on data arrival time.

More recently, Lee and Park (2009) motivated the exploitation of other temporal information to improve the accuracy of a collaborative filtering strategy. Specifically, they presented an approach that involved item launch times (indicating the age of the items), purchase times (denoting the age of the users' preferences) and the time difference between both (representing the temporal gap between when an item is released and when a user purchases it). Although it is clearly more sophisticated, this approach ultimately comes down to the same assumption of gradual forgetting, i.e., that the interest of the user in an item always decreases from the moment of the last purchase.

In Blanco-Fernández et al.
(2008), we presented the first time-aware filtering approach that recognized the fact that, in general applications of recommender systems (and, particularly, in online shopping), the interest in (or the need for) certain items may actually increase over time. In that paper, we harnessed the conceptualization provided by an ontology to link the classes and attributes of the items to time functions that modelled dependences with regard to absolute dates or purchase times. Although the experimental results showed that this approach could outperform time-unaware filtering proposals (see Blanco-Fernández et al., 2008), we conjectured that the effectiveness of the recommender system could improve even further by considering that the influence of time can be radically different for different users (recall the motivating examples given in Section 1).

Therefore, in this paper, we enhance the approach presented in Blanco-Fernández et al. (2008) to involve not only item classes but also user preferences in modelling time dependences. To this aim, we propose to modify the shape of the default time functions (i.e., the ones defined in the ontology for average users) by means of an adaptive group correction, built from consumption stereotypes that cluster together users who share some of their preferences. We do not consider individual corrections, as Ding and Li (2005) did to tailor decay rates, because it would be infeasible to gather sufficient information from every single user to accurately characterize his/her potential interest in all item classes over time. Instead, it makes more sense to consider the success or failure of the recommendations made to like-minded individuals.

3. Our personalization framework

This section describes the main elements of our new personalization framework: the domain ontology and the parameterized time functions attached to its nodes; the individual user profiles and the stereotypes that model the preferences of groups of users; and the group corrections that modify the default time dependence curves. The new time-aware filtering strategy enabled by these elements will be presented in Section 4.

3.1. Including time concerns in the domain ontology

Since we are considering the recommendation of items in the scope of online shopping, we need an ontology that formalizes typical concepts and relationships in this domain. The creation of such an ontology is problematic due to the high degree of specificity (which leads to a very large number of concepts) and the need for timely maintenance, owing to the continuous innovations that take place in the domain of products and services. Therefore, we did not intend to create an ontology covering all possible types of items, but rather to use one that could be easily extracted from some of the classification standards available for industrial products and services. To this aim, we looked at standards like UNSPSC [1], eCl@ss [2], eOTD [3] and the RosettaNet Technical Dictionary [4], which reflect some level of community consensus and contain multiple definitions of hierarchically organized concepts. Finally, we chose eCl@ss as the main input for creating the domain ontology, for the reasons of completeness, balance and maintenance discussed in Hepp et al. (2007). More specifically, we have borrowed from eClassOWL (the OWL ontology for products and services developed by Hepp, 2006) many concepts related to the categorization of commercial products, and we have also defined new properties and classes to accommodate some missing features.
Along with the multiple hierarchies of classes that serve to categorize the commercial products and their attributes, the ontology contains labeled properties joining each item to its attributes and seeAlso properties to link strongly related items. The ontology was populated by retrieving information from multiple online retailers. A brief excerpt is depicted in Fig. 1, where classes, items and attributes are denoted by gray ellipses, white squares and white ellipses, respectively.

Our new approach to time-aware filtering starts out by associating parameterized time functions with item classes and attributes, in order to model the variation of the potential interest of each type of product or any of its features with regard to absolute dates or purchase times. Most commonly, the specific time function associated with a given item is chosen based on the marketing criteria handled by the providers. We handle functions with diverse shapes, including combinations of constant, linear, exponential, sinusoidal, parabolic, hyperbolic and elliptic segments, with values between 0 and 1. As a rule of thumb, low values are intended to prevent the recommendation of the items, whereas high values are intended to promote them. We also require valid functions to take the maximum value (1) at some point, corresponding to the time an item or an attribute is potentially most interesting for a general audience. Next, we exemplify some of the functions (see Table 1) to explain how the time-aware filtering works:

Monotonically increasing function. Some items are purchased sporadically due to their long average lifetime (e.g. consumer electronics devices, vehicles and household appliances). The interest in such products can be modeled by a function that grows (linearly or exponentially) from the time of purchase.
The zero value of the function coincides with the instant when the user bought the product (denoted by T1 in Table 1), whereas the maximum value 1 is reached once its lifetime has expired (T2).

[1] http://www.unspsc.org
[2] http://www.eclass-online.com
[3] http://registry.eccma.org/eotd
[4] http://www.rosettanet.org
Monotonically decreasing function. Some seasonal items (e.g. swimming pool supplies) are useful during a limited period and their utility decreases over time. In such cases, the temporal dependence can be modeled with a function that takes the maximum value up to the beginning of the season (instant T1 in Table 1) and decreases monotonically (linearly or exponentially) afterwards. The zero value is reached once the seasonal period has ended (instant T2).

Rectangle function. A rectangle function (see Table 1) can be bound to products that may be repeatedly purchased during a given period of time. This is the case, for example, of nougat bars, which are mainly available around Christmas; outside those months, the zero value of the function prevents such products from appearing in any recommendation.

Constant function. A constant (time-unaware) function can be linked to products that the user may purchase daily, such as books or personal hygiene items.

As we shall explain later, the time function to use for a specific item during the recommendation process is computed from the functions linked to the classes it belongs to, and also from the functions linked to its attributes. By default, attributes are associated with a constant function of value 1, which can be modified as per marketing criteria (recall the nougat bar example given in the description of the rectangle function), while each class inherits the temporal behavior of its immediate superclass. In any case, it is also possible to override the inheritance process by manually assigning specific functions to the classes and attributes of specific items.

Fig. 1. A small excerpt from our ontology.

Table 1. The simplest time functions used in our filtering strategy.

| Function | Associated products |
| Monotonically increasing | Items that are purchased sporadically due to their long average lifetime. |
| Monotonically decreasing | Products that are useful during a limited period and whose utility decreases over time. |
| Rectangle | Items that may be repeatedly purchased over a period of time. |
| Constant | Products that the user may purchase daily. |
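The four basic shapes of Table 1 and the default inheritance rule can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the linear variants are shown (the text also allows exponential ones), the values returned outside the interval [T1, T2] are our own choice, and the helper `resolve`, its `fallback` argument and the class names `Dishwashers` and `HouseholdAppliances` are hypothetical names that do not come from the ontology.

```python
from typing import Callable, Dict, Optional

# A time function maps a time instant to a value in [0, 1].
TimeFunction = Callable[[float], float]


def increasing(t1: float, t2: float) -> TimeFunction:
    """0 at the purchase time t1, growing linearly to 1 when the lifetime ends at t2."""
    def f(t: float) -> float:
        if t <= t1:
            return 0.0  # assumption: no interest before or at the purchase instant
        if t >= t2:
            return 1.0
        return (t - t1) / (t2 - t1)
    return f


def decreasing(t1: float, t2: float) -> TimeFunction:
    """1 up to the start of the season t1, falling linearly to 0 at its end t2."""
    def f(t: float) -> float:
        if t <= t1:
            return 1.0
        if t >= t2:
            return 0.0
        return (t2 - t) / (t2 - t1)
    return f


def rectangle(t1: float, t2: float) -> TimeFunction:
    """1 inside the period [t1, t2], 0 outside it."""
    return lambda t: 1.0 if t1 <= t <= t2 else 0.0


def constant(value: float = 1.0) -> TimeFunction:
    """Time-unaware function."""
    return lambda t: value


def resolve(cls: str,
            parents: Dict[str, Optional[str]],
            assigned: Dict[str, TimeFunction],
            fallback: TimeFunction) -> TimeFunction:
    """Return the function assigned to cls or, failing that, to its nearest superclass.

    Assumes the class hierarchy is a tree; fallback is a hypothetical default
    used when no ancestor has an assigned function.
    """
    current: Optional[str] = cls
    while current is not None:
        if current in assigned:
            return assigned[current]
        current = parents.get(current)
    return fallback


# Hypothetical usage: a dishwasher inherits the curve of household appliances.
parents = {"Dishwashers": "HouseholdAppliances"}
assigned = {"HouseholdAppliances": increasing(0.0, 10.0)}
dishwasher_curve = resolve("Dishwashers", parents, assigned, constant())
```

With time measured in years since the purchase, `dishwasher_curve` stays low right after the purchase and reaches 1 once the assumed 10-year lifetime has passed, which is the behavior the monotonically increasing function is meant to capture.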
3.2. Building group corrections from user profiles and stereotypes

Up to this point, everything is independent of the individual preferences and needs of any user (as it was in Blanco-Fernández et al., 2008), so we need additional artifacts to incorporate the users' personal interests into the filtering process. To this aim, for the reasons explained at the end of Section 2.2, we do not proceed individually, but rather with groups of users who may be clustered together as per some of their preferences. Next, we explain the structure of the user profiles we handle to capture the knowledge available about the user, and then introduce a notion of stereotypes to characterize the preferences of groups of users (potential audiences for certain products).

In our work, a user's profile stores various data, including a record of the items he/she bought in the past, the classes and attributes that describe those items in the ontology, and the time of the last purchase. Furthermore, each item is linked to a number between -1 and 1 that measures the degree of interest (hereafter DOI) of the user in it (-1 represents the greatest disliking; 1 the greatest liking). In formal notation, this number is denoted by $\mathrm{DOI}(i, U) = x \in [-1, 1]$ and represents the interest of the user $U$ in the item $i$. DOI indexes may be given explicitly by the user, or inferred indirectly by monitoring his/her interaction with the recommendations (e.g., if a user decides to buy a recommended item, then we can assume a very positive rating for it; see López-Nores et al., 2010 for more examples). In any case, the DOIs of the items propagate to attributes and classes as follows:

- The DOI of an attribute is calculated by averaging the ratings of the items that are joined to it in the ontology. In the ontology excerpt depicted in Fig. 1, for example, the attributes "Fall of the Bastille" and "Old Regime Crisis" inherit the ratings given by the user to books b1 and b2 (0.8 and 1, respectively).
- At the bottom of the hierarchy, the DOI of a leaf class is calculated by averaging the ratings of all the items that belong to it. Upwards, each class averages the DOIs of its child classes, assuming a neutral rating (of value 0) for unrated classes. For example, "History" in Fig. 1 receives a DOI of 0.9, halfway between those of the books b1 and b2, whereas "Books" receives 0.45 as the average of "History" and "Sciences".

Our stereotypes take the same form as the individual profiles, though completely void of information that might serve to identify individual users. In other words, a stereotype is an excerpt from the ontology with attached DOIs. These DOIs can be anonymously updated from the given/inferred ratings of the users, just knowing their degree of membership to the stereotype in question (a number whose computation will be explained later). Specifically, the feedback messages include (i) the user's degree of membership to a given stereotype, (ii) the rating given to (or inferred for) an item, and (iii) the time when the rating was given/inferred. We use that information to build and maintain a function called group correction, intended to modify the default time dependence that results for an item from its classes and attributes. Starting with the zero function that is assumed by default (i.e., in the absence of feedback), we compute one group correction $\mathrm{gc}(C_m, S_j, t)$ for each class $C_m$ of a stereotype $S_j$ as a function of time, using the procedure depicted in Fig. 2:

- First, we record the ratings received for items belonging to the class $C_m$ at the different time instants.
- Second, we build a pulse train by averaging the ratings of each instant, weighted by the degrees of membership to the stereotype $S_j$ of the users who provided them.
- Finally, we approximate the pulse train by a natural smoothing spline (Eubank, 1999), so that each piece of feedback has an effect over a period of time and not only at the specific time instant for which it was issued.
The resulting curve is trimmed between -1 and 1.

Fig. 2. Computation of group correction: (a) the starting zero function; (b) a sample pulse train and (c) the resulting group correction.

We devised group corrections as an additive adjustment of the items' time functions, so the result of the sum (trimmed between 0 and 1 and normalized to take the maximum value 1) yields another time function. As shown in Fig. 3, the corrections can completely modify the shape of the temporal dependence curves, as needed to account for the fact that the purchasing behaviors of certain people may be radically different from those of the majority.
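The procedure of Fig. 2 and the additive adjustment can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the original text: the function names are hypothetical, the membership-weighted average is read as a weighted mean, time is sampled on a discrete grid, and a simple triangular kernel stands in for the natural smoothing spline (it only mimics the idea that each piece of feedback affects a period of time rather than a single instant).

```python
from collections import defaultdict


def pulse_train(feedback):
    """Membership-weighted mean rating per time instant.

    feedback: iterable of (time, rating, membership) triples, mirroring the
    three fields of the feedback messages described in the text.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for time, rating, membership in feedback:
        weighted_sum[time] += membership * rating
        weight_total[time] += membership
    return {t: weighted_sum[t] / weight_total[t]
            for t in weighted_sum if weight_total[t] > 0}


def group_correction(pulses, times, width=2.0):
    """Spread each pulse over neighbouring instants and trim to [-1, 1].

    ASSUMPTION: a triangular kernel of half-width `width` replaces the
    natural smoothing spline used in the paper.
    """
    curve = []
    for t in times:
        value = sum(r * max(0.0, 1.0 - abs(t - tp) / width)
                    for tp, r in pulses.items())
        curve.append(max(-1.0, min(1.0, value)))
    return curve


def corrected_time_function(time_function, correction):
    """Add the correction, trim to [0, 1], and rescale so the maximum is 1."""
    summed = [max(0.0, min(1.0, f + g))
              for f, g in zip(time_function, correction)]
    peak = max(summed)
    return [s / peak for s in summed] if peak > 0 else summed


# Made-up feedback: two ratings at t=3 (memberships 0.8 and 0.2), one at t=5.
times = list(range(7))
feedback = [(3, 1.0, 0.8), (3, 0.0, 0.2), (5, -1.0, 1.0)]
pulses = pulse_train(feedback)
gc = group_correction(pulses, times)
corrected = corrected_time_function([0.2] * len(times), gc)
```

With this invented data, a flat default time function acquires a peak around t=3 and drops to zero around t=5, which is the kind of reshaping the text attributes to group corrections.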
Furthermore, due to the corrections, it is not really important whether the item providers choose, say, a linearly increasing function or an exponentially increasing one as the default time dependency of their products, because the feedback gathered from the users will end up modelling the desired behavior (the one that optimizes users' satisfaction with the recommendations).

4. Our filtering strategy

Having introduced the elements of our personalization framework, we can now describe our new semantics-based time-aware filtering strategy, which follows a four-step process as depicted in Fig. 4:

- Step 1: Stereotype-driven pre-filtering. Initially, we perform an offline pre-filtering process driven by the available stereotypes, using a semantics-based similarity metric to sort out the different items by their potential interest for different groups of users.
- Step 2: Classification of the user into stereotypes. In order to refine the pre-selection and obtain the temporal dependence curves that affect a given user $U_i$, it is necessary to identify the stereotypes in which he/she fits best ($S_2$ and $S_N$ in Fig. 4).
- Step 3: Reasoning-driven filtering. Having classified the user $U_i$, we match the pre-selected items against the preferences captured in his/her profile, applying a semantic reasoning procedure that brings together content-based and collaborative filtering.
- Step 4: Time-driven filtering. Finally, we assess the current interest of the user $U_i$ in the items selected in step 3 by considering the time functions of their classes and attributes and the group corrections computed for the stereotypes in which he/she fits best.

4.1. Step 1: stereotype-driven pre-filtering

The pre-filtering consists of computing a matching level between each item $I_k$ in the ontology and each one of the stereotypes $S_j$ (denoted by $\mathrm{matching}(I_k, S_j)$). Intuitively, $I_k$ is marked as a potentially interesting item for users that fit in $S_j$ when the matching level exceeds a configurable threshold $\alpha_1$.
The computation of matching levels relies on a semantic similarity metric to compare $I_k$ with each item $I_r$ rated in $S_j$. This metric considers two items similar when (i) they have a very specific common ancestor in a class hierarchy of the ontology, (ii) they have common attributes, or (iii) they have sibling attributes (i.e., attributes belonging to the same class in some hierarchy).

Fig. 3. Adding a group correction to a time function: (a) a sample time function and (b) the function of Fig. 3(a) corrected by Fig. 2(c).

Fig. 4. The four steps of our semantics-based, time-aware filtering strategy.

For example, the sunglasses sg1 and the snow boots sb1 have "Skiing equipment" as their lowest common ancestor in our domain ontology (which is one degree more specific than "Winter Sports equipment"); the movies m1 and m2 share the attribute of involving the same starring actor; and the books b1 about the fall of the Bastille and b2 about the Old Regime Crisis have sibling attributes because
they deal with different events bound to the French Revolution. Obviously, we get high similarity values when the compared items are very close in a hierarchy, and when they have many common and sibling attributes. The calculation is given by Eq. (1):

\[
\mathrm{SemSim}(I_k, I_r, S_j) = \frac{\mathrm{depth}(\mathrm{LCA}(I_k, I_r))}{\max[\mathrm{depth}(I_k), \mathrm{depth}(I_r)]} + \frac{\sum_{q=1}^{\mathrm{CA}(I_k, I_r)} \mathrm{DOI}(\mathrm{ca}_q(I_k, I_r), S_j)}{\mathrm{CA}_{max}(I_k, I_r)} + \frac{\sum_{p=1}^{\mathrm{SA}(I_k, I_r)} \mathrm{DOI}(\mathrm{sa}_p(I_k, I_r), S_j)}{\mathrm{SA}_{max}(I_k, I_r)} \tag{1}
\]

In this expression, we have three main contributions:

- The first addend evaluates the relationship between two items $I_k$ and $I_r$ by considering their distance in a hierarchy. This process involves the following parameters:
  - $\mathrm{depth}(\mathrm{LCA}(I_k, I_r))$ is the depth of the lowest common ancestor between the two items, that is, the number of hierarchical links between that ancestor and the root class of the hierarchy.
  - $\mathrm{depth}(I_k)$ is the number of hierarchical links between the class of $I_k$ and the root class.
  - $\mathrm{depth}(I_r)$ is the number of hierarchical links between the class of $I_r$ and the root class.
- The second addend evaluates the similarity between the items $I_k$ and $I_r$ by taking into account their common attributes and the ratings defined in the stereotype $S_j$, as shown by the following parameters:
  - $\mathrm{CA}_{max}(I_k, I_r)$ is the maximum number of common attributes that $I_k$ and $I_r$ could share in the ontology (i.e., the minimum between the number of attributes associated to $I_k$ and the number of attributes linked to $I_r$);
  - $\mathrm{CA}(I_k, I_r)$ is the number of common attributes that $I_k$ and $I_r$ do share;
  - $\mathrm{ca}_q(I_k, I_r)$ is the $q$-th common attribute between $I_k$ and $I_r$;
  - $\mathrm{DOI}(\mathrm{ca}_q(I_k, I_r), S_j)$ is the rating of $\mathrm{ca}_q(I_k, I_r)$ in the stereotype $S_j$.
- Analogously, the last addend focuses on the similarity between $I_k$ and $I_r$ by considering the existence of sibling attributes between both items and the ratings in the stereotype $S_j$.
The parameters involved in the computation are:
  - $\mathrm{SA}_{max}(I_k, I_r)$ is the maximum number of sibling attributes that $I_k$ and $I_r$ could share (i.e., the minimum between the number of attributes associated to $I_k$ and the number of attributes linked to $I_r$);
  - $\mathrm{SA}(I_k, I_r)$ is the number of sibling attributes that $I_k$ and $I_r$ do share;
  - $\mathrm{sa}_p(I_k, I_r)$ is the $p$-th sibling attribute between $I_k$ and $I_r$;
  - $\mathrm{DOI}(\mathrm{sa}_p(I_k, I_r), S_j)$ is the rating of $\mathrm{sa}_p(I_k, I_r)$ in the stereotype $S_j$.

The matching between an item $I_k$ and a stereotype $S_j$ is high when $I_k$ is very similar to items that appear with high DOIs in $S_j$. The value is given by Eq. (2):

\[
\mathrm{matching}(I_k, S_j) = \frac{1}{\mathrm{NI}(S_j)} \sum_{m=1}^{\mathrm{NI}(S_j)} \mathrm{SemSim}(I_k, I_m, S_j) \cdot \mathrm{DOI}(I_m, S_j) \tag{2}
\]

where:

- $\mathrm{NI}(S_j)$ is the number of items rated in $S_j$;
- $I_m$ is the $m$-th of those items;
- $\mathrm{DOI}(I_m, S_j)$ is the rating of $I_m$ in $S_j$;
- $\mathrm{SemSim}(I_k, I_m, S_j)$ is the semantic similarity between $I_k$ and the item $I_m$ rated in $S_j$.

4.2. Step 2: classification of the user into stereotypes

In order to measure to what extent a user $U_i$ is represented by a stereotype $S_j$, we compute a degree of membership (denoted as $\mathrm{DOM}(U_i, S_j)$) as follows:

- First, we create a rating vector for $U_i$, denoted by $V_{U_i}$, including the DOI indexes of the most significant classes in his/her profile (i.e., the ones with DOIs close to 1 or -1, representing the items that are most appealing or unappealing to him/her).
- Second, we create a rating vector for each stereotype $S_j$, denoted by $V_{S_j}$, including the DOIs it assigns to the classes of $V_{U_i}$.
- Finally, the degree of membership of $U_i$ to $S_j$ is computed as the Pearson-r correlation (Middleton, 2003) between the two vectors as per Eq.
(3):

\[
\mathrm{DOM}(U_i, S_j) = \mathrm{corr}(V_{U_i}, V_{S_j}) = \frac{\sum_k V_{U_i}[k] \cdot V_{S_j}[k]}{\sqrt{\sum_k (V_{U_i}[k])^2 \cdot \sum_k (V_{S_j}[k])^2}} \tag{3}
\]

As usual in the literature related to the creation of users' neighborhoods (Adomavicius and Tuzhilin, 2005; Montaner et al., 2003), we consider that the user $U_i$ is represented by the stereotype $S_j$ if $\mathrm{DOM}(U_i, S_j)$ exceeds a configurable threshold $\alpha_2$.

4.3. Step 3: reasoning-driven filtering

After classifying the user, our filtering strategy focuses on the items (identified in step 1) that were found suitable for the stereotypes that represent him/her (identified in step 2). For each one of those items, $I_k$, we compute a time-unaware recommendation value for the user $U_i$ by content-based filtering and collaborative filtering criteria:

- The content-based approach uses the same semantic similarity metric explained above to compute the matching level between $I_k$ and the items rated in $U_i$'s profile. According to Eq. (4), $\mathrm{matching}(I_k, U_i)$ is high when $I_k$ is strongly related to items that were very appealing to $U_i$:

  \[
  \mathrm{matching}(I_k, U_i) = \frac{1}{\mathrm{NI}(U_i)} \sum_{m=1}^{\mathrm{NI}(U_i)} \mathrm{SemSim}(I_k, I_m, U_i) \cdot \mathrm{DOI}(I_m, U_i) \tag{4}
  \]

  where:
  - $\mathrm{NI}(U_i)$ is the number of items rated in $U_i$'s profile;
  - $I_m$ is the $m$-th of those items;
  - $\mathrm{DOI}(I_m, U_i)$ is the rating of $I_m$ in $U_i$'s profile;
  - $\mathrm{SemSim}(I_k, I_m, U_i)$ is the semantic similarity between $I_k$ and the item $I_m$ rated in $U_i$'s profile.

  If the value resulting from Eq. (4) is greater than a configurable threshold $\alpha_3$, $I_k$ is selected for the final filtering step. Otherwise, as shown in Fig. 4, it is reconsidered from a collaborative filtering perspective.
- The collaborative approach attempts to predict $U_i$'s rating for an item $I_k$ by considering the preferences of individuals with similar interests (his/her neighbors). The identification of
like-minded users is driven by the same procedure as the classification of the user into stereotypes: roughly, we create and correlate the rating vectors of $U_i$ and the other users, selecting as neighbors the ones who yield the $M$ greatest correlation values. Then, $U_i$'s predicted rating for $I_k$, denoted by $\mathrm{PredRating}(I_k, U_i)$, is computed by Eq. (5), where the interest of $U_i$'s neighbors in $I_k$ is measured by the DOIs they have provided for it, or by the matching level between their preferences and $I_k$ if no DOIs are available:

  \[
  \mathrm{PredRating}(I_k, U_i) = \frac{1}{M} \sum_{p=1}^{M} \mathrm{corr}(V_{U_i}, V_{N_p}) \cdot \delta(N_p), \qquad
  \delta(N_p) = \begin{cases} \mathrm{DOI}(I_k, N_p) & \text{if } N_p \text{ has rated } I_k \\ \mathrm{matching}(I_k, N_p) & \text{otherwise} \end{cases} \tag{5}
  \]

  where:
  - $M$ is the size of $U_i$'s neighborhood;
  - $N_p$ is the $p$-th of $U_i$'s neighbors;
  - $\mathrm{corr}(V_{U_i}, V_{N_p})$ is the correlation between the rating vectors of $U_i$ and $N_p$, computed by Eq. (3).

  Again, $I_k$ is selected for the final filtering step if the value resulting from Eq. (5) exceeds a configurable threshold $\alpha_4$.

As indicated by Eq. (6), the time-unaware recommendation value of the item $I_k$ for the user $U_i$ (denoted as $\mathrm{RV}(I_k, U_i)$) is taken as the matching level between $I_k$ and the items rated in $U_i$'s profile if $I_k$ was selected by content-based filtering, or as $U_i$'s predicted rating for $I_k$ if it was selected by collaborative filtering:

\[
\mathrm{RV}(I_k, U_i) = \begin{cases} \mathrm{matching}(I_k, U_i) & \text{if } I_k \text{ was pre-selected by the content-based stage} \\ \mathrm{PredRating}(I_k, U_i) & \text{if } I_k \text{ was pre-selected by the collaborative stage} \end{cases} \tag{6}
\]

4.4. Step 4: time-driven filtering

The final step of our filtering strategy consists of assessing the current interest of the user in the items selected by semantic reasoning. To this aim, we multiply the time-unaware recommendation value given by Eq.
(6) by a factor CIðIk,Ui,t0Þ that results from valuating the item’s time function – modified by the pertinent group corrections – at the current instant t0: RVðIk,Ui,t0Þ ¼ RVðIk,UiÞ CIðIk,Ui,t0Þ ð7Þ The computation of CIðIk,Ui,t0Þ involves two components: First, we consider the time functions associated to the classes and attributes of the item Ik in the ontology. In the computation of the (uncorrected) time function for the item Ik, as indicated by the first addend of Eq. (8), we average the functions of its classes and reshape the resulting curve multiplying by the time functions of its attributes. Second, we take into account the group corrections corresponding to the classes of Ik as per the stereotypes in which the user Ui fits best. As shown in the second addend of Eq. (8), the influence of the group correction corresponding to those classes in each stereotype Sj is weighed by the degree of membership of the user Ui to it. ciðIk,Ui,tÞ ¼ PNCðIkÞ m ¼ 1 fðCm,tÞ QNAðIkÞ l ¼ 1 fðAl,tÞ NCðIkÞ þ PNSðUiÞ j ¼ 1 DOMðUi,SjÞ NSðUiÞ PNCðIkÞ m ¼ 1 gcðCm,Sj,tÞ NCðIkÞ ð8Þ Having done this, we truncate values to fit the range [0, 1] (Eq. 9) and, finally, we normalize to have the maximum value 1 at some time (Eq. 10). ciiðIk,Ui,tÞ ¼ 1 if ciðIk,Ui,tÞ41 0 if ciðIk,Ui,tÞo0 ciðIk,Ui,tÞ otherwise 8 >: ð9Þ CIðIk,Ui,tÞ ¼ ciiðIk,Ui,tÞ maxtðciiðIk,Ui,tÞÞ ð10Þ In these expressions: NCðIkÞ is the number of classes that the item Ik belongs to in the ontology, Cm being the m-th; NAðIkÞ is the number of attributes joined to Ik in the ontology, Al being the l-th; NSðUiÞ is the number of stereotypes in which the user Ui fits; DOMðUi,SjÞ is the degree of membership of Ui to the stereotype Sj; fðCm,tÞ is the value of the time function associated with class Cm; fðAl,tÞ is the value of the time function associated with attribute Al; gcðCm,Sj,tÞ is the value of the group correction corresponding to the class Cm as per the stereotype Sj. 
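As an illustration of Eqs. (7)-(10), the following minimal Python sketch evaluates the time-driven factor for one item and one user. It is not the MiSPOT implementation: the function names, the representation of time functions and group corrections as plain callables, the discrete time grid used to approximate the maximum over $t$ in Eq. (10), and the handling of an all-zero curve are assumptions made for the example, as are the toy curves and values.

```python
from math import prod


def raw_interest(class_funcs, attr_funcs, memberships, group_corrs, t):
    """Eq. (8): uncorrected time function plus weighted group corrections.

    class_funcs -- callables f(C_m, .), one per class of the item
    attr_funcs  -- callables f(A_l, .), one per attribute of the item
    memberships -- DOM(U_i, S_j), one per stereotype the user fits in
    group_corrs -- group_corrs[j][m] is the callable gc(C_m, S_j, .)
    """
    nc = len(class_funcs)
    ns = len(memberships)
    # First addend: average of the class functions, reshaped by the attributes.
    base = sum(f(t) for f in class_funcs) / nc * prod(f(t) for f in attr_funcs)
    # Second addend: group corrections weighed by the degree of membership.
    correction = sum(
        dom / ns * sum(gc(t) for gc in group_corrs[j]) / nc
        for j, dom in enumerate(memberships)
    )
    return base + correction


def truncate(value):
    """Eq. (9): clip a value to the range [0, 1]."""
    return min(1.0, max(0.0, value))


def current_interest(class_funcs, attr_funcs, memberships, group_corrs, t0, grid):
    """Eq. (10): truncated value at t0 over its maximum on `grid`.

    `grid` is an assumed discretization of time; an all-zero curve
    yields 0.0 (also an assumption, the text does not cover this case).
    """
    def cii(t):
        return truncate(
            raw_interest(class_funcs, attr_funcs, memberships, group_corrs, t)
        )

    peak = max(cii(t) for t in grid)
    return cii(t0) / peak if peak > 0 else 0.0


def time_aware_rv(rv, ci):
    """Eq. (7): time-unaware recommendation value scaled by CI."""
    return rv * ci


# Hypothetical item: one class whose interest grows linearly over 10 time
# units, one neutral attribute, and one stereotype with membership 0.5.
rising = lambda t: t / 10.0
neutral = lambda t: 1.0
boost = lambda t: 0.2

ci = current_interest([rising], [neutral], [0.5], [[boost]], t0=5, grid=range(11))
rv_t0 = time_aware_rv(0.9, ci)
recommended = rv_t0 > 0.75  # 0.75 plays the role of the final threshold
```

In this toy setting the item is not recommended at $t_0 = 5$ because its time function has not yet risen enough; evaluating the same item at a later instant would increase the factor and could push the value over the threshold.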
An item $I_k$ is finally recommended to the user $U_i$ if the time-aware recommendation value computed by Eq. (7) exceeds a configurable threshold $\alpha_5$.

The values of our $\alpha_i$ thresholds (with $i \in \{1,2,3,4,5\}$) must be set and tuned empirically, depending on what policies are considered most suitable when it comes to deciding what items are relevant for each stereotype (in the case of $\alpha_1$), what users belong to each stereotype ($\alpha_2$), and what items are relevant to each user ($\alpha_3$, $\alpha_4$ and $\alpha_5$). If the granularity of the items' and users' characterization is high enough, we can be picky and choose values close to 1; if the characterizations are coarse (as commonly happens when there are few items or users), we have to be more permissive and choose lower values. In the tests described in the next section, we used $\alpha_1 = 0.6$, $\alpha_2 = 0.65$, $\alpha_3 = 0.7$, $\alpha_4 = 0.7$ and $\alpha_5 = 0.75$.

5. Experiments and evaluation

We ran laboratory experiments to corroborate two research hypotheses:

First, we postulate that the consideration of time helps improve the accuracy of the recommendations. To validate this hypothesis, we compared the success of the recommendations made by three filtering approaches, which we will refer to as follows:

- Purely reasoning-based filtering (abbreviated as PRBF): as detailed in López-Nores et al. (2010), this approach relies on semantic reasoning mechanisms to make recommendations in a way that disregards any time-aware filtering process.
- Uncorrected time-driven filtering (UTDF): this approach combines semantic reasoning and filtering driven by parameterized time functions, just like the limited approach to time-aware filtering we proposed in Blanco-Fernández et al. (2008).
- Group corrections-driven filtering (GCDF): this is the approach described in this paper, which modifies the shapes of time functions by group corrections built from consumption stereotypes.
The second hypothesis focuses on the two time-aware approaches, stating that group corrections serve to achieve more accurate recommendations than the parameterized time functions alone.

The evaluation was organized as per the flow diagram depicted in Fig. 5. First, we gathered information about the preferences of a set of real users, who interacted with a prototype we had developed for the testing of recommender systems in online shopping. Next, we categorized the users into three groups, so that each group would receive only the recommendations produced by one of the three filtering strategies mentioned above. After receiving and processing the feedback provided daily by the users, we repeated the personalization process by making new recommendations for each user day after day over the whole testing period. Next, we processed all the data gathered from the users in statistical tests, previously validating certain conditions required to run these tests. The last step of the evaluation was the interpretation of the attained results.

5.1. Adapting the MiSPOT prototype

The users in our experiments interacted with the MiSPOT system presented in López-Nores et al. (2010), which enables a non-invasive and personalized form of advertising to domestic and mobile Digital TV receivers. Originally, the system relied on PRBF to select advertisements suited to the preferences, interests and needs of each individual viewer. Thereupon, it uses multimedia composition abilities defined by the MPEG-4 standard to blend the advertising material with the TV program the user is viewing at any time. The advertisements can be set to launch interactive commercials, as shown in the snapshot depicted in Fig. 6. Taking advantage of the modular design of the MiSPOT prototype, we could modify its personalization logic to run any of the three filtering strategies we wanted to evaluate, so that all the users would be faced with a common interface.

5.2. Real users' preferences collection

Our tests involved 95 users recruited among graduate/undergraduate students from the University of Vigo, their relatives and friends. We ended up with a diverse audience, with disparate demographic data and educational backgrounds, including nearly as many men as women (53% vs. 47%) whose ages range from 17 to 58 years old.

Prior to making any recommendations, we defined a set of 15 consumption stereotypes by clustering the user profiles that had been built up during our previous experiments with MiSPOT (see details in López-Nores et al., 2010). We started from 14 clusters which contained the profiles that had comparatively high (close to 1) or comparatively low (close to −1) DOIs for items classified under Sports, Nature, Technology, Science, Health, Culture or Traveling. One final cluster gathered the profiles that did not meet any of those conditions. From each cluster, we computed one stereotype by averaging the DOIs of the profiles it contained. As a result, just to name a few examples, we got one Sports-related stereotype containing mainly items classified as practice equipment for football and basketball; the Technology-related stereotype contained products like smart phones and Digital TV tuners, whereas camping equipment was a prevailing category in the Nature-related stereotype.

Next, we asked each user to rate his/her interest in topics related to sports, nature, technology, science, health, culture and traveling with a number between 0 and 10, and initialized their individual profiles by weighing the DOIs of the corresponding stereotypes. Finally, we asked the users to identify their most recent purchases out of a list of 355 products.

5.3. Formation of user groups, recommendation and feedback collection

Users were randomly assigned to each one of the three evaluated filtering approaches, resulting in groups of 28, 31 and 36 individuals.
Users included in each group interacted with the corresponding filtering approach for at least 8 h over a period of 3 months. After each session, they were faced with a fixed-size list of the items recommended by each filtering approach, which they had to rate between 0 and 10 (see Fig. 7). At the end, we collected the log files and computed the values of precision attained for each user and each group over the testing period. The precision value for a user was computed as the percentage of recommended items that he/she had rated equal to or greater than 6 over the 3-month testing period. The precision values for the groups of 28, 31 and 36 individuals were computed by averaging the precision values computed for their members.

Fig. 5. Our experimental evaluation flow diagram (stages: adaptation of the MiSPOT prototype; real users' preferences collection; formation of user groups; recommendation and users' feedback collection, repeated over the testing period; checking conditions of statistical analysis; ANOVA tests; discussion on experimental results).

Fig. 6. Some advertising material inserted by the MiSPOT system.
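The precision measure defined in Section 5.3 can be sketched in a few lines of Python. The function names and the sample ratings are hypothetical, and the sketch returns a fraction in [0, 1] rather than a percentage; only the rating threshold of 6 and the averaging over group members come from the text.

```python
def user_precision(ratings, threshold=6):
    """Fraction of recommended items a user rated at or above `threshold`."""
    hits = sum(1 for r in ratings if r >= threshold)
    return hits / len(ratings)


def group_precision(ratings_per_user, threshold=6):
    """Average of the precision values of the members of one group."""
    values = [user_precision(r, threshold) for r in ratings_per_user]
    return sum(values) / len(values)


# Hypothetical ratings (0-10) given by two users of the same group.
alice = [7, 3, 6, 9]
bob = [2, 8]
group_value = group_precision([alice, bob])
```

Note that averaging per-user values gives every user the same weight regardless of how many items he/she rated, which is what the description above implies; pooling all the ratings of a group before computing the fraction would be a different estimator.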
5.4. Checking conditions for statistical analysis

As per our research hypotheses, our experiments had to: (i) analyze the effect of the adopted filtering approach on the precision of recommendations, and (ii) consider the impact of group corrections on the resulting time-aware recommendations. To this aim, as usual in the evaluation of recommender systems (Huang et al., 2002; Kim et al., 2005), we conducted ANOVA (ANalysis Of VAriance) tests with the aid of the statistical software SPSS,⁵ by considering as target variable the precision values observed for the three groups of users. ANOVA is a statistical test used to determine whether more than two population means are equal (Gamst et al., 2008; Jaccard, 2003). The test uses the F probability distribution function and information about the variance of each population and grouping of populations to help decide whether variability between populations and variability within each population are significantly different. The goal of our ANOVA tests was to determine whether the means of the three groups of users were equal (i.e., whether the precision of the recommendations was independent from the filtering approach).

In order to draw correct statistical inferences from the results of ANOVA, the population must be normal in shape, the groups must be independent, and the population variances must be homogeneous (which is typically called the homoscedasticity hypothesis). We have corroborated these assumptions with the aid of SPSS.

Normality is corroborated both theoretically and empirically:

- As we mentioned before, we estimated precision values for individual users as the percentage of recommended items that they had rated equal to or greater than 6 over the testing period. If a user has rated $n$ items, the estimator can be seen as the average of $n$ Bernoulli variables, characterized by the probability $p$ with which he/she would give a rating equal to or greater than 6 to a given item. Assuming that those Bernoulli variables are independent (which is a reasonable approximation due to the diversity of items that could be recommended⁶), the estimator of precision is closely related to a binomial distribution, which can be approximated by a normal distribution if $n$ is large enough (i.e., if we gather a sufficient number of ratings from the user). By computing partial estimations of precision with only the ratings provided in each session, we got histograms like the ones depicted in Fig. 8, which show that partial estimations of precision fit reasonably well with a Gaussian function, so the overall estimations should fit even better.
- On the other hand, when estimating precision for a group of users, we averaged 28, 31 or 36 approximately-normal variables. Since the users made up a diverse audience, we can reasonably assume that they have provided independent ratings, implying that those variables were also independent. Therefore, the normal distribution was indeed a good approximation for the estimator of the precision achieved by each strategy.

We asked SPSS to provide Q–Q plots to compare the precision values measured for the three groups of users against a standard normal population. As can be seen in Fig. 9, the linearity of the points around the principal diagonal suggests that the precision values do not deviate from a random sample from a normal distribution in any systematic manner.

As regards the other conditions, independence of cases is arguably true because the observations come from three randomly assigned groups of users. Finally, we used Levene's test for homogeneity of variances in order to confirm the plausibility of homoscedasticity.

5.5. ANOVA tests

Having checked the necessary assumptions, we used SPSS (i) to obtain information about means, standard deviations and confidence intervals of the precision values computed by the three evaluated filtering approaches (see Table 2), and (ii) to carry out the ANOVA tests shown in Table 3, where F statistics result from dividing the variation existing between the group averages by the variation between the precision values within each group. Besides the quantification of both variation sources (denoted as Sums of Squares), Table 3 contains the associated degrees of freedom and the value adopted by each estimator of population variance (Mean Squares), which is obtained by dividing the sum of squares by the corresponding degrees of freedom.

5.6. Discussion on the experimental results

Assessing significant differences among three groups' means requires comparing the F statistic value in Table 3 against a critical value, which must be looked up in predefined tables considering a specific significance value ($\alpha = 0.01$ in our tests) and the degrees of freedom of the ANOVA tests (2 and 92). Since the resulting F value is 112.703, which is much larger than $F_{tab}(0.01, 2, 92) = 4.844$, we reject the hypothesis of equal population means and conclude that the precision of recommendations varies with the adopted filtering strategy. Besides, the P-value in Table 3 is $6.23 \times 10^{-5}$ (< 0.01), so the test statistic is significant at that level.

Up to this point, we have corroborated the existence of significant differences among the three populations, but our two starting research hypotheses are not yet validated. In order to analyze the trend of precision as per the filtering approach, we ran contrasts and post hoc tests provided by SPSS to compare the average values of the three groups.
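To make the F test concrete, the sketch below computes the quantities of a one-way ANOVA table (sums of squares, degrees of freedom and the F statistic) in plain Python. It is an illustration under stated assumptions, not a reproduction of our SPSS analysis: the function name is hypothetical and the per-user precision values are synthetic numbers invented for the example, chosen only so that the group sizes (28, 31 and 36) match those of the experiment.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation between the group averages.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation between the values within each group.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    # Mean squares are the sums of squares over their degrees of freedom.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


# Synthetic precision values (NOT the experimental data), one list per group.
prbf = [0.50 + 0.01 * (i % 5) for i in range(28)]
utdf = [0.60 + 0.01 * (i % 5) for i in range(31)]
gcdf = [0.75 + 0.01 * (i % 5) for i in range(36)]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova([prbf, utdf, gcdf])
# 4.844 is the critical value quoted in the text for alpha = 0.01 and
# (2, 92) degrees of freedom; it is only valid for those parameters.
reject_equal_means = f_stat > 4.844
```

With three groups and 95 users the degrees of freedom come out as 2 and 92, which matches the values used to look up the critical value in the text.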
As depicted in Table 4, we made two contrasts: the first one compares the purely reasoning-based approach against time-aware approaches jointly, while the second contrast focuses only on the two time-aware approaches by comparing the average precision values of UTDF and GCDF.

Fig. 7. List of recommended items which have been rated by a user.

5 Statistical Package for the Social Sciences (http://www.spss.com).
6 For instance, it seems clear that there should be no interdependency among the ratings given by one user to different movies, clothes, meals or pieces of furniture.