Expert Systems with Applications 36 (2009) 5173–5183
Journal homepage: www.elsevier.com/locate/eswa

A recommender system for research resources based on fuzzy linguistic modeling

C. Porcel a, A.G. López-Herrera a, E. Herrera-Viedma b,*

a Department of Computer Science, University of Jaén, 23071 Jaén, Spain
b Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain
* Corresponding author. E-mail addresses: cporcel@ujaen.es (C. Porcel), viedma@decsai.ugr.es (E. Herrera-Viedma).

2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2008.06.038

Keywords: Recommender systems; Information filtering; Fuzzy linguistic modeling; Multi-granular linguistic information

Abstract

Nowadays, the increasing popularity of the Internet has led to an abundant amount of information created and delivered over electronic media. This makes information access a complex activity for users, who need tools to help them obtain the information they require. Recommender systems are tools whose objective is to evaluate and filter the great amount of information available in a specific scope to assist users in their information access processes. Another obstacle is the great variety of representations of information, especially when users take part in the process, so more flexibility is needed in the information processing. Fuzzy linguistic modeling allows flexible information to be represented and handled. Similar problems are appearing in other frameworks, such as digital academic libraries, research offices, business contacts, etc. We focus on information access processes in technology transfer offices. The aim of this paper is to develop a recommender system for research resources based on fuzzy linguistic modeling. The system helps researchers and environment companies by allowing them to obtain automatically information about research resources (calls or projects) in their areas of interest. It is designed using some filtering tools and a particular fuzzy linguistic modeling, called multi-granular fuzzy linguistic modeling, which is useful when different qualitative concepts have to be assessed. The system is working at the University of Granada and experimental results show that it is feasible and effective.

1. Introduction

A Technology Transfer Office (TTO) is responsible for putting into action and managing the activities which generate knowledge and technical and scientific collaboration, thus enhancing the interrelation between researchers at the University and the entrepreneurial world and their participation in various support programmes designed to carry out research, development and innovation activities. The main mission of this office is to encourage and help, from the University, the generation of knowledge and its spread and transfer to society, with the aim of rapidly meeting society's needs and demands. A graphical representation of this mission is shown in Fig. 1 (The Centre for Innovation, XXXX).

To carry out its objectives, a TTO runs a number of services, among which we highlight the following (The Centre for Innovation, XXXX):

- Information (R&D bulletins, R&D&I, calls, notices, projects).
- Guidance for Research and Development (R&D) and Technology Transfer funding.
- Advice in the preparation of offers (management, spread and exploitation).
- Support in the elaboration and negotiation of contracts with companies.
- Management of contacts.
- Technological offer (the elaboration of the offer, spread and promotion).
- Advice in the creation of new businesses.
- Evaluation, protection and transfer of ownership rights, both intellectual and industrial.

To fulfil these objectives and manage all the services, a TTO is composed of a team of technicians who are experts in technology transfer. Each one manages a specific task, but all of them must provide information about research resources to the researchers and companies, that is, bulletins, projects, calls, notices, events, congresses, courses, and so on. This task requires the expert to select the suitable researchers to whom the information should be delivered. Here we find a first problem: the large increase in research resources means that TTO experts are not able to spread the information to the suitable users (both researchers and companies) in a simple and timely manner. TTO experts therefore need tools to help them cope with the large amount of information available about research resources. A promising direction to improve the information access about research resources concerns
the way in which it is possible to filter the great amount of information available. Recommender systems are tools whose objective is to evaluate and filter the great amount of information available in a specific scope to assist the users in their information access processes (Basu, Hirsh, & Cohen, 1998; Cao & Li, 2007; Hanani, Shapira, & Shoval, 2001; Hsu, 2008; Ungar, Pennock, & Lawrence, 2001; Reisnick & Varian, 1997).

Another problem is the great variety of representations and evaluations of the information. The problem becomes more noticeable when users take part in the process. Therefore, to improve the information representations and the user interface we need more flexibility in the information processing. To solve this problem we propose the use of Fuzzy Linguistic Modeling (FLM) (Ben-Arieh & Zhifeng, 2006; Herrera & Herrera-Viedma, 1997; Herrera, Herrera-Viedma, & Martínez, 2008; Herrera, Herrera-Viedma, & Verdegay, 1996; Herrera & Martínez, 2000; Zadeh, 1975) to represent and handle flexible information by means of linguistic labels.

In this paper, we propose SIRE2IN, a recommender system for recommending research resources based on FLM. The system allows researchers to obtain automatically information about research resources in their areas of interest, and it recommends companies or other researchers who could collaborate with them on projects (Chang, Wang, & Wang, 2007; Chen & Ben-Arieh, 2006; Herrera & Martínez, 2001; Herrera-Viedma, Cordón, Luque, López, & Muñoz, 2003; Herrera-Viedma, Martínez, Mata, & Chiclana, 2005). SIRE2IN is designed using both recommendation techniques and the multi-granular FLM to represent and handle flexible information by means of linguistic labels. To prove the system functionality we have implemented a first version, and the experimental results show its usefulness and effectiveness.

The paper is structured as follows: Section 2 reviews the recommendation approaches and the FLM. Section 3 presents the design of the system, analyzing its architecture, data structure and activity. Section 4 reports the system evaluation and the experimental results. Finally, we point out some concluding remarks.

2. Preliminaries

2.1. Recommender systems

Information gathering on the Internet is a complex activity. Finding the appropriate information required by users on the Web is not a simple task, and the problem becomes more acute with the ever increasing use of the Internet. For example, users who subscribe to Internet lists waste a great deal of time reading, viewing or deleting irrelevant e-mail messages. To improve information access on the Web, users need tools to filter the great amount of information available across the Web. Recommender systems can provide information services by delivering the information to the people who need it. This research area offers tools for discriminating between relevant and irrelevant information by providing personalized assistance for continuous retrieval of information (Reisnick & Varian, 1997).

Recommender systems can be characterized as follows (Hanani et al., 2001; Reisnick & Varian, 1997):

- they are applicable to unstructured or semi-structured data (e.g. Web documents or e-mail messages),
- their users have long-term information needs that are described by means of user profiles,
- they handle large amounts of data,
- they deal primarily with textual data, and
- their objective is to remove irrelevant data from incoming streams of data items.

Traditionally, recommender systems have fallen into two main categories (Good et al., 1999; Hanani et al., 2001; Popescul et al., 2001; Reisnick & Varian, 1997). Content-based recommender systems recommend information by matching the terms used in the representation of user profiles with the index terms used in the representation of documents, ignoring data from other users. These recommender systems tend to fail when little is known about the user's information needs. Collaborative recommender systems use explicit or implicit preferences from many users to recommend documents to a given user, ignoring the representation of documents. These recommender systems tend to fail when little is known about a user, or when he/she has uncommon interests (Popescul et al., 2001). In this kind of system, the users' information preferences can be used to define user profiles that are applied as filters to streams of documents; the recommendations to a user are based on the recommendations of other users with similar profiles. The construction of accurate profiles is a key task, and the system's success will depend to a large extent on the ability of the learned profiles to represent the user's preferences (Quiroga & Mostafa, 2002). Moreover, we can use a hybrid approach to smooth out the disadvantages of each one of them and to exploit their benefits (Basu et al., 1998; Claypool, Gokhale, & Miranda, 1999; Good et al., 1999; Popescul et al., 2001).

On the other hand, we should point out that matching is a main process in the activity of recommender systems. The two major approaches followed in the design and implementation of recommender systems to do the matching are the statistical approach and the knowledge-based approach (Hanani et al., 2001). In our system, we have applied the statistical approach. This approach represents the documents and the user profiles as weighted vectors of index terms. To filter the information, the system implements a statistical algorithm that computes the similarity between the vector of terms that represents the data item being filtered and a user's profile. The most common algorithms used are the Correlation or the Cosine measure between the user's profile and the document's vector (Korfhage, 1997).

The recommendation activity is followed by a relevance feedback phase.
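The statistical matching step described in this subsection can be illustrated with a minimal sketch of the Cosine measure. It assumes that profiles and documents are stored as sparse term-weight dictionaries; the function names, the example terms and the 0.5 threshold are illustrative assumptions and are not taken from SIRE2IN.

```python
import math


def cosine_similarity(profile, document):
    """Cosine measure between two sparse term-weight vectors (dicts)."""
    # Only terms present in both vectors contribute to the dot product.
    shared_terms = set(profile) & set(document)
    dot = sum(profile[t] * document[t] for t in shared_terms)
    norm_profile = math.sqrt(sum(w * w for w in profile.values()))
    norm_document = math.sqrt(sum(w * w for w in document.values()))
    if norm_profile == 0 or norm_document == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_profile * norm_document)


def filter_items(profile, items, threshold=0.5):
    """Return the names of the items whose similarity reaches the threshold.

    The threshold value is a hypothetical choice made for this sketch.
    """
    return [
        name
        for name, vector in items.items()
        if cosine_similarity(profile, vector) >= threshold
    ]


if __name__ == "__main__":
    user_profile = {"fuzzy": 1.0, "retrieval": 1.0}
    resources = {
        "call_1": {"fuzzy": 1.0, "retrieval": 1.0},
        "call_2": {"biology": 1.0},
    }
    print(filter_items(user_profile, resources))
```

Sparse dictionaries are used here because index-term vectors are usually mostly zeros; a dense array representation would work equally well.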
Relevance feedback is a cyclic process whereby the user feeds back into the system decisions on the relevance of retrieved documents and the system then uses these evaluations to automatically update the user profile (Hanani et al., 2001; Popescul et al., 2001; Reisnick & Varian, 1997).

Fig. 1. Main mission in a TTO. (Diagram: researchers (research groups), environment companies and the TTO; generation of knowledge and its transfer to the society.)

Another important aspect that we must keep in mind when we design a recommender system is the method used to gather user information. In order to discriminate between relevant and irrelevant information for a user, we must have some information about this user, i.e., we must know the user preferences. Information about user preferences can be obtained in two different ways (Hanani et al., 2001; Quiroga & Mostafa, 2002), implicitly and explicitly, although these ways need not be mutually exclusive.

The implicit approach is implemented by inference from some kind of observation. The observation is applied to user behavior or to detecting a user's environment (such as bookmarks or visited URLs). The user preferences are updated by detecting changes while observing the user. On the other hand, the explicit approach interacts with the users by acquiring feedback on information that is filtered, that is, the users express some specifications of what they desire. This last approach is widely used (Hanani et al., 2001; Popescul et al., 2001; Reisnick & Varian, 1997).

2.2. Fuzzy linguistic modeling

There are situations in which the information cannot be assessed precisely in a quantitative form but may be in a qualitative one. For example, when attempting to qualify phenomena related to human perception, we are often led to use words in natural language instead of numerical values. In other cases, precise quantitative information cannot be stated because either it is unavailable or the cost of its computation is too high, and an approximate value is acceptable.
The use of Fuzzy Sets Theory has given very good results for modeling qualitative information (Zadeh, 1975) and it has proven to be useful in many problems, e.g., in decision making (Herrera, Herrera-Viedma, & Verdegay, 1996; Herrera et al., 1996; Herrera, Herrera-Viedma, & Verdegay, 1998; Xu, 2006), quality evaluation (Herrera-Viedma, Pasi, López-Herrera, & Porcel, 2006; Herrera-Viedma & Peis, 2003; Herrera-Viedma, Peis, Morales del Castillo, Alonso, & Anaya, 2007), information retrieval (HerreraViedma, 2001; Herrera-Viedma, 2001; Herrera-Viedma & LópezHerrera, 2007; Herrera-Viedma, López-Herrera, Luque, & Porcel, 2007; Herrera-Viedma, López-Herrera, & Porcel, 2005), political analysis (Arfi, 2005), etc. It is a tool based on the concept of linguistic variable proposed by Zadeh (1975). Next we analyze the two approaches of FLM that we use in our system. 2.2.1. The 2-tuple fuzzy linguistic approach The 2-tuple FLM (Herrera & Martı´ nez, 2000) is a continuous model of representation of information that allows to reduce the loss of information typical of other fuzzy linguistic approaches (classical and ordinal Herrera & Herrera-Viedma, 1997; Zadeh, 1975). To define it we have to establish the 2-tuple representation model and the 2-tuple computational model to represent and aggregate the linguistic information, respectively. Let S ¼ fs0; ... ; sg g be a linguistic term set with odd cardinality, where the mid term represents a indifference value and the rest of the terms are symmetric relate to it. We assume that the semantics of labels is given by means of triangular membership functions and consider all terms distributed on a scale on which a total order is defined, si 6 sj () i 6 j. In this fuzzy linguistic context, if a symbolic method (Herrera & Herrera-Viedma, 1997; Herrera et al., 1996) aggregating linguistic information obtains a value b 2 ½0; g, and b R f0; ... ; gg; then an approximation function is used to express the result in S. Definition 1. 
(Herrera & Martı´nez, 2000). Let b be the result of an aggregation of the indexes of a set of labels assessed in a linguistic term set S, i.e., the result of a symbolic aggregation operation, b 2 ½0; g. Let i ¼ roundðbÞ and a ¼ b i be two values, such that, i 2 ½0; g and a 2 ½:5; :5Þ then a is called a Symbolic Translation. The 2-tuple fuzzy linguistic approach is developed from the concept of symbolic translation by representing the linguistic information by means of 2-tuples ðsi; aiÞ, si 2 S and ai 2 ½:5; :5Þ: si represents the linguistic label of the information, and ai is a numerical value expressing the value of the translation from the original result b to the closest index label, i, in the linguistic term set (si 2 S). This model defines a set of transformation functions between numeric values and 2-tuples. Definition 2. (Herrera & Martı´nez, 2000). Let S ¼ fs0; ... ; sgg be a linguistic term set and b 2 ½0; g a value representing the result of a symbolic aggregation operation, then the 2-tuple that expresses the equivalent information to b is obtained with the following function: D : ½0; g ! S ½0:5; 0:5Þ; DðbÞ¼ðsi; aÞ; with si i ¼ roundðbÞ; a ¼ b i a 2 ½:5; :5Þ; where roundðÞ is the usual round operation, si has the closest index label to ‘‘b” and ‘‘a” is the value of the symbolic translation. For all D there exists D1 , defined as D1 ðsi; aÞ ¼ i þ a. On the other hand, it is obvious that the conversion of a linguistic term into a linguistic 2-tuple consists of adding a symbolic translation value of 0: si 2 S ) ðsi; 0Þ. The computational model is defined by presenting the following operators: 1. Negation operator: Negððsi; aÞÞ ¼ Dðg ðD1 ðsi; aÞÞÞ. 2. Comparison of 2-tuples ðsk; a1Þ and ðsl; a2Þ: If k a2 then ðsk; a1Þ is bigger than ðsl; a2Þ. 3. Aggregation operators. The aggregation of information consists of obtaining a value that summarizes a set of values, therefore, the result of the aggregation of a set of 2-tuples must be a 2-tuple. 
In the literature we can find many aggregation operators which allow us to combine the information according to different criteria. Using the functions $\Delta$ and $\Delta^{-1}$, which transform numerical values into linguistic 2-tuples and vice versa without loss of information, any of the existing aggregation operators can be easily extended to deal with linguistic 2-tuples. Some examples are:

Definition 3. Arithmetic mean: Let $x = \{(r_1, \alpha_1), \ldots, (r_n, \alpha_n)\}$ be a set of linguistic 2-tuples; the 2-tuple arithmetic mean $\bar{x}^e$ is computed as

$$\bar{x}^e[(r_1, \alpha_1), \ldots, (r_n, \alpha_n)] = \Delta\left(\sum_{i=1}^{n} \frac{1}{n} \Delta^{-1}(r_i, \alpha_i)\right) = \Delta\left(\frac{1}{n} \sum_{i=1}^{n} \beta_i\right).$$

Definition 4. Weighted average operator: Let $x = \{(r_1, \alpha_1), \ldots, (r_n, \alpha_n)\}$ be a set of linguistic 2-tuples and $W = \{w_1, \ldots, w_n\}$ be their associated weights. The 2-tuple weighted average $\bar{x}^w$ is

$$\bar{x}^w[(r_1, \alpha_1), \ldots, (r_n, \alpha_n)] = \Delta\left(\frac{\sum_{i=1}^{n} \Delta^{-1}(r_i, \alpha_i) \cdot w_i}{\sum_{i=1}^{n} w_i}\right) = \Delta\left(\frac{\sum_{i=1}^{n} \beta_i \cdot w_i}{\sum_{i=1}^{n} w_i}\right).$$
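The representation and computational models of Definitions 1 to 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`delta`, `delta_inv`, `neg`, `compare`, `arithmetic_mean`, `weighted_average`) are hypothetical, a 2-tuple $(s_i, \alpha)$ is stored as the pair `(i, alpha)`, and rounding is done half-up so that $\alpha$ stays in $[-0.5, 0.5)$ (Python's built-in `round` rounds halves to the nearest even integer, which would not guarantee that interval).

```python
import math
from typing import Sequence, Tuple

# A 2-tuple (s_i, alpha) is stored as (label index i, symbolic translation alpha).
TwoTuple = Tuple[int, float]


def delta(beta: float, g: int) -> TwoTuple:
    """Delta: map beta in [0, g] to a 2-tuple with alpha in [-0.5, 0.5)."""
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)  # round half up (assumption, see lead-in)
    return i, beta - i


def delta_inv(value: TwoTuple) -> float:
    """Delta^-1: map a 2-tuple back to its numerical value i + alpha."""
    i, alpha = value
    return i + alpha


def neg(value: TwoTuple, g: int) -> TwoTuple:
    """Negation operator: Delta(g - Delta^-1(s_i, alpha))."""
    return delta(g - delta_inv(value), g)


def compare(a: TwoTuple, b: TwoTuple) -> int:
    """Return -1, 0 or 1; tuples compare by index first, then by alpha."""
    return (a > b) - (a < b)


def arithmetic_mean(values: Sequence[TwoTuple], g: int) -> TwoTuple:
    """Definition 3: Delta of the mean of the Delta^-1 values."""
    return delta(sum(delta_inv(v) for v in values) / len(values), g)


def weighted_average(values: Sequence[TwoTuple],
                     weights: Sequence[float], g: int) -> TwoTuple:
    """Definition 4: Delta of the weighted mean with numerical weights."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    beta = sum(delta_inv(v) * w for v, w in zip(values, weights)) / total
    return delta(beta, g)


# Usage with a 5-label term set (g = 4).
mean_value = arithmetic_mean([(1, 0.0), (2, 0.0), (3, 0.0)], g=4)
weighted_value = weighted_average([(1, 0.0), (2, 0.0), (4, 0.0)], [1, 1, 2], g=4)
```

Because $\alpha$ is confined to $[-0.5, 0.5)$, comparing the pairs lexicographically gives the same order as comparing the values $i + \alpha$, which is why `compare` can rely on plain tuple comparison.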
Definition 5. Linguistic weighted average operator: Let $x = \{(r_1, \alpha_1), \ldots, (r_n, \alpha_n)\}$ be a set of linguistic 2-tuples and $W = \{(w_1, \alpha^w_1), \ldots, (w_n, \alpha^w_n)\}$ be their linguistic 2-tuple associated weights. The 2-tuple linguistic weighted average $\bar{x}^w_l$ is

$$\bar{x}^w_l[((r_1, \alpha_1), (w_1, \alpha^w_1)) \ldots ((r_n, \alpha_n), (w_n, \alpha^w_n))] = \Delta\left(\frac{\sum_{i=1}^{n} \beta_i \cdot \beta_{W_i}}{\sum_{i=1}^{n} \beta_{W_i}}\right),$$

with $\beta_i = \Delta^{-1}(r_i, \alpha_i)$ and $\beta_{W_i} = \Delta^{-1}(w_i, \alpha^w_i)$.

2.2.2. The multi-granular fuzzy linguistic modeling

In any fuzzy linguistic approach, an important parameter to determine is the "granularity of uncertainty", i.e., the cardinality of the linguistic term set $S$. According to the uncertainty degree that an expert qualifying a phenomenon has on it, the linguistic term set chosen to provide his/her knowledge will have more or fewer terms. When different experts have different uncertainty degrees on the phenomenon, several linguistic term sets with a different granularity of uncertainty are necessary (Herrera & Martínez, 2001; Herrera-Viedma et al., 2005). The use of different label sets to assess information is also necessary when an expert has to assess different concepts, as happens for example in information retrieval problems, to evaluate the importance of the query terms and the relevance of the retrieved documents (Herrera-Viedma et al., 2003). In such situations, we need tools for the management of multi-granular linguistic information. Herrera & Martínez (2001) propose a multi-granular 2-tuple FLM based on the concept of linguistic hierarchy (Cordón, Herrera, & Zwir, 2001).

A Linguistic Hierarchy, $LH$, is a set of levels $l(t, n(t))$, i.e., $LH = \bigcup_t l(t, n(t))$, where each level $t$ is a linguistic term set with a granularity $n(t)$ different from that of the remaining levels of the hierarchy (Cordón et al., 2001). The levels are ordered according to their granularity, i.e., a level $t + 1$ provides a linguistic refinement of the previous level $t$. We can define a level from its predecessor level as $l(t, n(t)) \to l(t + 1, 2 \cdot n(t) - 1)$. Table 1 shows the granularity needed in each linguistic term set of the level $t$ depending on the value $n(t)$ defined in the first level (3 and 7, respectively). A graphical example of a linguistic hierarchy is shown in Fig. 2.

Herrera & Martínez (2001) demonstrated that linguistic hierarchies are useful to represent multi-granular linguistic information and allow us to combine it without loss of information. To do this, a family of transformation functions between labels from different levels was defined:

Definition 6. Let $LH = \bigcup_t l(t, n(t))$ be a linguistic hierarchy whose linguistic term sets are denoted as $S^{n(t)} = \{s^{n(t)}_0, \ldots, s^{n(t)}_{n(t)-1}\}$. The transformation function between a 2-tuple that belongs to level $t$ and another 2-tuple in level $t' \neq t$ is defined as:

$$TF^{t}_{t'} : l(t, n(t)) \to l(t', n(t')), \qquad TF^{t}_{t'}\left(s^{n(t)}_i, \alpha^{n(t)}\right) = \Delta\left(\frac{\Delta^{-1}\left(s^{n(t)}_i, \alpha^{n(t)}\right) \cdot (n(t') - 1)}{n(t) - 1}\right).$$

As pointed out in Herrera & Martínez (2001), this family of transformation functions is bijective. This result guarantees that the transformations between levels of a linguistic hierarchy are carried out without loss of information. To define the computational model, we select a level to make the information uniform (for instance, the level with the greatest granularity) and then we can use the operators defined in the 2-tuple FLM.

3. SIRE2IN, a Recommender system for research resources

In this section, we present SIRE2IN, a recommender system based on multi-granular FLM.

As we said in the introduction, the TTO technicians manage and spread a lot of information about research resources such as calls or projects. Nowadays, this amount of information is growing and the experts need automatic tools to filter and spread the information in a simple and timely manner. Because of this, our system incorporates in its activity a filtering process that follows the content-based approach.
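Returning to Definitions 5 and 6 of Section 2.2.2, the linguistic weighted average and the change of level inside a linguistic hierarchy can be sketched as follows. The sketch is illustrative only: the names `transform` and `linguistic_weighted_average` are hypothetical, 2-tuples are stored as `(index, alpha)` pairs, and `delta` rounds half-up (an assumption made so that $\alpha$ stays in $[-0.5, 0.5)$).

```python
import math
from typing import Sequence, Tuple

TwoTuple = Tuple[int, float]  # (label index, symbolic translation)


def delta(beta: float, g: int) -> TwoTuple:
    """Delta: numerical value in [0, g] -> 2-tuple (round half up)."""
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)
    return i, beta - i


def delta_inv(value: TwoTuple) -> float:
    """Delta^-1: 2-tuple -> numerical value i + alpha."""
    return value[0] + value[1]


def transform(value: TwoTuple, n_from: int, n_to: int) -> TwoTuple:
    """Definition 6: move a 2-tuple from a level with n_from labels
    to a level with n_to labels of the same linguistic hierarchy."""
    beta = delta_inv(value) * (n_to - 1) / (n_from - 1)
    return delta(beta, n_to - 1)


def linguistic_weighted_average(values: Sequence[TwoTuple],
                                weights: Sequence[TwoTuple],
                                g: int) -> TwoTuple:
    """Definition 5: weighted average whose weights are also 2-tuples."""
    betas = [delta_inv(v) for v in values]
    beta_w = [delta_inv(w) for w in weights]
    total = sum(beta_w)
    if total == 0:
        raise ValueError("the linguistic weights must not all be null")
    return delta(sum(b * w for b, w in zip(betas, beta_w)) / total, g)


# Usage with the 3-5-9 hierarchy: level 2 has 5 labels, level 3 has 9.
medium_in_9 = transform((2, 0.0), n_from=5, n_to=9)   # middle label stays the middle label
down = transform((5, 0.0), n_from=9, n_to=5)          # 9 labels -> 5 labels
back = transform(down, n_from=5, n_to=9)              # round trip
average = linguistic_weighted_average(
    [(4, 0.0), (2, 0.0)], [(3, 0.0), (1, 0.0)], g=4)
```

The round trip `back` returns the starting 2-tuple, which illustrates in one case the bijectivity that the text attributes to this family of functions.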
Moreover, to improve the representation of the information in the system we use multi-granular linguistic information, that is, different label sets to represent the different concepts to be assessed for different users in the filtering activity.

SIRE2IN then filters the incoming information stream and generates useful recommendations for the suitable researchers in accordance with their research areas. For each user the system generates an email with a summary of the resources, their relevance degrees and recommendations about collaboration possibilities.

Fig. 2. Linguistic Hierarchy of 3, 5 and 9 labels.

Table 1. Linguistic hierarchies.

              Level 1    Level 2     Level 3
l(t, n(t))    l(1, 3)    l(2, 5)     l(3, 9)
l(t, n(t))    l(1, 7)    l(2, 13)

3.1. Architecture of SIRE2IN

The architecture of SIRE2IN (Fig. 3) has three main components:

Resources management. This module is responsible for managing the information sources from which the TTO experts receive all the information about research resources. It obtains an internal representation of these items. Examples of information sources are the Internet, news bulletins, distribution lists, forums, etc. To manage the items, we represent them in accordance with their scope using the UNESCO terminology for science and technology (The UNESCO terminology, XXXX). This terminology is composed of three levels, each one a refinement of the previous level. The first level includes general topics, which are codified by two digits. Each topic includes some disciplines, codified by four digits, in a second level. The third level is composed of subdisciplines that represent the activities developed in each discipline; these subdisciplines are codified by six digits. We operate with the first and second levels, because we think the third level supplies too high a level of discrimination, which could hinder the interaction with the users. Moreover, for each resource we store other information that the system uses in the filtering process.

User profiles management. The users can be researchers of the University or employees of the environment companies. In both cases, the system operates with an internal representation of the user's preferences or needs, that is, the system represents each user through a user profile. To define a user profile we use the basic information about the user and his/her topics of interest, also represented by the UNESCO terminology (The UNESCO terminology, XXXX), i.e. each user has a list of UNESCO codes according to his/her information needs or interests. Both research groups and companies are assigned a set of UNESCO codes that define their research activity. So, initially the system assigns to each user the UNESCO codes of his/her research group or company and afterwards, users can update their profiles in a feedback phase in which they can express explicit specifications of their preferences.

Filtering process. This component filters the incoming information to deliver it to the fitting users. The filtering process is based on a matching process. As our system is a content-based recommender system, it filters the information by matching the terms used in the representation of user profiles against the index terms used in the representation of resources. Later, we will study this process in detail taking into account the data structures.

3.2. Data structures

In this subsection, we discuss the data structures that we need to represent all the information about the users and the research resources. We must bear in mind that the system stores this information because it does not work with explicit user queries.
To characterize a research resource, we use the following information:

- title, abstract, text, date, source,
- link: when the system sends the users information about a resource, it does not send all the information but a summary and the link to access the resource,
- target: this field indicates the kind of users at which the resource is aimed, that is, researchers, companies or anybody,
- minimum and maximum amount: it indicates the minimum and maximum amount that the user can apply for,
- scope: the system manages the resources in accordance with their scope. To represent the resource scope we use the vector model, where for each resource the system stores a vector VR, i.e., an ordered list of terms. To build this vector we follow the UNESCO terminology (The UNESCO terminology, XXXX), specifically the second level. This level has 248 disciplines, so the vector must have 248 positions, one position for each discipline. In each position the vector stores a 2-tuple linguistic value which represents the importance degree, for the resource scope, of the UNESCO code represented in that position.

To set up a user profile we use the following information:

- user's identity: usually his/her mail,
- password: necessary to access the system,
- dni: national identity document,
- name and surname,
- department and center if the user is a University researcher, or the company if the user is a company employee,
- address, phone number, mobile phone and fax,
- email: basic information needed to send the resources and recommendations,
- research group: only if the user belongs to a research group.
  We use a code which is a string composed of six characters, three letters indicating the research area and three digits identifying the group,
- collaboration preferences: whether the user wants to collaborate with other researchers of a different group, with companies, with anybody or with nobody,
- minimum and maximum amount: the users define the interval of amounts for which they are interested in applying to a call,
- topics of interest: to represent the topics of interest we use the vector model too, where for each user the system stores a vector VU. To build this vector we follow the UNESCO terminology (The UNESCO terminology, XXXX), specifically the second level. This level has 248 disciplines, so the vector must have 248 positions, one position for each discipline. In each position the vector stores a 2-tuple linguistic value which represents the importance degree, for the user's topics of interest, of the UNESCO code represented in that position.

Fig. 3. Structure of SIRE2IN.

On the other hand, to represent the linguistic information we use different label sets, i.e. the communication among the users and the system is carried out by using multi-granular linguistic
information, in order to allow a higher flexibility in the communication processes of the system. Therefore the system uses different label sets $(S_1, S_2, \ldots)$ to represent the different concepts to be assessed in its filtering activity. These label sets $S_i$ are chosen from those label sets that compose a $LH$, i.e., $S_i \in LH$. We should point out that the number of different label sets that we can use is limited by the number of levels of $LH$, and therefore, in many cases the label sets $S_i$ and $S_j$ can be associated with the same label set of $LH$ but with different interpretations depending on the concept to be modeled. In our system, we distinguish between three concepts that can be assessed:

- importance degree ($S_1$) of a UNESCO code with respect to a resource scope or user preferences,
- relevance degree ($S_2$) of a resource for a researcher or for a company,
- compatibility degree ($S_3$) between a researcher and a company, between researchers of different groups and between different companies.

Following the linguistic hierarchy shown in Fig. 2, in our system we use the level 2 (5 labels) to assign importance degrees ($S_1 = S^5$) and the level 3 (9 labels) to assign relevance degrees ($S_2 = S^9$) and compatibility degrees ($S_3 = S^9$). Using this $LH$ the linguistic terms in each level are

$S^5 = \{b_0 = Null = N, \; b_1 = Low = L, \; b_2 = Medium = M, \; b_3 = High = H, \; b_4 = Total = T\}$

$S^9 = \{c_0 = Null = N, \; c_1 = Very\ Low = VL, \; c_2 = Low = L, \; c_3 = More\ Less\ Low = MLL, \; c_4 = Medium = M, \; c_5 = More\ Less\ High = MLH, \; c_6 = High = H, \; c_7 = Very\ High = VH, \; c_8 = Total = T\}$

Therefore, for a resource $i$ we have a vector representing its scope:

$$VR_i = (VR_i[1], VR_i[2], \ldots, VR_i[248]),$$

where each component $VR_i[j] \in S_1$, with $j = 1, \ldots, 248$, stores a linguistic 2-tuple indicating the importance degree of the UNESCO code $j$ with regard to the resource $i$. These 2-tuples are assigned by the TTO technicians.

To represent the topics of interest in the user profiles we follow the same method, using a vector VU for each user of the system.
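The label sets and the 248-position scope vector described above can be sketched as plain Python data. This is a minimal sketch under stated assumptions: the helper names (`empty_vector`, `set_importance`, `non_null`) are hypothetical, 2-tuples are stored as `(index, alpha)` pairs, and position 42 is an arbitrary illustrative position, not a real UNESCO code assignment.

```python
from typing import Dict, List, Tuple

TwoTuple = Tuple[int, float]  # (label index, symbolic translation)

# Label sets of levels 2 and 3 of the hierarchy, indexed by label position.
S5 = ["Null", "Low", "Medium", "High", "Total"]
S9 = ["Null", "Very Low", "Low", "More Less Low", "Medium",
      "More Less High", "High", "Very High", "Total"]

N_DISCIPLINES = 248  # UNESCO codes of level 2, one vector position each


def empty_vector() -> List[TwoTuple]:
    """A vector in which every UNESCO code has importance Null, (b0, 0)."""
    return [(0, 0.0)] * N_DISCIPLINES


def set_importance(vector: List[TwoTuple], position: int,
                   label: str, alpha: float = 0.0) -> None:
    """Store an importance degree from S1 = S5 at a 1-based position."""
    if not 1 <= position <= N_DISCIPLINES:
        raise ValueError("position must be between 1 and 248")
    vector[position - 1] = (S5.index(label), alpha)


def non_null(vector: List[TwoTuple]) -> Dict[int, Tuple[str, float]]:
    """Readable view: 1-based positions whose importance is not Null."""
    return {pos: (S5[i], alpha)
            for pos, (i, alpha) in enumerate(vector, start=1)
            if (i, alpha) != (0, 0.0)}


# Usage: a resource whose scope gives importance High to one discipline.
vr = empty_vector()
set_importance(vr, 42, "High")   # position 42 is purely illustrative
summary = non_null(vr)
```

A dense list mirrors the fixed 248-position layout of the text; since most positions hold Null, a sparse dictionary keyed by position would be an equally reasonable choice.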
Then, for the user $x$, we have a vector:

$$VU_x = (VU_x[1], VU_x[2], \ldots, VU_x[248]),$$

where each component $VU_x[y] \in S_1$, with $y = 1, \ldots, 248$, stores a linguistic 2-tuple indicating the importance degree of the UNESCO code $y$ with regard to the preferences of the user $x$. These 2-tuples are also assigned by the experts, but the users can edit them whenever they want.

3.3. Activity of SIRE2IN

The system activity can be described briefly in three steps:

1. An expert receives or finds information about a research resource and inserts it into the system.
2. Then, the system runs the matching process to determine the fitting users to receive the information and sends them an email with the information, the calculated relevance degree and recommendations about possible collaborations with other users.
3. Once the users have received the information, they can change the kind of information they want to receive in the future by updating their user profiles.

In Fig. 4 we can see the main page of the system once the user is logged in. Users can access the different options depending on the permissions they have been assigned (user, expert or administrator).

3.3.1. Users insertion process

This process consists of incorporating the users' data into the system. It presents a form where the users insert their personal information, collaboration preferences and preferences about the resources. Users are invited to define their topics of interest and choose importance degrees (assessed in $S_1$) associated with them.

In order to gather information about users we use a hybrid approach, that is, when we insert a new user we use implicit information to generate the profile and afterwards the users can update their profiles following the explicit approach. Initially a user is associated with the topics of interest of his/her research group or company, but he/she can modify them.
Each group or company is assigned one or more UNESCO codes, so the system assigns the user the UNESCO codes of level 2 of his/her group or company with importance degree Total ($(b_4, 0)$, with $b_4 \in S_1$). The other positions have the value Null ($(b_0, 0)$, with $b_0 \in S_1$). Later, the users can update their profiles whenever they want, accessing the system and editing the UNESCO codes or the importance degrees they have been assigned.

The system registers users and assigns them an identifier (email) and a password. Finally, the users receive a confirmation email with the inserted information.

Fig. 4. Main page of SIRE2IN.

Example 1. In this example, we are going to insert a new user. The user fills in the form with his/her information. Let us suppose the user belongs to a group which works in Science of Nutriment; then he/she is assigned the UNESCO code 3206. Then, to define the vector of topics of interest the system assigns the user this code (3206) with degree Total ($(b_4, 0)$, with $b_4 \in S_1$). With this information the user profile is represented by a vector of topics of interest with the following values:
$$VU_{ID}[x] = \begin{cases} (b_4, 0), & \text{if } x = 100 \\ (b_0, 0), & \text{otherwise.} \end{cases}$$

Remark. The UNESCO code 3206 is in position 100 of the list, so it is stored in $VU_{ID}[100]$.

Fig. 5 shows an example of a user access.

3.3.2. Resources insertion process

This sub-process is carried out by the experts, i.e., the technology transfer technicians who receive or find information about a resource and want to spread it. The experts insert the interesting resources into the system, which automatically sends the information to the suitable users along with a relevance degree and collaboration possibilities.

As we said in the previous section, the system stores the general information about the resource and its scope. The scope is represented by a vector of UNESCO codes, so to insert a resource the experts decide which UNESCO codes to assign to it. Moreover, to manage the linguistic information, the experts also choose a linguistic 2-tuple $(b_i, \alpha_i)$, with $b_i \in S_1$, to weight the importance degree of each level 2 UNESCO code with regard to the resource scope.

Hence, when the experts are going to insert a new resource, they access the system, insert all the information about it and finally assess the importance degree of each level 2 UNESCO code with regard to the resource. To do this, the system shows a list of level 2 UNESCO codes and the experts decide which codes to assign to the resource scope, selecting a code from the list and assigning it a linguistic label to assess its importance degree. Then they accept and can either add another UNESCO code or finalize the resource insertion.

Example 2. Now let us suppose the expert receives a call i about a nutriment science research resource. Then, he/she inserts the call into the system, introducing all the available information and selecting from a list the UNESCO codes which match the call.
In this example, the expert could select the codes 3206 – Science of Nutriment with importance degree Total ($(b_4, 0)$, with $b_4 \in S_1$) and 3309 – Food Technology with degree Very High ($(b_3, 0)$, with $b_3 \in S_1$). Once the expert inserts this information, we have a vector $VR_i$ defining the resource i with the following values:

$$VR_i[j] = \begin{cases} (b_4, 0), & \text{if } j = 100 \\ (b_3, 0), & \text{if } j = 118 \\ (b_0, 0), & \text{otherwise.} \end{cases}$$

Remark. The UNESCO codes 3206 and 3309 are in positions 100 and 118 of the list, so they are stored in $VR_i[100]$ and $VR_i[118]$, respectively.

An example of a resource list is shown in Fig. 6.

3.3.3. Filtering process

As we have said, we use the vector model (Korfhage, 1997) to represent the resource scope and the user's topics of interest. This vector model uses similarity calculations, such as the Euclidean distance or the cosine measure, to do the matching process. Specifically, we use the standard cosine measure (Korfhage, 1997). However, as we have linguistic values, we need to introduce a new linguistic similarity measure:

$$\sigma_l(VR, VU) = \Delta\left( \frac{\sum_{k=1}^{n} \left( \Delta^{-1}(r_k, \alpha_{rk}) \times \Delta^{-1}(u_k, \alpha_{uk}) \right)}{\sqrt{\sum_{k=1}^{n} \left( \Delta^{-1}(r_k, \alpha_{rk}) \right)^2} \times \sqrt{\sum_{k=1}^{n} \left( \Delta^{-1}(u_k, \alpha_{uk}) \right)^2}} \right),$$

where n is the number of terms used to define the vectors (i.e. the number of level 2 UNESCO codes), $(r_k, \alpha_{rk})$ is the 2-tuple linguistic value of term k in the resource vector (VR), and $(u_k, \alpha_{uk})$ is its 2-tuple linguistic value in the user vector (VU). With this similarity measure we obtain a linguistic value that assesses the similarity between a resource and a user. In the case of two users or two resources, this linguistic similarity measure can be applied in a similar way.
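As a rough illustration of the similarity measure, the following Python sketch applies it to the vectors of Examples 1 and 2. It is not the SIRE2IN implementation: the sparse dictionary representation (position mapped to a pair of label index and symbolic translation, with missing positions read as $(b_0, 0)$), the function names and the optional `scale` argument are all assumptions made for this example. In particular, `scale` does not appear in the formula as printed; with its default value of 1 the code follows the formula literally, and passing the index of the top label (4 for $S_1$ in the examples) is just one hypothetical way to spread the result over the whole label range.

```python
import math


def delta(beta):
    """Delta: numeric value -> 2-tuple (label index, alpha), alpha in [-0.5, 0.5)."""
    index = math.floor(beta + 0.5)
    return index, beta - index


def delta_inv(two_tuple):
    """Delta^-1: 2-tuple (label index, alpha) -> numeric value."""
    index, alpha = two_tuple
    return index + alpha


def linguistic_similarity(vr, vu, scale=1.0):
    """Cosine measure over Delta^-1 values, mapped back to a 2-tuple.

    vr, vu: sparse vectors {position: (label index, alpha)}; positions
    that are absent are treated as (b0, 0) and contribute nothing.
    scale: hypothetical extra factor, not part of the printed formula.
    """
    r = {k: delta_inv(v) for k, v in vr.items()}
    u = {k: delta_inv(v) for k, v in vu.items()}
    dot = sum(r[k] * u[k] for k in r.keys() & u.keys())
    norm_r = math.sqrt(sum(x * x for x in r.values()))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    if norm_r == 0 or norm_u == 0:
        return delta(0.0)  # an all-Null vector is similar to nothing
    return delta(scale * dot / (norm_r * norm_u))


# Resource of Example 2 and user profile of Example 1.
vr_i = {100: (4, 0.0), 118: (3, 0.0)}
vu_id = {100: (4, 0.0)}

index, alpha = linguistic_similarity(vr_i, vu_id)
print(index, round(alpha, 2))  # cosine 0.8 -> 1 -0.2
```

Here the cosine is 16 / (5 × 4) = 0.8, so the literal formula yields the 2-tuple $(b_1, -0.2)$, while `scale=4.0` would yield $(b_3, 0.2)$.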
Fig. 5. Example of a user access.

Following this approach, when a new resource has been inserted into the system, we compute the linguistic similarity measure $\sigma_l(VR_i, VU_j)$ between the new resource scope vector ($VR_i$) and all the user vectors ($VU_j$, $j = 1, \ldots, m$, where m is the number of users of the system) to find the suitable users to deliver this information to. If $\sigma_l(VR_i, VU_j) \geq \psi$, where $\psi$ is a previously defined linguistic threshold value used to filter out the information, the user j is chosen. In this step, the system also takes into account the user preferences (kind of resources and amounts) to decide whether to consider the user or not. The collaboration preferences are used to classify the selected users into two sets, collaborators $U_C$ and non-collaborators $U_N$.
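The selection step just described can be sketched as follows. This is a minimal sketch under stated assumptions, not the system's code: the `Candidate` record, its field names, the helper `two_tuple_value` and the threshold used in the example are hypothetical, and the check of resource kind and amount preferences is reduced to a single precomputed boolean because the text does not detail it. Two-tuples are compared through their numeric value (label index plus symbolic translation).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    user_id: str
    similarity: tuple          # 2-tuple (label index, alpha) for sigma_l(VR_i, VU_j)
    accepts_resource: bool     # assumed result of the kind/amount preference check
    wants_collaboration: bool  # collaboration preference from the user profile


def two_tuple_value(two_tuple):
    """Numeric value of a 2-tuple, used only to compare two 2-tuples."""
    index, alpha = two_tuple
    return index + alpha


def filter_users(candidates, threshold):
    """Split the users passing the threshold into (U_C, U_N)."""
    psi = two_tuple_value(threshold)
    collaborators, non_collaborators = [], []
    for c in candidates:
        if two_tuple_value(c.similarity) < psi:
            continue  # similarity below the linguistic threshold psi
        if not c.accepts_resource:
            continue  # rejected by resource kind or amount preferences
        target = collaborators if c.wants_collaboration else non_collaborators
        target.append(c.user_id)
    return collaborators, non_collaborators


candidates = [
    Candidate("u1", (3, 0.2), True, True),
    Candidate("u2", (2, 0.0), True, False),
    Candidate("u3", (1, 0.4), True, True),   # below the threshold
    Candidate("u4", (4, 0.0), False, True),  # preferences reject the resource
]
u_c, u_n = filter_users(candidates, threshold=(2, 0.0))
print(u_c, u_n)  # ['u1'] ['u2']
```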
After this, the system has two sets of selected users, $U_N$ and $U_C$, and for each selected user j it has a value $\sigma_l(VR_i, VU_j) \geq \psi$. The system applies to each $\sigma_l(VR_i, VU_j)$ the transformation function defined in Definition 6 to obtain the relevance degree of the resource i for the user j, expressed in the set $S_2$. Then, the system sends to the users in $U_N$ the resource information and its calculated linguistic relevance degree.

Fig. 6. Example of resources list.

Fig. 7. Filtering process for a user j.

For the users in $U_C$ the system performs an additional step: it calculates the collaboration possibilities between the selected users. To do so, for each pair of users $x, y \in U_C$ the system has:
– to analyze whether the users are researchers or company employees, taking into account the users' preferences about it (for example, a researcher might want to collaborate only with other researchers from a different research group),
– to calculate the linguistic similarity measure between the users, $\sigma_l(VU_x, VU_y)$,
– to obtain the compatibility degree between x and y, expressing $\sigma_l(VU_x, VU_y)$ as a linguistic label in $S_3$ (using the transformation function defined in Definition 6), in order to send it to the user.

Finally, the system sends to the users in $U_C$ the resource information, its calculated linguistic relevance degree and the collaboration possibilities along with a linguistic compatibility degree. Fig. 7 shows the whole process.

3.3.4. Feedback phase

This phase is related to the activity carried out by the system once the user has taken some of the resources delivered by the system. As we said, user profiles represent the user's information needs or interests, and a desirable property of user profiles is that they should be adaptable, since the user's needs can change continuously. Because of this, the system allows users to update their profiles to improve the filtering process. In our system this feedback process is carried out in the following steps:

The user accesses the system by entering his/her ID and password.

The user can perform the following operations:
– to edit his/her collaboration preferences,
– to edit his/her preferences about minimum and maximum amount,
– to edit his/her topics of interest:
  – to add new UNESCO codes with their importance degrees, i.e. linguistic 2-tuples $(b_i, \alpha_i)$ with $b_i \in S_1$ and $\alpha_i \in [-0.5, 0.5)$,
  – to delete an existing UNESCO code,
  – to modify the importance degree (2-tuple) assigned to an existing UNESCO code.

Example 3. Going back to Example 1, let us suppose the user ID wants to update his/her profile because he/she thinks he/she should also belong to the category 3309 – Food Technology.
In this case the user wants to add a new UNESCO code and assigns it an importance degree of High ($(b_3, 0)$, with $b_3 \in S_1$); this code is in position 118 of the UNESCO codes list and is therefore stored in position 118 of the vector. After this, the user ID has a new profile represented by a new vector with the following values:

$$VU_{ID}[y] = \begin{cases} (b_4, 0), & \text{if } y = 100 \\ (b_3, 0), & \text{if } y = 118 \\ (b_0, 0), & \text{otherwise.} \end{cases}$$

4. Experiment and evaluation

This section presents the evaluation of SIRE2IN, which has been implemented in the TTO of the University of Granada. The main focus in evaluating the system is to determine whether it fulfills the proposed objectives, that is, whether the recommended information is useful for the users. So far we have implemented a trial version, in which the system works with only a few researchers. In a later version we will include the possibility of free registration in the system for the whole research community and for companies.

To evaluate this first version of SIRE2IN we have designed experiments in which the proposed system is used to recommend the research resources that best satisfy the preferences of ten users who work in the Information and Communication Technologies (ICT) area. A complete evaluation would also have to include the collaboration recommendations, but in this initial version there are not yet many users, so we evaluate only the recommendations about research resources.

4.1. Evaluation metrics

For the evaluation of recommender systems, precision, recall and F1 are measures widely used to evaluate the quality of the recommendations (Cao & Li, 2007; Cleverdon et al., 1966; Sarwar, Karypis, & Konstan, 2000). To calculate these metrics we need a contingency table to categorize the items with respect to the information needs. Each item is classified both as relevant or irrelevant and as selected (recommended to the user) or not selected. The contingency table (Table 4.1) is created using these four categories.
Precision is defined as the ratio of the selected relevant items to the selected items, that is, it measures the probability that a selected item is relevant:

$$P = \frac{N_{rs}}{N_s}.$$

Recall is calculated as the ratio of the selected relevant items to the relevant items, that is, it represents the probability that a relevant item is selected:

$$R = \frac{N_{rs}}{N_r}.$$

F1 is a combination metric that gives equal weight to both precision and recall (Cao & Li, 2007; Sarwar et al., 2000):

$$F1 = \frac{2 \times R \times P}{R + P}.$$

4.2. Experiment result

The purpose of the experiments is to test the performance of the proposed recommender system, so we take into account the recommendations made about the research resources. We consider a data set with 25 research resources of different areas, collected by the TTO experts from different information sources about research resources. These resources are included in the system following the indications described above, and the system recommends them to the suitable users of the ICT area. The system considers that nine of the 25 resources are interesting for researchers of the ICT area. Therefore the system recommends nine resources to the users. In particular, 10 researchers use our experimental recommender system and evaluate the relevance of the recommended resources. The contingency table for each one is shown in Table 4.2. The corresponding precision, recall and F1 are shown in Table 4.3. The averages of the precision, recall and F1 metrics are 51.11%, 67.67% and 57.62%, respectively. Fig. 8 shows a graph with the precision, recall and F1 values for each user. These values reveal a good performance of the proposed system.

Table 4.1
Contingency table

|            | Selected | Not selected | Total |
|------------|----------|--------------|-------|
| Relevant   | Nrs      | Nrn          | Nr    |
| Irrelevant | Nis      | Nin          | Ni    |
| Total      | Ns       | Nn           | N     |
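The metrics above can be checked against the reported figures with a short Python sketch. The counts are those of Table 4.2 (selected relevant, non-selected relevant and selected irrelevant items per user); the function and variable names are illustrative choices made for this example, and the zero-denominator guards are an added assumption that the data never triggers.

```python
def precision_recall_f1(n_rs, n_rn, n_is):
    """Return (P, R, F1) from the contingency counts of one user."""
    n_s = n_rs + n_is  # selected items
    n_r = n_rs + n_rn  # relevant items
    p = n_rs / n_s if n_s else 0.0
    r = n_rs / n_r if n_r else 0.0
    f1 = 2 * r * p / (r + p) if (r + p) else 0.0
    return p, r, f1


# (Nrs, Nrn, Nis) per user, copied from Table 4.2.
table_4_2 = {
    "User1": (5, 3, 4), "User2": (6, 2, 3), "User3": (3, 2, 6),
    "User4": (4, 2, 5), "User5": (5, 3, 4), "User6": (6, 2, 3),
    "User7": (5, 2, 4), "User8": (3, 1, 6), "User9": (4, 3, 5),
    "User10": (5, 2, 4),
}

results = {user: precision_recall_f1(*counts) for user, counts in table_4_2.items()}

# Average each metric over the ten users, as percentages.
averages = [
    100 * sum(values[i] for values in results.values()) / len(results)
    for i in range(3)
]
print([round(a, 2) for a in averages])  # [51.11, 67.67, 57.62]
```

The per-user values agree with Table 4.3; for instance, User1 gives P = 5/9 ≈ 55.56%, R = 5/8 = 62.50% and F1 = 10/17 ≈ 58.82%.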
5. Concluding remarks

The exponential increase of Web sites and documents means that Internet users are often unable to find the information they seek in a simple and timely manner. Because of this, users need tools to help them cope with the large amount of information that is available on the Web and that they receive by email. In this paper, we have studied a particular case of information access and we have presented SIRE2IN, a recommender system using both information filtering tools and FLM. The proposed system is oriented to researchers of the University and environment companies and allows them to automatically obtain information about research resources that are interesting for them. In particular, it is a system based on both content-based filtering tools and the multi-granular FLM. The system filters the incoming information stream to spread the information to the suitable users and gives them recommendations about collaboration possibilities. The FLM has been applied in order to improve the experts-system interaction and the researchers-system interaction. Experimental results have shown the usefulness and effectiveness of our system.

Acknowledgement

This paper has been developed with the financing of SAINFOWEB Project (TIC00602), FUZZYLING Project (TIN2007-61079) and PETRI Project (PET2007-0460).

References

Arfi, B. (2005). Fuzzy decision making in politics: A linguistic fuzzy-set approach (LFSA). Political Analysis, 13(1), 23–56.
Basu, C., Hirsh, H., & Cohen, W. (1998). Recommendation as classification: Using social and content-based information in recommendation. In Proceedings of the fifteenth national conference on artificial intelligence (pp. 714–720).
Ben-Arieh, D., & Zhifeng, C. (2006). Linguistic labels aggregation and consensus measure for autocratic decision-making using group recommendations. IEEE Transactions on Systems, Man, and Cybernetics Part A – Systems and Humans, 36(3), 558–568.
Cao, Y., & Li, Y. (2007). An intelligent fuzzy-based recommendation system for consumer electronic products. Expert Systems with Applications, 33, 230–240.
Chang, S. L., Wang, R. C., & Wang, S. Y. (2007). Applying a direct multigranularity linguistic and strategy-oriented aggregation approach on the assessment of supply performance. European Journal of Operational Research, 177(2), 1013–1025.
Chen, Z., & Ben-Arieh, D. (2006). On the fusion of multi-granularity linguistic label sets in group decision making. Computers and Industrial Engineering, 51(3), 526–541.
Claypool, M., Gokhale, A., & Miranda, T. (1999). Combining content-based and collaborative filters in an online newspaper. In Proceedings of the ACM SIGIR workshop on recommender systems-implementation and evaluation.
Cleverdon, C. W., & Keen, E. M. (1966). Factors determining the performance of indexing systems. Test results. ASLIB Cranfield research project (Vol. 2). Bedford, England: Cranfield.
Cordón, O., Herrera, F., & Zwir, I. (2001). Linguistic modelling by hierarchical systems of linguistic rules. IEEE Transactions on Fuzzy Systems, 10(1), 2–20.
Good, N., Shafer, J. B., Konstan, J. A., Borchers, A., Sarwar, B. M., Herlocker, J. L., et al. (1999). Combining collaborative filtering with personal agents for better recommendations. In Proceedings of the sixteenth national conference on artificial intelligence (pp. 439–446).
Hanani, U., Shapira, B., & Shoval, P. (2001). Information filtering: Overview of issues, research and systems. User Modeling and User-Adapted Interaction, 11, 203–259.
Herrera, F., & Herrera-Viedma, E. (1997). Aggregation operators for linguistic weighted information. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems, 27, 646–656.
Herrera, F., Herrera-Viedma, E., & Martínez, L. (2008). A fuzzy linguistic methodology to deal with unbalanced linguistic term sets. IEEE Transactions on Fuzzy Systems, 16(2), 354–370.
Herrera, F., Herrera-Viedma, E., & Verdegay, J. L. (1996). A linguistic decision process in group decision making. Group Decision and Negotiation, 5, 165–176.
Herrera, F., Herrera-Viedma, E., & Verdegay, J. L. (1996). Direct approach processes in group decision making using linguistic OWA operators. Fuzzy Sets and Systems, 79, 175–190.
Herrera, F., Herrera-Viedma, E., & Verdegay, J. L. (1998). Choice processes for non-homogeneous group decision making in linguistic setting. Fuzzy Sets and Systems, 94, 287–308.
Herrera, F., & Martínez, L. (2000). A 2-tuple fuzzy linguistic representation model for computing with words. IEEE Transactions on Fuzzy Systems, 8(6), 746–752.
Herrera, F., & Martínez, L. (2001). A model based on linguistic 2-tuples for dealing with multigranularity hierarchical linguistic contexts in multiexpert decision-making. IEEE Transactions on Systems, Man and Cybernetics. Part B: Cybernetics, 31(2), 227–234.
Herrera-Viedma, E. (2001). Modeling the retrieval process of an information retrieval system using an ordinal fuzzy linguistic approach. Journal of the American Society for Information Science and Technology, 52(6), 460–475.
Herrera-Viedma, E. (2001). An information retrieval system with ordinal linguistic weighted queries based on two weighting elements. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9, 77–88.
Herrera-Viedma, E., Cordón, O., Luque, M., López, A. G., & Muñoz, A. M. (2003). A model of fuzzy linguistic IRS based on multi-granular linguistic information. International Journal of Approximate Reasoning, 34(3), 221–239.
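Table 4.2
Experimental contingency table

|     | User1 | User2 | User3 | User4 | User5 | User6 | User7 | User8 | User9 | User10 |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|--------|
| Nrs | 5     | 6     | 3     | 4     | 5     | 6     | 5     | 3     | 4     | 5      |
| Nrn | 3     | 2     | 2     | 2     | 3     | 2     | 2     | 1     | 3     | 2      |
| Nis | 4     | 3     | 6     | 5     | 4     | 3     | 4     | 6     | 5     | 4      |
| Nr  | 8     | 8     | 5     | 6     | 8     | 8     | 7     | 4     | 7     | 7      |
| Ns  | 9     | 9     | 9     | 9     | 9     | 9     | 9     | 9     | 9     | 9      |

Table 4.3
Detailed experiment result

|         | Precision (%) | Recall (%) | F1 (%) |
|---------|---------------|------------|--------|
| User1   | 55.56         | 62.50      | 58.82  |
| User2   | 66.67         | 75.00      | 70.59  |
| User3   | 33.33         | 60.00      | 42.86  |
| User4   | 44.44         | 66.67      | 53.33  |
| User5   | 55.56         | 62.50      | 58.82  |
| User6   | 66.67         | 75.00      | 70.59  |
| User7   | 55.56         | 71.43      | 62.50  |
| User8   | 33.33         | 75.00      | 46.15  |
| User9   | 44.44         | 57.14      | 50.00  |
| User10  | 55.56         | 71.43      | 62.50  |
| Average | 51.11         | 67.67      | 57.62  |

Fig. 8. Experiment result.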