NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 Semantic Search Itinerary recommender System LIVIU ADRIAN COTFAS ANDREEA DIOSTEANU STEFAN DANIEL DUMITRESCU ALEXANDRU SMEUREANU Abstract- In this paner we present a novel aproach based on users to schedule th:时 Natural Language Processing and hybrid multi-objective genetic it during the trip algorithms for developing mobile tourism itinerary recommender Furthermore, another important research trend in the systems. The proposed semantic matching technique allows users to find Points of Interest- POls that match highly specific preferences integration of semantic technologies in many fields as they Moreover, it can also be used to further filter results from traditional enhance the process of knowledge retrieval recommender techniques, such as collaborative filtering, and only structuring. Paper [l] presents for example a Fuzzy-Neural requires a minimal initial input from the user to generate relevant Network (FNN) model to construct Q&a knowledge base recommendations. A hybrid multi-objective genetic algorithm has automatically. In this paper we use an approach based on been developed in order to allow the tourists to easily choose document indexing and semantic search in order to allow users between several Pareto optimal itineraries time. Furthermore, the proposed system is easy to use, thus it can be puted in near real- to easily find interesting tourist attractions ated that our solution is both complex and at the same time user Recommender systems also represent a trend of great interest for the current research. They consist of software tools and techniques providing suggestions for items to be of use to Keywords- semantic similarity, free text document indexing, a user[2]. In order to determine the most suitable suggestions, multi-objective genetic algorithm, itinerary recommender system most implementations rely on collaborative filtering and content-based methods. However, these solutions individually INTRODUCTION fail to provide good solutions in many situations. In order to Nowadays,designing flexible, efficient and user-friendly overcome these failures, 3] proposes an alternative method to mobile tour guide applications is of a great interest both content-based filtering based on neural networks. The article from a commercial and research point of view. Such systems presents a framework used for designing recommender are useful for tourists visiting a location in a limited period of systems for combining neural networks and collaborative time. Without any support from a system, a user manually filtering. Recommender Systems also have building an itinerary, should spend a lot of time prior to the applications in a wide range of domains from education and trip, searching for information regarding the tourist attractions, training to economics [4], or tourism as it is the case of the reading facts about them and designing possible itineraries by application presented in this paper taking into account factors such as opening hours, visiting One of the first attempts of applying recommender systems durations, distances and available means of public transport. in tourism is represented by the Cyberguide [5] project, in which the authors develop a mobile application targeted to model the tour guide activity. The paper also acknowledges ire the resuit of the importance of context, but limited only to location novation triangle" This project is co funded by European Social Fund awareness. The authors present the overall architecture and the through The Sectorial Operational Program for Human Resources development procedures for multiple different hand-held Development 2007-2013, coordinated by The Bucharest Academy of platforms. Moreover, the general research issues that have the education research and innovation triangle, DOC-ECr"). Parts of this emerged in context-aware applications development in a article are the result of posdru. financing contract POSDRU/6/1. 5/S/19 mobile environment are discusse L A. Cotfas is with the Economic Informatics Department, Bucharest cademy of Economic Studies, Bucharest, Romania(phone: 40-772-402-305 ides poi recommendations based on the user's profile and context. The A. Diosteanu is with the Economic Informatics Department, authors also present an evaluation of the visitor's experience with the system. A drawback of the approach is represented by S D. Dumitrescu is with the Computers and Information the amount of information required from the user in order to Department, Politehnica University of Bucharest, Bucharest, Romania (e- customize the tour. Guide offers users the possibility of ail: dumitrescu. stefan gmail. com). creating a tour by selecting objectives, presenting their the economic Academy of Economic Studies domains of interest and preferences. The restrictions tha alexandru. smeureanu a gmail. com). system takes into account refer to the schedule of objectives
Abstract— In this paper we present a novel approach based on Natural Language Processing and hybrid multi-objective genetic algorithms for developing mobile tourism itinerary recommender systems. The proposed semantic matching technique allows users to find Points of Interest – POIs that match highly specific preferences. Moreover, it can also be used to further filter results from traditional recommender techniques, such as collaborative filtering, and only requires a minimal initial input from the user to generate relevant recommendations. A hybrid multi-objective genetic algorithm has been developed in order to allow the tourists to easily choose between several Pareto optimal itineraries computed in near realtime. Furthermore, the proposed system is easy to use, thus it can be stated that our solution is both complex and at the same time useroriented. Keywords— semantic similarity, free text document indexing, multi-objective genetic algorithm, itinerary recommender system. INTRODUCTION owadays, designing flexible, efficient and user-friendly mobile tour guide applications is of a great interest both from a commercial and research point of view. Such systems are useful for tourists visiting a location in a limited period of time. Without any support from a system, a user manually building an itinerary, should spend a lot of time prior to the trip, searching for information regarding the tourist attractions, reading facts about them and designing possible itineraries by taking into account factors such as opening hours, visiting durations, distances and available means of public transport. Manuscript received April 12, 2011. Parts of this article are the result of the project „Doctoral Program and PhD Students in the education research and innovation triangle”. This project is co funded by European Social Fund through The Sectorial Operational Program for Human Resources Development 2007-2013, coordinated by The Bucharest Academy of Economic Studies (Project no. 7832, “Doctoral Program and PhD Students in the education research and innovation triangle, DOC-ECI”). Parts of this article are the result of POSDRU, financing contract POSDRU/6/1.5/S/19. L.A. Cotfas is with the Economic Informatics Department, Bucharest Academy of Economic Studies, Bucharest, Romania (phone: 40-772-402-305 e-mail: liviu.cotfas@gmail.com). A. Dioșteanu is with the Economic Informatics Department, Bucharest Academy of Economic Studies, Bucharest, Romania (e-mail: andreea.diosteanu@gmail.com). S.D. Dumitrescu is with the Computers and Information Technology Department, Politehnica University of Bucharest, Bucharest, Romania (email: dumitrescu.stefan@gmail.com). A. Smeureanu is with the Economic Informatics Department, Bucharest Academy of Economic Studies, Bucharest, Romania (e-mail: alexandru.smeureanu@gmail.com). Conversely, itinerary recommender systems not only assist users to schedule the initial itinerary, but can also easily revise it during the trip. Furthermore, another important research trend in the integration of semantic technologies in many fields as they enhance the process of knowledge retrieval, processing and structuring. Paper [1] presents for example a Fuzzy-Neural Network (FNN) model to construct Q&A knowledge base automatically. In this paper we use an approach based on document indexing and semantic search in order to allow users to easily find interesting tourist attractions. Recommender systems also represent a trend of great interest for the current research. They consist of software tools and techniques providing suggestions for items to be of use to a user [2]. In order to determine the most suitable suggestions, most implementations rely on collaborative filtering and content-based methods. However, these solutions individually fail to provide good solutions in many situations. In order to overcome these failures, [3] proposes an alternative method to content-based filtering based on neural networks. The article presents a framework used for designing recommender systems for combining neural networks and collaborative filtering. Recommender Systems also have a lot of applications in a wide range of domains from education and training to economics [4], or tourism as it is the case of the application presented in this paper. One of the first attempts of applying recommender systems in tourism is represented by the Cyberguide [5] project, in which the authors develop a mobile application targeted to model the tour guide activity. The paper also acknowledges the importance of context, but limited only to location awareness. The authors present the overall architecture and the development procedures for multiple different hand‐held platforms. Moreover, the general research issues that have emerged in context‐aware applications development in a mobile environment are discussed. The Guide system presented in [6] provides POI recommendations based on the user’s profile and context. The authors also present an evaluation of the visitor’s experience with the system. A drawback of the approach is represented by the amount of information required from the user in order to customize the tour. Guide offers users the possibility of creating a tour by selecting objectives, presenting their domains of interest and preferences. The restrictions that the system takes into account refer to the schedule of objectives, Semantic Search Itinerary Recommender System LIVIU ADRIAN COTFAS, ANDREEA DIOSTEANU, STEFAN DANIEL DUMITRESCU, ALEXANDRU SMEUREANU N INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 370
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 ntervals. The to be change 1.r dynamically, based on the actual time the tourist already spent attractio at some attractions or based on the weather conditions For example, if the weather is inappropriate for visiting open-air points of interest, than the system automatically suggests indoors activities). The user profile is also continuously 2. Build the itinerary updated, using information from previous trips. [7 presents an artificial intelligence and metaheuristic approach for dealing with the tourist planning issue on mobile devices. The solution presented offers a decision support and a modeling mechanism 3. Guidance during the tour starting from the orienteering problem. The problem involves Fig. I The main functionalities of itinerary recommender systems set of candidate objectives for which a score is being associated. The problem is thus reduced to maximizing the Creating the list of POl: In the simpler approaches like total score of the visited places, while complying with the the one presented in P-Tour [101, the user is requested to constraints related to the total amount of time or the total select the desired PoI from a list without receiving any distance. The score of a point of Interest - Poi is defined as recommendations from the system. On the opposite side, the interest of a tourist in that place. Scores are calculated other systems [11 opt for a completely automatic Ising the vector space model. The trip planning problem is approach in which the user has no control over the solved using a guided local search metaheuristic. In order to selected Points of Interest -POl. mTripAssistent compare the performance of this approach with an algorithm combines the manual and automatic approaches for that appeared in the literature, both are applied to a real data building the list of Pol, by both allowing the user to set from the city of Ghent manually select locations and using an algorithm to Our approach is also based on semantic search as [7] choose the best locations for the remaining time uses neuronal network based document indexing mechanisms. Building the itinerary: Building tourist itineraries is an Furthermore, we present the user a list of possible points of extension of the Traveling Salesmen Problem which is interest as [6], but we dynamically create the tour and we known to be already NP-hard. Existing approaches either don't require data introduction effort from the user. Moreover only try to maximize the sum of scores associated to the our approach also generates public transport routes to link visited POls [Il], or if multiple criteria are taken into objectives and also offers information regarding the pollution onsideration, they are combined using a simple level from the areas in which pollution sensors are available weighted sum. [4] presents an approach based on using The trip planner presented in this paper is tested in Bucharest weighted sum that takes into consideration the preference expressed by the user and the distance between the This paper is organized as follows: in the second section we Points of Interest. mTripAssistent uses a novel multi make a short overall presentation of the main characteristics of objective algorithm that allows finding several Pareto- the proposed solution and we highlight the general features of optimal solutions in near real-time he recommender systems, in the third part we present the Guidance during the tour: As it was shown in the user semantic technologies and algorithm applied for semantic studies performed by [ 12], tourists typically modify the search and document indexing by using key words, in the list of Pol during the itinerary. Therefore mTripAssistent fourth section we formalize the model used for generating allows the user to easily change the itinerary at any itineraries, while the fifth section enlarges upon the overall moment during the tour. distributed architecture of our web service based platform, and As it can be seen from the features presented above the in the sixth section we present the entire itinerary generation novelty of our approach resides in the fact that it requires flow and the scenario, in addition we also introduce the mobile less input and effort from the user, is flexible and dynamic itinerary recommender application. The last section will by applying the multi-objective algorithm a wider set of conclude the article and present the future research guidelines. constraints and rules can be applied. Furthermore the proposed solution also recommends public transport ITINERARY RECOMMENDER SYSTEMS routes In order to offer users the possibility of flexibly selecting the itinerary based on their preferences, we propose an DOCUMENT INDEXING FOR EXTRACTING CANDIDATE POIS algorithm for indexing documents and a flexible semantic In order to determine the possible candidate points of search mechanism interest, we developed a semantic search algorithm that has Several reviews of existing itinerary recommender solutions two main steps: the semantic classification of the description can be found in [8] and [9]. In most approaches, the process of files of POi and also a document indexing step. Afterwards we generating itineraries and guiding users can be divided in three perform the semantic matching between the indexed words distinct steps, which also represent the main functionalities of and the search phrases introduced by the user. An approach for itinerary recommender systems(Fig 1.) key words extraction based on WordNet is also presented in
the best visiting intervals. The tour can be changed dynamically, based on the actual time the tourist already spent at some attractions or based on the weather conditions. For example, if the weather is inappropriate for visiting open-air points of interest, than the system automatically suggests indoors activities). The user profile is also continuously updated, using information from previous trips. [7] presents an artificial intelligence and metaheuristic approach for dealing with the tourist planning issue on mobile devices. The solution presented offers a decision support and a modeling mechanism starting from the orienteering problem. The problem involves a set of candidate objectives for which a score is being associated. The problem is thus reduced to maximizing the total score of the visited places, while complying with the constraints related to the total amount of time or the total distance. The score of a Point of Interest - POI is defined as the interest of a tourist in that place. Scores are calculated using the vector space model. The trip planning problem is solved using a guided local search metaheuristic. In order to compare the performance of this approach with an algorithm that appeared in the literature, both are applied to a real data set from the city of Ghent. Our approach is also based on semantic search as [7] but uses neuronal network based document indexing mechanisms. Furthermore, we present the user a list of possible points of interest as [6], but we dynamically create the tour and we don’t require data introduction effort from the user. Moreover, our approach also generates public transport routes to link objectives and also offers information regarding the pollution level from the areas in which pollution sensors are available. The trip planner presented in this paper is tested in Bucharest city. This paper is organized as follows: in the second section we make a short overall presentation of the main characteristics of the proposed solution and we highlight the general features of the recommender systems, in the third part we present the semantic technologies and algorithm applied for semantic search and document indexing by using keywords, in the fourth section we formalize the model used for generating itineraries, while the fifth section enlarges upon the overall distributed architecture of our web service based platform, and in the sixth section we present the entire itinerary generation flow and the scenario, in addition we also introduce the mobile itinerary recommender application. The last section will conclude the article and present the future research guidelines. ITINERARY RECOMMENDER SYSTEMS In order to offer users the possibility of flexibly selecting the itinerary based on their preferences, we propose an algorithm for indexing documents and a flexible semantic search mechanism. Several reviews of existing itinerary recommender solutions can be found in [8] and [9]. In most approaches, the process of generating itineraries and guiding users can be divided in three distinct steps, which also represent the main functionalities of itinerary recommender systems (Fig 1.): Fig. 1 The main functionalities of itinerary recommender systems Creating the list of POI: In the simpler approaches like the one presented in P-Tour [10], the user is requested to select the desired POI from a list without receiving any recommendations from the system. On the opposite side, other systems [11] opt for a completely automatic approach in which the user has no control over the selected Points of Interest – POI. mTripAssistent combines the manual and automatic approaches for building the list of POI, by both allowing the user to manually select locations and using an algorithm to choose the best locations for the remaining time. Building the itinerary: Building tourist itineraries is an extension of the Traveling Salesmen Problem which is known to be already NP-hard. Existing approaches either only try to maximize the sum of scores associated to the visited POIs [11], or if multiple criteria are taken into consideration, they are combined using a simple weighted sum. [4] presents an approach based on using weighted sum that takes into consideration the preference expressed by the user and the distance between the Points of Interest. mTripAssistent uses a novel multiobjective algorithm that allows finding several Paretooptimal solutions in near real-time. Guidance during the tour: As it was shown in the user studies performed by [12], tourists typically modify the list of POI during the itinerary. Therefore mTripAssistent allows the user to easily change the itinerary at any moment during the tour. As it can be seen from the features presented above the novelty of our approach resides in the fact that it requires less input and effort from the user, is flexible and dynamicby applying the multi-objective algorithm a wider set of constraints and rules can be applied. Furthermore the proposed solution also recommends public transport routes. DOCUMENT INDEXING FOR EXTRACTING CANDIDATE POIS In order to determine the possible candidate points of interest, we developed a semantic search algorithm that has two main steps: the semantic classification of the description files of POI and also a document indexing step. Afterwards we perform the semantic matching between the indexed words and the search phrases introduced by the user. An approach for key words extraction based on WordNet is also presented in INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 371
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 [13]. The article uses key sentences for automatically The artificial network model can be seen in Fig. 2 summarizing by applying statistical methods. Our In order to train the neuronal network we used the back paper uses a neuronal network approach for identifying propagation algorithm implemented in Weka[20] on a set of manually classified instances. For the training set we have Document indexing is a useful in a lot of Natural Language chosen documents from different sources, with a focus on ocessing application such as text mining, knowledge newspaper articles. We created a java application that retrieval, keywords extraction, etc [14]. Indexing defines the automatically extracts words in the document by following the process of converting a document into a list of words included steps presented above and computes the values for them. We in it. The result of indexing is represented by a list of words then manually classify the extracted potential keywords into called index language [151 two gro eywords and non-keywords. We use the The selection process of document indexing has the resulting set of classified instances to train the ANN following steps Step 1: Eliminate stop words. Stop words are words which have grammatical functionality, but are not related to the ontent of the document. Some possible stop words are conjunction, prepositions, auxiliary verbs, articles, pronouns Step 2: For the remaining of the document words in the list perform word stemming. This means that we extract the root of the words. In order to achieve this we use porter stemmer for English language &.Step 3: This step consists in performing a neuronal network ased filtering so that to extract only those words that are relevant to the context The input nodes for the neural network are represented by the following features that can be grouped into statistic features(TF, IDF, ITF)[15], position features (T, Fpos)[16] grammatical syntax(subject, object, predicate(SPO)) term frequency (TF)- the frequency of each term in the Fig 2 Artificial neuronal network for extracting relevant words document inverted document frequency(DF)the number of We tested the artificial neuronal network on a set of 50 documents that contain the word from the initial English documents. From the set of 3850 identified possible document sample keywords, only 10% were selected in the manual classification IDF=-Log2 Tol where Tot-total occurrence [17] ohase. In order to choose the best configuration for the aNN several choices for the number of neurons in the hidden layer inverted total frequency(ITF)-total frequency of the word have been tested in the initial document sample Below, we present the comparison results for 3 ANN T-Boolean value indicating whether the word appears in configurations the title or not (no attributes+no classes) with the result pos=f where i-the number of the sentence in which the word appears for the first time Confusion Matrix -(43 342)a 293436b SPO-Boolean value indicating whether the word is a where ibject, predicate or an object in the sentences from the first column(a-is keyword first paragraph For determining the grammatical category second(b)-non-keyword of a word we used Gate and we integrated the Stanford Correctly Classified Instance 3479 90.3636% Therefore, the neuronal network will have six input nodes Other characteristics of the ANn are: one hidden layer Incorrectly Classified Instances.6364% usually it is not indicated to use more than one hidden layer he number of hidden nodes is given by the formula Nn= no attributes no classes, with the results (no attributes + no classes) Confusion Matrix =(44 341 a 453420丿b no. attributes-represents the number of features (in our case noclasses -represents the number of classes in which words Correctly Classified Instance 346489.974% can be classified (in our case 2: keyword and non-keyword)
[13]. The article uses key sentences for automatically summarizing documents by applying statistical methods. Our paper uses a neuronal network approach for identifying keywords. Document indexing is a useful in a lot of Natural Language Processing application such as text mining, knowledge retrieval, keywords extraction, etc [14]. Indexing defines the process of converting a document into a list of words included in it. The result of indexing is represented by a list of words called index language [15]. The selection process of document indexing has the following steps: Step 1: Eliminate stop words. Stop words are words which have grammatical functionality, but are not related to the content of the document. Some possible stop words are: conjunction, prepositions, auxiliary verbs, articles, pronouns, etc. Step 2: For the remaining of the document words in the list perform word stemming. This means that we extract the root of the words. In order to achieve this, we use Porter stemmer for English language. Step 3: This step consists in performing a neuronal network based filtering so that to extract only those words that are relevant to the context. The input nodes for the neural network are represented by the following features that can be grouped into statistical features (TF, IDF, ITF) [15], position features (T, Fpos) [16], grammatical syntax (subject, object, predicate (SPO)): term frequency(TF)- the frequency of each term in the document; inverted document frequency(IDF)-the number of documents that contain the word from the initial document sample; IDF= , where - total occurrence [17] inverted total frequency(ITF)-total frequency of the word in the initial document sample; T-Boolean value indicating whether the word appears in the title or not; Fpos= √ where i-the number of the sentence in which the word appears for the first time SPO-Boolean value indicating whether the word is a subject, predicate or an object in the sentences from the first paragraph. For determining the grammatical category of a word we used Gate and we integrated the Stanford Parser [18][19]. Therefore, the neuronal network will have six input nodes. Other characteristics of the ANN are: one hidden layer, usually it is not indicated to use more than one hidden layer; the number of hidden nodes is given by the formula: where: - represents the number of features (in our case equals six) - represents the number of classes in which words can be classified (in our case 2: keyword and non-keyword). The artificial network model can be seen in Fig. 2. In order to train the neuronal network we used the back propagation algorithm implemented in Weka [20] on a set of manually classified instances. For the training set we have chosen documents from different sources, with a focus on newspaper articles. We created a java application that automatically extracts words in the document by following the steps presented above and computes the values for them. We then manually classify the extracted potential keywords into two groups: keywords and non-keywords. We use the resulting set of classified instances to train the ANN. Fig. 2 Artificial neuronal network for extracting relevant words We tested the artificial neuronal network on a set of 50 English documents. From the set of 3850 identified possible keywords, only 10% were selected in the manual classification phase. In order to choose the best configuration for the ANN, several choices for the number of neurons in the hidden layer have been tested. Below, we present the comparison results for 3 ANN configurations: - with the results: Confusion Matrix =( ) where: first column(a) - is keyword second(b) - non-keyword Correctly Classified Instance 3479 90.3636 % Incorrectly Classified Instances 371 9.6364 % - , with the results: Confusion Matrix =( ) Correctly Classified Instance 3464 89.974% INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 372
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 incorrectly Classified Instances 38610.026% where SimLin(X1, X2)-the level of similarity between concepts 1 =10. with the results and 2 computed by Lin formula Confusion MatrIx< /45 340 S(X1, X2)the degree of shared information for X, and X2 493416 Correctly Classified Instances 346189.8961% IC(X)-information content (IC)of X concept Incorrectly Classified Instances.1039% The semantic matching evaluation will performed gainst the user key search phrases and the key words extracted We also compared the results with the Naive Bayesian from the do For instance. if a tourist introduces"modern art' as search Confusion matrix phrase, and in a document about a military museum it is 3437 specified that there is a painting created by a modern art ainter"Paul Gauguin", then the POI will be suggested to the Correctly Classified Instances 3472 90. 1818% Incorrectly Classified Instances37898182% MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM FOR GENERATING ITINERARIES We conclude that the ANN configuration initially presented Besides the semantic search approach described in the ives us better results previous section, users can also choose POl from a suggestion First in order to understand the sense of the words as it is importance, by changing the order g the selected POIs are After the document indexing phase is completed, we can list built using collaborative filtering. All displayed in a common list, allowin change their used in the documents we perform word disambiguation. For We frame the problem of building tourist itineraries as an this, we query an external corpus, YAGO [21], based on extension of the classic Orienteering Problem -OP, also wikipedia and WordNet that contains both word definitions known as the Time Dependent Orienteering Problem with and the semantic relations with other words in the same Time Windows-TDOPTW. As the Orienteering Problem is synonymic class. For word sense disambiguation we identify known to be NP-hard, finding an exact solution for the all possible senses of the words that were previously POS- TDOPTW in near real-time is considered to require a high Parts of Speech tagged. For each meaning of the word we computational power[22]. The proposed system uses a hybrid identify all possible solutions multi-objective genetic algorithm in order to generate Its own definition that includes example texts that itineraries in a short amount of time. Genetic algorithms are a WordNet or other corpus provides to the glosses particular class of evolutionary algorithms that use techniques The dictionary definition of the synonymic groups that are inspired by evolutionary biology such as inheritance connected to it through the "has a'"relations. If there is mutation, selection, and crossover. In order to decrease the more than one relation for a word sense, then the glosses number of required generations, we use the following for each relation are concatenated into a single gloss heuristic: each time we add a new POl to an itinerary we choose the one with the highest ratio between the associated The dictionary definition of the synonymic group that are score, t and the time required to visit the POl connected to it through the"is a"relations The dictionary definition of the synonymic group that are n Is-the start time connected to it through the"part of relations T-the finish time After these steps are taken, the semantic similarity of the synonymic group the words belong to is measured. If a word LF-the finish location has more than one sense, it will appear in multiple synonymic POlselected the set of manually selected tourist attractions that will be included in all groups at various locations in the classification. WordNet defines relations between synonymic groups and relations between word senses. To measure the semantic similarity n POlFavorites -the set of attractions that were marked as favorites by the user between two synonymic groups, the most appropriate relations POl -the global set of tourist attractions that are to be used are the semantic hierarchic ones. Several NSoL-the number of requested solutions types of semantic similarity formulas exist: Conrath, Lin, The output of the algorithm consists in a number of Resnik. In our approach, Lin formula was chosen as it fully Nsol s N:ol Pareto-optimal itineraries including both th takes into account the information content tourist attractions and the traveling directions between them 2*S(X1,X2) The considered optimization criteria are (C(X1)+lC(X2)) Cl-represents the useful visiting time calculated as shown in(2)
Incorrectly Classified Instances 386 10.026% - , with the results: Confusion Matrix =( ) Correctly Classified Instances 3461 89.8961 % Incorrectly Classified Instances 389 10.1039 % - We also compared the results with the Naive Bayesian Classifier: Confusion Matrix =( ) Correctly Classified Instances 3472 90.1818 % Incorrectly Classified Instances 378 9.8182 % We conclude that the ANN configuration initially presented gives us better results. After the document indexing phase is completed, we can perform the semantic matching. First in order to understand the sense of the words as it is used in the documents we perform word disambiguation. For this, we query an external corpus, YAGO [21], based on Wikipedia and WordNet that contains both word definitions and the semantic relations with other words in the same synonymic class. For word sense disambiguation we identify all possible senses of the words that were previously POS – Parts of Speech tagged. For each meaning of the word we identify all possible solutions: Its own definition that includes example texts that WordNet or other corpus provides to the glosses. The dictionary definition of the synonymic groups that are connected to it through the “has a” relations. If there is more than one relation for a word sense, then the glosses for each relation are concatenated into a single gloss string. The dictionary definition of the synonymic group that are connected to it through the “is a” relations. The dictionary definition of the synonymic group that are connected to it through the “part of” relations. After these steps are taken, the semantic similarity of the synonymic group the words belong to is measured. If a word has more than one sense, it will appear in multiple synonymic groups at various locations in the classification. WordNet defines relations between synonymic groups and relations between word senses. To measure the semantic similarity between two synonymic groups, the most appropriate relations that are to be used are the semantic hierarchic ones. Several types of semantic similarity formulas exist: Conrath, Lin, Resnik. In our approach, Lin formula was chosen as it fully takes into account the information content: (1) where: the level of similarity between concepts 1 and 2 computed by Lin formula - the degree of shared information for and concepts X1, X2 - concepts 1 and 2 IC(X) - information content (IC) of X concept. The semantic matching evaluation will be performed against the user key search phrases and the keywords extracted from the documents. For instance, if a tourist introduces “modern art” as search phrase, and in a document about a military museum it is specified that there is a painting created by a modern art painter “Paul Gauguin”, then the POI will be suggested to the user. MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM FOR GENERATING ITINERARIES Besides the semantic search approach described in the previous section, users can also choose POI from a suggestion list built using collaborative filtering. All selected POIs are displayed in a common list, allowing the user to change their importance, by changing the order. We frame the problem of building tourist itineraries as an extension of the classic Orienteering Problem - OP, also known as the Time Dependent Orienteering Problem with Time Windows - TDOPTW. As the Orienteering Problem is known to be NP-hard, finding an exact solution for the TDOPTW in near real-time is considered to require a high computational power [22]. The proposed system uses a hybrid multi-objective genetic algorithm in order to generate itineraries in a short amount of time. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover. In order to decrease the number of required generations, we use the following heuristic: each time we add a new POI to an itinerary we choose the one with the highest ratio between the associated score, and the time required to visit the POI. The itinerary building algorithm has as inputs: - the start time; - the finish time; - the start location; - the finish location; - the set of manually selected tourist attractions that will be included in all generated itineraries; - the set of attractions that were marked as favorites by the user; - the global set of tourist attractions; - the number of requested solutions. The output of the algorithm consists in a number of Pareto-optimal itineraries including both the tourist attractions and the traveling directions between them. The considered optimization criteria are: C1 - represents the useful visiting time calculated as shown in (2); INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 373
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 C2- represents the average score of the visited valid. First, the user selected tourist attractions, POlselected are cations calculated as shown in(3) dded to the new itinerary in a random manner C3-represents the diversity of the itinerary calculated as shown in(4) S1: Initialization Other criteria can also easily be taken into consideration The criteria C3 can also take into consideration the structure f the itinerary 52: Determine pareto fronts Visitt S3: Selection MR=②S)/n Generate the new population S4: Mutation Crossover DV=nj The start and finish locations, Ls and LF, represent the GPS Best itineraries coordinates associated to the locations selected by the user either from the map or from the list of tourist attractions Fig. 3 Itinerary building algorithm Each Point of Interest i= 1.NpoI is characterized by Oi-opening time, Afterwards, the remaining time is completed with tourist tractions selected from the set of favorites, POlFavorites and Si-associated score, from the normal set, POl, A tourist attraction i can only be Feel-entrance fee inserted if the arrival time, Ai, satisfies the conditions in(6) VisitDi -medium visit duration and(7) Loci-GPS coordinates For each user u and tourist attraction i, visitDi represents Ai Np/2 we use a Po,with a number of Np itineraries called chromosomes. The clustering approach to select Np/2 solutions as different population of non-dominated solutions is also initialized Eo=0. In order to limit the number of required algorithm Mutation: Adds random variation to the evolving iterations,all itineraries in the initial population are generated population. Two links are randomly chosen from the itinerary
C2 - represents the average score of the visited locations calculated as shown in (3); C3 - represents the diversity of the itinerary calculated as shown in (4). Other criteria can also easily be taken into consideration. The criteria C3 can also take into consideration the structure of the itinerary. ∑ (2) ∑ (3) (4) The start and finish locations, and , represent the GPS coordinates associated to the locations selected by the user either from the map or from the list of tourist attractions. Each Point of Interest is characterized by: - opening time; - closing time; - associated score; - entrance fee; - medium visit duration; - GPS coordinates. For each user and tourist attraction , represents the real visiting duration and the predicted one. For a tourist attraction that has not yet been visited, the estimated visiting duration is computed as shown in (5), where represents the number of tourist attractions in the category to which the tourist attraction belongs that have already been visited by the user √∏ (5) The algorithm is run until a maximum number of generations, is reached or until the best found solution doesn’t change for consecutive generations. Depending on the purpose can be adjusted either for accuracy or speed. In order to increase the performance with compute in advance the minimum - and maximum time - required to travel between locations i and j. The values can be calculated for different hours or week days. We have chosen an Elitist approach [13] in which we use two populations, the normal one and the population of nondominated solutions in generation , . Initialization: For , we create the initial population , with a number of itineraries called chromosomes. The population of non-dominated solutions is also initialized . In order to limit the number of required algorithm iterations, all itineraries in the initial population are generated valid. First, the user selected tourist attractions, are added to the new itinerary in a random manner. S1: Initialization S2: Determine Pareto fronts S3: Selection S4: Mutation S5: Crossover Loop condition Best itineraries Generate the new population Fig. 3 Itinerary building algorithm Afterwards, the remaining time is completed with tourist attractions selected from the set of favorites, and from the normal set, . A tourist attraction can only be inserted if the arrival time, , satisfies the conditions in (6) and (7). (6) (7) The time needed to visit a tourist attraction can be computed as shown in (8) by summing the travel time from the previous location, the waiting time and the visit duration. (8) The waiting duration, can be computed as shown in (9), where represents the arrival time. (9) Evaluation: All itineraries are evaluated based on the three selected criteria: C1, C2, C3. In order to determine the Pareto-optimal fronts we use an approach similar to the one in the NSGA-II algorithm [23]. Selection: As we have chosen an elitist approach in which we copy all non-dominated solutions from in . We remove all other solutions from and we automatically include all solutions from in . Using mutation and crossover we generate | |solutions that are added to in order to have | | . If | | we use a clustering approach to select solutions as different as possible. Mutation: Adds random variation to the evolving population. Two links are randomly chosen from the itinerary INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 374
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 and all the locations between them that do not belong to the The Business Layer includes the main functionalities of the manually selected set of tourist attractions, POlselected, are system, implemented as semantically annotated web services removed. The itinerary is then completed by randomly [24]. The itinerary web service implements the multi-objective inserting new POls genetic algorithm described in the previous section, the Crossover: Combines the features of two parent recommender web service implements a context-aware chromosomes to form new children by swapping collaborative filtering algorithm and the route finding web corresponding itinerary segments of the parents. At first, one service implements the public transport route finding link offset is selected randomly. The resulting segm re algorithm presented by the authors in [251 swapped and a repair step is performed in order to preserve the The Persistence Layer stores information regarding the manually selected tourist attractions from POlselected, in both available tourist attractions, the public transport network, as itineraries. If necessary the POIs with the lowest Si are well as user information. The POl Data database stores for all removed in order to keep the itineraries valid the attractions: the name, a short and a long description The walking distances can be determined using either photos, the location and the opening hours. The User Data Dijkstra or A* for large graphs. The functionality is also database stores the user information as well as the tourist provided by several libraries such as pg Routing and APl's like attractions ratings used by the recommender algorithm. A Google Maps Directions API separate database is used to store the data required by the public transport route finding algorithm SYSTEM ARCHITECTURI Using External Data Providers, such as external web The proposed solution presented in Fig 4 relies on a multi- services, the system collects weather forecasts, retrieves the tier paradigm for both the server and the client changes in the list of available tourist attractions and updates implementations. Even though a client only architecture would the data for the public transport network. ffer several benefits, such as the possibility to work entirely An Intelligent Agents Module has the purpose of offline, it is not a feasible option for our approach due to the monitoring the users'behavior as well as of notifying them of big amounts of data used both by the semantic search and by any context changes, like weather changes or changes in the the collaborative filtering algorithms. Therefore, we have opening hours of the attractions included in the itinerary chosen a mixed approach in which the resource intensive computations are performed on the server, while the simple MOBILE USER INTERFACE ones are performed directly on the client. Client The reference client implementation for mobile devices implementations rely on a local data persistence management shown in Fig. 5 relies on the latest standard web technologies solution in order to avoid unnecessary server requests. The such as WebsQL and WebStorage to offer portability across Mobile Devices(FeaturePhones, SmartPhones, Tablets) Desktop Comput Devices that support HTML Windows phone了 older/simpler Devices Desktop Applications UL:HTMS,ISCSS3 plementation UI:HIM5JSCSS3 Ul: OS Specific ()1(hm)(数)()1(Pme Aocepts Http Requests Web Servce Calls and Socket Cornectons ntelligent Agents Module External Data Pr Business Layer Itinerary Planning Web Service) Recommender Web Service(Route FindingWeb Service leather Wet Persistence Layer POl Data User Data Public Transport Data Public Transport Web Service Fig. 4 Distributed architect server code is written in Java and uses Apache Tomcat as a different platforms as well as a rich user experience. w3C web application server GeoLocation api was used to determine and monitor the The Interface Layer has the role of facilitating the position of the user communication with the client implementations using Http The application can either be used directly from the requests, Web Service Calls and Socket Connections. Thanks browser, or can be installed using a thin native wrapper that to the multiple communication methods supported by this provides the required translation from JavaScript method calls layer, client side applications can be implemented using a to native method calls. The application works on all devices wide array of technologie that comply with the hTML5 standard specifications
and all the locations between them that do not belong to the manually selected set of tourist attractions, , are removed. The itinerary is then completed by randomly inserting new POIs. Crossover: Combines the features of two parent chromosomes to form new children by swapping corresponding itinerary segments of the parents. At first, one link offset is selected randomly. The resulting segments are swapped and a repair step is performed in order to preserve the manually selected tourist attractions from , in both itineraries. If necessary the POIs with the lowest are removed in order to keep the itineraries valid. The walking distances can be determined using either Dijkstra or A* for large graphs. The functionality is also provided by several libraries such as pgRouting and API’s like Google Maps Directions API. SYSTEM ARCHITECTURE The proposed solution presented in Fig. 4 relies on a multitier paradigm for both the server and the client implementations. Even though a client only architecture would offer several benefits, such as the possibility to work entirely offline, it is not a feasible option for our approach due to the big amounts of data used both by the semantic search and by the collaborative filtering algorithms. Therefore, we have chosen a mixed approach in which the resource intensive computations are performed on the server, while the simple ones are performed directly on the client. Client implementations rely on a local data persistence management solution in order to avoid unnecessary server requests. The server code is written in Java and uses Apache Tomcat as a web application server. The Interface Layer has the role of facilitating the communication with the client implementations using HTTP requests, Web Service Calls and Socket Connections. Thanks to the multiple communication methods supported by this layer, client side applications can be implemented using a wide array of technologies. The Business Layer includes the main functionalities of the system, implemented as semantically annotated web services [24]. The itinerary web service implements the multi-objective genetic algorithm described in the previous section, the recommender web service implements a context-aware collaborative filtering algorithm and the route finding web service implements the public transport route finding algorithm presented by the authors in [25]. The Persistence Layer stores information regarding the available tourist attractions, the public transport network, as well as user information. The POI Data database stores for all the attractions: the name, a short and a long description, photos, the location and the opening hours. The User Data database stores the user information as well as the tourist attractions ratings used by the recommender algorithm. A separate database is used to store the data required by the public transport route finding algorithm. Using External Data Providers, such as external web services, the system collects weather forecasts, retrieves the changes in the list of available tourist attractions and updates the data for the public transport network. An Intelligent Agents Module has the purpose of monitoring the users’ behavior as well as of notifying them of any context changes, like weather changes or changes in the opening hours of the attractions included in the itinerary. MOBILE USER INTERFACE The reference client implementation for mobile devices shown in Fig. 5 relies on the latest standard web technologies such as WebSQL and WebStorage to offer portability across different platforms as well as a rich user experience. W3C GeoLocation API was used to determine and monitor the position of the user. The application can either be used directly from the browser, or can be installed using a thin native wrapper that provides the required translation from JavaScript method calls to native method calls. The application works on all devices that comply with the HTML5 standard specifications Fig. 4 Distributed architecture INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 375
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 including both smartphones and feature phones sourcing approach was implemented in order to determine the The steps that a user should perform in order to build an real visiting duration for the users of the application. For each Itinerary are presented below user, the system constantly updates the difference for each Step /-Create the list of tourist attractions category of Pol between the current user's visiting duration and the average duration. This information is used to bette search for POI as described in this paper or can select from a Public transport route finding is also included in the list of suggested tourist attractions. The list of suggestions is application in order to allow the user to easily travel between generated using a collaborative filtering algorithm that also the locations selected in the itinerary takes into consideration the current context including weather time, week day and season. The tourist attractions can either be added manually to the itinerary, corresponding to the Itinerary POlselected set, or can be added to the favorites list, corresponding to the POlFavorites set that are used as input Start time: 10:00v parameters for the multi-objective itinerary building algorithm. As shown in Fig. 6, users can change the start time Finish time: Ts, the finish time -TF, the start and finish locations -Ls, LE Location: STIBL Online Multiple itineraries ON Itinerary Using only WishList With Auto Fill Fig. 6 Itinerary settings screen Suggestions Where 2Go A comparison between native and mobile web applications built using the latest standard web technologies is provided in Table I. As shown also in the table, developing native versions of an application for several mobile platforms requires more time and increases the costs due to the different development Context Profile approaches characteristic for each platform Table I Comparison between Native and Mobile Web applications Smarter Transportation e Applications based Applications on web technolo Fig. 5 Mobile HTml5 interface ortable across Step 2-Choosing the itinerary Can run in the ne or several itineraries will be generated. If the user chooses i Can be installed to generate multiple itineraries, a number of Nsol possible itineraries are generated on the server using the multi- Technologies the same for all objective itinerary algorithm. The itineraries are displayed on different for platforms he mobile phone including the score for the 3 optimization each platform(HTML5, CSS3 criteria Cl, 2, C3. As shown in the fourth section, the Local persistence system used previously recorded data to predict the visit Possibility to work duration for the current user offline Step 3-Guidance during the tour Advanced hardware features The selected itinerary can be viewed either as a list or using Res the map. In case the tourist exceeds the average duration for esource better worse performance some of the points of interest, the itinerary is dynamically Intensive erformance updated. Moreover, as shown in the fourth section, a crowd
including both smartphones and feature phones. The steps that a user should perform in order to build an itinerary are presented below. Step 1 - Create the list of tourist attractions In order to build an itinerary, the user can both semantically search for POI as described in this paper or can select from a list of suggested tourist attractions. The list of suggestions is generated using a collaborative filtering algorithm that also takes into consideration the current context including weather, time, week day and season. The tourist attractions can either be added manually to the itinerary, corresponding to the set, or can be added to the favorites list, corresponding to the set that are used as input parameters for the multi-objective itinerary building algorithm. As shown in Fig. 6, users can change the start time - , the finish time - , the start and finish locations - , . Fig. 5 Mobile HTML5 interface Step 2 – Choosing the itinerary The itinerary configuration screen also allows choosing if one or several itineraries will be generated. If the user chooses to generate multiple itineraries, a number of possible itineraries are generated on the server using the multiobjective itinerary algorithm. The itineraries are displayed on the mobile phone including the score for the 3 optimization criteria C1, C2, C3. As shown in the fourth section, the system used previously recorded data to predict the visit duration for the current user. Step 3 – Guidance during the tour The selected itinerary can be viewed either as a list or using the map. In case the tourist exceeds the average duration for some of the points of interest, the itinerary is dynamically updated. Moreover, as shown in the fourth section, a crowd sourcing approach was implemented in order to determine the real visiting duration for the users of the application. For each user, the system constantly updates the difference for each category of POI between the current user’s visiting duration and the average duration. This information is used to better predict the visiting durations in future itineraries. Public transport route finding is also included in the application in order to allow the user to easily travel between the locations selected in the itinerary. Fig. 6 Itinerary settings screen A comparison between native and mobile web applications built using the latest standard web technologies is provided in Table I. As shown also in the table, developing native versions of an application for several mobile platforms requires more time and increases the costs due to the different development approaches characteristic for each platform Table I Comparison between Native and Mobile Web Applications Native Applications Applications based on web technologies General Portable across many platforms no yes Can run in the browser no yes Can be installed yes yes Development Technologies different for each platform the same for all platforms (HTML5, CSS3, JavaScript) Local persistence yes yes Possibility to work offline yes yes Advanced hardware features yes yes Resource intensive computations better performance worse performance INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 376
NTERNATIONAL JOURNAL OF COMPUTERS ssue 3. Volume 5. 2011 Others Information and Communication Technologies in Tourism, 2009, pp Accepted in Application Stores 9R. Kramer, M. Modsching, K. Hagen, and U. Gretzel, "Behavioural impacts of mobile tour guides, Information and Communication Technologies in Tourism, 2007, pp. 109-118 [10] A. Maruyama, N. Shibata, Y. Murata, K. Yasumoto, and M. Ito, " P Mobile web applications can be installed on mobile devices tour: A personal navigation system for tourism, Proc. of I Ith World Congress on ITS, 2004, pp. 18-21 Ising a thin native wrapper. When running directly from the [11] P. Vansteenwegen, W. Souffriau, G. Vanden Berghe, and D.Van browser the local storage is limited to approximately 10Mb terated local search for the team orienteering asily be avoided by using a native erations Research, vol rapper around the web application [12]R. Kramer, M. Modsching, K. Hagen, and U. Gretzel, of mobile tour guides, Information and Co CONCLUSIONS Technologies in Tourism, 2007, P. 109-1l8 [13]C Dang, X WordNet-based Document Summarization" in proc In this paper we presented an itinerary recommender system Tth WSEAs International Conference on Applied Computer Applied that uses both semantic search and collaborative filtering to Computational Science (ACACOS 08), Hangzhou, China, 2008, pp allow users to find interesting Points of Interest. A multi- [141. Smeureanu, A.Diosteanu, C.Delcea, and LA.Cotfas,"Busines objective itinerary hybrid genetic algorithm that allows finding Ontology for Evaluating Corporate Social Responsibility, Amfiteatru routes in near real-time was also presented. The novelty of our Economic, vol. 29, pp. 28-42, 2011 approach consists of the fact that we use neuronal network [15] Taeho, J,"Neural Based Approach to Keyword Extraction from Documents, Computational Science and Its Applications, Vol. 2667 document indexing based on key word extraction and also a Springer Berlin, pp 263-270, 2003 modified version of the multi objective genetic algorithm. This [6K. Sarkar, M. Nasipuri S,, Ghose "A New Approach to keyphrase enables users to easily generate itineraries in a very flexible cience Issues, Vol, 7, Issue 2, No 3, pp 16-25, 2010 and dynamic manner with less effort and input. The [7] O. Medelyan, competitive automatic topic indexing, PhD application takes into considerations user's preferences and Thesis, University of Waikato, New Zeeland, 2009 profile. A crowd sourcing approach was implemented in order [18 The Stanford Natural Language Processing Group,"The Stanford to compare the differences between the estimated and real rser.shtml. [Accessed: January. 15, 2010 times need to travel between the tourist attractions. The [19] GATE,"General Architecture for Text Engine ”201l. Online semantic search approach allows users to find POls related to [20) University of Waikato,"Weka,2011.[Online).Available highly specific information [21/fap://www.cs.waikatoac.nz/ml/weka/.[accessed:January.15,2010] Suchanek, G. Kasneci, and G. Weikum, "Yago: a core of semantic ACKNOWLEDGMENT knowledge, Proceedings of the 16th international conference on world Wide Web, New York, 2007, pp 697-706 L A Cotfas is grateful to Professor Ricci for the very useful [22] M Kenteris, D Gavalas, G. Pantziou, and C. Konstantopoulos, "Near. suggestions and to the free University of Bozen-Bolzano for ptimal personalized daily itineraries for a mobile tourist guide, IEEE Symposium on Computers and Communications, pp. 862-864), 2010 kind hospitality during the time this work was carried out 23]K. Deb, A. Pratap, s. Agarwal, and T Meyarivan,"A fast and elitist [24/L sationary Computation,vol6, pp182-197,2002 Cotfas, A. Diosteanu, and A. Smeureanu, L " Fractal web service REFERENCES framework". 8th International Conference [1 YH Kuo, C S Lee, S-M Guo, and FT Tu, Apply FNN Model to Communications, 2010, pp 405-408) Construct Ontology-based O &A System,"WSEAS Transactions on [25] L.A. Cotfas, M.C. Croicu, and D. Cotfas "A Collaborative GIS Solution Commumications, vol 3, Issue 1, pp 328-335, Jan 2004 [] F. Ricci, L. Rokach, L, and B. Shapira, B. Recommender Systems 2009 Handbook, Springer-Verlag, 2011, chl 3C. Vassiliou, D. Stamoulis, D. Martakos"A Recommender System Framework combining Neural Networks Collaborative Filtering",in IS Int. Conf. on Instrumentation, Measurement ircuits and Systems, Hangzhou, China, 2006, Pp 285-290 [4]HF Wang, C.T. Wu, A Strategy-Oriented Operation Module for Recommender Systems in E Commerce", in Proc of the th WsEAS (AlC 09), Moscow, Russia, 2009, pp. 78-83 []G. Abowd, C. Atkeson, J. Hong, S. Long, R Cyberguide: A mobile context tns are tour guide, vol.3,pp.421-433,1997 [6]K. Cheverst, N. Davies, and K. Mitchell, "The role of adaptive hypermedia in a contexl-cmware tourist guide, Communications of the ACM: Special Issue on Adaptive Web-Based Systems and Adaptive Hypermedia, No 45, 2002, pp. 47-51 [w. Souffriau, P. Vansteenwegen, J. Vertommen, G.V. Berg Oudheusden, "A personalised tourist trip design algorithm tourist guides, Applied Artificial Intelligence No. 2 8A. Garcia, M.T. Linaza, O. Arbela Vansteenwegen Intelligent routing system for a personalised electronic tourist guide 7
Others Accepted in Application Stores yes yes* Easy to update no yes Mobile web applications can be installed on mobile devices using a thin native wrapper. When running directly from the browser the local storage is limited to approximately 10Mb. This limitation can easily be avoided by using a native wrapper around the web application. CONCLUSIONS In this paper we presented an itinerary recommender system that uses both semantic search and collaborative filtering to allow users to find interesting Points of Interest. A multiobjective itinerary hybrid genetic algorithm that allows finding routes in near real-time was also presented. The novelty of our approach consists of the fact that we use neuronal network document indexing based on key word extraction and also a modified version of the multi objective genetic algorithm. This enables users to easily generate itineraries in a very flexible and dynamic manner with less effort and input. The application takes into considerations user’s preferences and profile. A crowd sourcing approach was implemented in order to compare the differences between the estimated and real times need to travel between the tourist attractions. The semantic search approach allows users to find POIs related to highly specific information. ACKNOWLEDGMENT L.A. Cotfas is grateful to Professor Ricci for the very useful suggestions and to the Free University of Bozen-Bolzano for kind hospitality during the time this work was carried out. REFERENCES [1] Y.H Kuo, C.S Lee, S-M Guo, and F.T Tu, “Apply FNN Model to Construct Ontology-based Q&A System,” WSEAS Transactions on Communications, vol. 3, Issue 1, pp. 328-335, Jan. 2004 [2] F. Ricci, L. Rokach, L., and B. Shapira, B. Recommender Systems Handbook, Springer-Verlag, 2011, ch1. [3] C. Vassiliou, D. Stamoulis, D. Martakos "A Recommender System Framework combining Neural Networks & Collaborative Filtering", in Proc. of the 5th WSEAS Int. Conf. on Instrumentation, Measurement, Circuits and Systems, Hangzhou, China, 2006, pp.285-290 [4] H.F Wang, C.T. Wu, "A Strategy-Oriented Operation Module for Recommender Systems in E Commerce",in Proc of the 9th WSEAS International Conference on Applied Informatics and Communications (AIC '09),Moscow, Russia, 2009, pp. 78-83 [5] G. Abowd, C. Atkeson, J. Hong, S. Long, R. Kooper, M. Pikerton, “Cyberguide: A mobile context‐aware tour guide,” Wireless Networks, vol. 3, pp. 421-433, 1997. [6] K. Cheverst, N. Davies, and K. Mitchell, “The role of adaptive hypermedia in a context-aware tourist guide,” Communications of the ACM: Special Issue on Adaptive Web-Based Systems and Adaptive Hypermedia, No. 45, 2002, pp. 47-51. [7] W. Souffriau, P. Vansteenwegen, J. Vertommen, G.V. Berghe, D.V. Oudheusden, “A personalized tourist trip design algorithm for mobile tourist guides,” Applied Artificial Intelligence No. 22, pp. 964-985, 2008. [8] A. Garcia, M.T. Linaza, O. Arbelaitz, and P. Vansteenwegen, “Intelligent routing system for a personalised electronic tourist guide,” Information and Communication Technologies in Tourism, 2009, pp. 185–197. [9] R. Kramer, M. Modsching, K. Hagen, and U. Gretzel, “Behavioural impacts of mobile tour guides,” Information and Communication Technologies in Tourism, 2007, pp. 109–118. [10] A. Maruyama, N. Shibata, Y. Murata, K. Yasumoto, and M. Ito, “Ptour: A personal navigation system for tourism,” Proc. of 11th World Congress on ITS, 2004, pp. 18–21. [11] P. Vansteenwegen, W. Souffriau, G. Vanden Berghe, and D. Van Oudheusden, “Iterated local search for the team orienteering problem with time windows,” Computers & Operations Research, vol. 36, 2009, pp. 3281-3290. [12] R. Kramer, M. Modsching, K. Hagen, and U. Gretzel, “Behavioural impacts of mobile tour guides,” Information and Communication Technologies in Tourism, 2007, p. 109–118. [13] C. Dang, X. Luo,"WordNet-based Document Summarization", in Proc. 7th WSEAS International Conference on Applied Computer & Applied Computational Science (ACACOS '08), Hangzhou, China, 2008, pp. 383-387 [14] I. Smeureanu, A. Diosteanu, C. Delcea, and L.A. Cotfas, “Busines Ontology for Evaluating Corporate Social Responsibility,” Amfiteatru Economic, vol. 29, pp. 28-42, 2011. [15] Taeho, J., “Neural Based Approach to Keyword Extraction from Documents,” Computational Science and Its Applications, Vol. 2667, Springer Berlin, pp 263-270, 2003. [16] K. Sarkar., M. Nasipuri, S. Ghose, “A New Approach to Keyphrase Extraction Using Neural Networks,” International Journal of Computer Science Issues, Vol. 7, Issue 2, No 3, pp 16-25, 2010. [17] O. Medelyan, Human-competitive automatic topic indexing, PhD Thesis, University of Waikato, New Zeeland, 2009. [18] The Stanford Natural Language Processing Group, “The Stanford Parser,” 2011. [Online]. Available: http://nlp.stanford.edu/software/lexparser.shtml . [Accessed: January. 15, 2010]. [19] GATE, “General Architecture for Text Engineering,” 2011. [Online]. Available: http://gate.ac.uk/.[Accessed: January. 15, 2010]. [20] University of Waikato, "Weka," 2011. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/. [Accessed: January. 15, 2010]. [21] F.M. Suchanek, G. Kasneci, and G. Weikum, “Yago: a core of semantic knowledge,” Proceedings of the 16th international conference on World Wide Web, New York, 2007, pp. 697-706. [22] M. Kenteris, D. Gavalas, G. Pantziou, and C. Konstantopoulos, “Nearoptimal personalized daily itineraries for a mobile tourist guide,” IEEE Symposium on Computers and Communications, pp. 862–864), 2010. [23] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182-197, 2002. [24] L.A. Cotfas, A. Dioșteanu, and A. Smeureanu, I. “Fractal web service composition framework”. 8th International Conference on Communications, 2010, pp. 405-408). [25] L.A. Cotfas, M.C. Croicu, and D. Cotfas “A Collaborative GIS Solution for Public Transport”. Informatica Economică, Vol. 13(2), pp. 50-58, 2009 INTERNATIONAL JOURNAL OF COMPUTERS Issue 3, Volume 5, 2011 377