Expert Systems with Applications 37 (2010) 7182–7188
Contents lists available at ScienceDirect
Expert Systems with Applications
journal homepage: www.elsevier.com/locate/eswa

Mining ideas from textual information

Dirk Thorleuchter a,*,1, Dirk Van den Poel b, Anita Prinzie b,c

a Fraunhofer INT, D-53879 Euskirchen, Appelsgarten 2, Germany
b Ghent University, B-9000 Gent, Tweekerkenstraat 2, Belgium
c Visiting Researcher, Manchester Business School, The University of Manchester, Manchester M15-6PB, Booth Street West, UK

Article info

Keywords: Idea mining; Text mining; Text classification; Technology

Abstract

This approach introduces idea mining as the process of extracting new and useful ideas from unstructured text. We use an idea definition from technique philosophy and we focus on ideas that can be used to solve technological problems. The rationale for the idea mining approach is taken over from psychology and cognitive science and follows how persons create ideas. To realize the processing, we use methods from text mining and text classification (tokenization, term filtering methods, Euclidean distance measure etc.) and combine them with a new heuristic measure for mining ideas. As a result, the idea mining approach automatically extracts new and useful ideas from a user-given text. We present these problem solution ideas in a comprehensible way to support users in problem solving. This approach is evaluated with patent data and it is realized as a web-based application, named 'Technological Idea Miner', that can be used for further testing and evaluation.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. Overview

An idea is an image existing or formed in the mind, but it can be written down as textual information.
In recent years, we have seen a continually increasing amount of information. About 80% of all this information is stored in textual form (Gentsch & Hänlein, 1999). Examples are research papers, articles in technical periodicals, reports, documents, web pages etc. These texts possibly contain many new ideas. A new idea is often needed to discover unconventional approaches, e.g. to create a technological breakthrough. However, a manual extraction of new ideas from these masses of texts is time-consuming and costly. Therefore, it is useful to search for new problem solution ideas automatically.

Text mining or knowledge discovery from texts refers generally to the process of extracting interesting information and knowledge from unstructured text (Hotho, Nürnberger, & Paaß, 2005). Referring to this, we introduce idea mining as an automatic process of extracting new and useful ideas from unstructured text using text mining methods.

Creating ideas is a well-known topic that is related to psychology and cognitive science. There, we find many approaches dealing with how persons create ideas, especially for problem solution. Therefore, in Section 2 we focus on a general process of creating problem solution ideas and use it as the rationale for the idea mining approach.

In recent years, data and text mining techniques have been used to explore and analyze huge amounts of available textual data (Coussement & Van den Poel, 2009). Idea mining uses known methods from these techniques and combines them with a new method to create text patterns and a new heuristic measure for mining ideas to realize the rationale. Therefore, we present the processing of the idea mining approach in Section 3 and we introduce this new idea mining measure in Section 4.

A further task of idea mining is to present the extracted ideas in a comprehensible way to the user. Therefore, we focus on results of comprehensibility research and their relations to our task (see Section 5).
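
As a rough illustration of the known text mining building blocks named in the abstract (tokenization, term filtering, and a Euclidean distance measure), the following Python sketch compares a problem text with two other texts. It is only a sketch under stated assumptions: the stop-word list, the example sentences, and all function names are hypothetical, and it does not implement the text-pattern method or the heuristic idea mining measure that the paper introduces in Sections 3 and 4.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"a", "an", "and", "as", "be", "by", "can", "for",
              "in", "is", "it", "of", "the", "to", "with"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_terms(tokens):
    """Drop stop words and tokens shorter than three characters."""
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def term_vector(text):
    """Map a text to a sparse term-frequency vector."""
    return Counter(filter_terms(tokenize(text)))


def euclidean_distance(u, v):
    """Euclidean distance between two sparse term-frequency vectors."""
    terms = set(u) | set(v)
    # Counter returns 0 for terms that are missing from a vector.
    return math.sqrt(sum((u[t] - v[t]) ** 2 for t in terms))


# Hypothetical example texts.
problem = "A lightweight battery is needed to extend the range of the sensor."
candidate = "A lithium polymer battery can extend the range of the sensor."
unrelated = "The committee approved the annual budget report."

d_near = euclidean_distance(term_vector(problem), term_vector(candidate))
d_far = euclidean_distance(term_vector(problem), term_vector(unrelated))
print(d_near, d_far)
```

Under these assumptions, the text that shares terms with the problem text ends up closer to it than the unrelated sentence. A distance of this kind is only one ingredient; how the paper combines it with its own measure is described in the later sections.
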
Additionally, we provide an extensive evaluation to show the success of the idea mining approach and specifically the heuristic idea mining measure (see Section 7).

1.2. Idea definition

We limit our approach to technological language for two reasons. Firstly, technological language is much more standardized than colloquial language (Hoffmann, Kalverkämper, & Wiegand, 1998; Martin-Bautista, Sanches, Serrano, & Vila, 2004). Therefore, we get better results by analyzing technological texts with text mining approaches. Secondly, our idea definition is taken over from technique philosophy (Rohpohl, 1996). There, an idea is

0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2010.04.013

* Corresponding author at: Fraunhofer INT, Appelsgarten 2, 53879 Euskirchen, Germany. Tel.: +49 2251 18305; fax: +49 2251 18 38 305. E-mail address: Dirk.Thorleuchter@int.fraunhofer.de (D. Thorleuchter).
1 PhD Candidate, Ghent University, Belgium.