Introduction to Text Mining

Thanks to Hongning Wang@UVa for the slides on Text Mining courses; the slides are slightly modified by Lei Chen.

What is “Text Mining”?
• “Text mining, also referred to as text data mining, roughly equivalent to text analytics, refers to the process of deriving high-quality information from text.” - Wikipedia
• “Another way to view text data mining is as a process of exploratory data analysis that leads to heretofore unknown information, or to answers for questions for which the answer is not currently known.” - Hearst, 1999
CS@UVa CS6501: Text Mining

Two different definitions of mining
• Goal-oriented (effectiveness driven)
  – Any process that generates useful results that are non-obvious is called “mining”
  – Keywords: “useful” + “non-obvious”
  – Data isn’t necessarily massive
• Method-oriented (efficiency driven)
  – Any process that involves extracting information from massive data is called “mining”
  – Keywords: “massive” + “pattern”
  – Patterns aren’t necessarily useful

Text mining around us
• Sentiment analysis

Text mining around us
• Document summarization

Text mining around us
• Restaurant/hotel recommendation
[Figure: review pages for Bodo's Bagels and Hilton Times Square, the latter with reviews from the TripAdvisor community.]

Text mining around us
• Text analytics in financial services
[Figure: Facebook sentiment plotted against Facebook stock price from May to August, annotated with events such as the IPO, new sentiment lows, and new share-price lows.]

How to perform text mining?
• As computer scientists, we view it as
  – Text Mining = Data Mining + Text Data
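One way to read "Text Mining = Data Mining + Text Data" is that raw text is first turned into records, and a standard data-mining step is then applied to those records. The sketch below is only an illustration under that reading, not a method from the slides: the helper names `tokenize` and `frequent_pairs`, the toy documents, and the `min_support` threshold are all assumptions. Each document is treated as a set of words, and word pairs that co-occur in enough documents are reported as patterns.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequent_pairs(docs, min_support=2):
    """Return word pairs that co-occur in at least `min_support` documents.

    Hypothetical helper: each document is reduced to its set of distinct
    words (the "text data" step), then pairs are counted across documents
    (a simple frequent-pattern "data mining" step).
    """
    pair_counts = Counter()
    for doc in docs:
        vocab = sorted(set(tokenize(doc)))  # distinct words, stable order
        pair_counts.update(combinations(vocab, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy corpus, invented for illustration only.
docs = [
    "great bagels and great coffee",
    "the bagels were fresh",
    "fresh coffee and friendly staff",
]
pairs = frequent_pairs(docs, min_support=2)
print(pairs)
```

On a corpus this small the only surviving pair is a function word with a content word, which echoes the method-oriented caveat that patterns are not necessarily useful; a practical pipeline would usually filter stop words first.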

Text mining vs. NLP, IR, DM…
• How does it relate to data mining in general?
• How does it relate to computational linguistics?
• How does it relate to information retrieval?

                   Finding Patterns            Finding “Nuggets”
                                               Novel                       Non-Novel
Non-textual data   General data-mining         Exploratory data analysis   Database queries
Textual data       Computational Linguistics   Text Mining                 Information retrieval

Text mining in general
• Access: filter information (serves IR applications)
• Mining: discover knowledge (a sub-area of DM research)
• Organization: add structure/annotations (based on NLP/ML techniques)
