CCF-ADL at Zhengzhou University, June 25-27, 2010

Course Objectives
• Introduce information retrieval (IR) and Search Engine (SE)
  – Foundation: Basic concepts, principles, methods, etc.
  – Trends: Frontier topics
• Prepare students to do research in IR and/or related fields

Course Management
• Course website:
  – http://net.pku.edu.cn/~course/2010-ccf-acl/
• Course group discussion:
  – http://groups.google.com.hk/group/cs410pku
• Questions:
  – Post the questions on the group discussion forum

What is Text Info. Management?
• TIM is concerned with technologies for managing and exploiting text information effectively and efficiently
• Importance of managing text information
  – The most natural way of encoding knowledge
    • Think about scientific literature
  – The most common type of information
    • How much textual information do you produce and consume every day?
  – The most basic form of information
    • It can be used to describe other media of information
  – The most useful form of information!

Text Management Applications
• Access: Select information
• Mining: Create Knowledge
• Organization: Add Structure/Annotations

Examples of Text Management Applications
• Search
  – Web search engines (Google, Yahoo, …)
  – Library systems
• Recommendation
  – News filter
  – Literature/movie recommender
• Categorization
  – Automatically sorting emails
• Mining/Extraction
  – Discovering major complaints from email in customer service
  – Business intelligence
  – Bioinformatics
• Many others…

Elements of Text Info Management Technologies
[Figure: layered diagram. Bottom layers: Text; Natural Language Content Analysis. Middle layers: Information Access, Information Organization, Knowledge Acquisition, with the components Search, Filtering, Categorization, Clustering, Extraction, and Mining. Top layer: Retrieval Applications, Summarization, Visualization, Mining Applications. One region is marked "Focus of the course".]

Text Management and Other Areas
[Figure: diagram linking User, TM Applications, TM Algorithms, and Text to related areas: Human-computer interaction, Information Science, Software engineering, Web, Probabilistic inference, Machine learning, Natural language processing, Computer science, Storage, and Compression.]

Related Areas
[Figure: diagram relating Information Retrieval to Databases, Info Science, Machine Learning, Pattern Recognition, Data Mining, Natural Language Processing, Applications (Web, Bioinformatics…), Statistics, Optimization, Software engineering, and Computer systems, with the labels Models, Algorithms, Applications, and Systems.]

Publications/Societies (Incomplete)
[Figure: map of venues by area. Areas: Info Retrieval, Databases, NLP, Learning/Mining, Applications, Statistics, Info. Science, Software/systems. Venues shown: ACM SIGIR, ACM CIKM, TREC, ACM SIGMOD, VLDB, PODS, ICDE, ACL, HLT, COLING, EMNLP, ANLP, ICML, NIPS, UAI, ACM SIGKDD, AAAI, WWW, WSDM, RECOMB, PSB, JCDL, ASIS, SOSP, OSDI.]

Where to Publish IR Papers
• Core IR conferences:
  – ACM SIGIR, ACM CIKM
  – ECIR, AIRS
• Core IR journals
  – ACM TOIS, IRJ
  – IPM, JASIS
• Web Applications
  – WWW, WSDM
• Other related conferences
  – Natural Language Processing: HLT, ACL, NAACL, COLING, EMNLP
  – Machine Learning: ICML, NIPS
  – Data Mining: KDD, ICDM
  – Databases: SIGMOD, VLDB, ICDE
• …