• Given a body of documents, e.g., the Web, find pairs of documents with a lot of text in common, e.g.:
  – Mirror sites, or approximate mirrors.
    • Application: Don’t want to show both in a search.
  – Plagiarism, including large quotations.
  – Similar news articles at many news sites.
    • Application: Cluster articles by “same story.”

Introduction