Three Essential Techniques for Similar Documents

1. Shingling: convert documents, emails, etc., to sets.
2. Minhashing: convert large sets to short signatures, while preserving similarity.
3. Locality-sensitive hashing: focus on pairs of signatures likely to be similar.

Introduction
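To make the three techniques listed on this slide concrete, here is a minimal Python sketch of how they could chain together. Everything in it is an illustrative assumption rather than something the slide specifies: the function names (`shingles`, `minhash_signature`, `lsh_candidate_pairs`), character-level shingles of length `k=5`, Jaccard similarity as the similarity being preserved, 100 salted BLAKE2b hash functions, and a banding scheme (20 bands of 5 rows) for the locality-sensitive hashing step.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Step 1 (shingling): the set of all k-character substrings of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(seed, item):
    """Deterministic 64-bit hash; each seed acts as a different hash function."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_hashes=100):
    """Step 2 (minhashing): keep the minimum hash value per hash function.

    Assumes shingle_set is non-empty (text has at least k characters).
    """
    return [min(_hash(seed, s) for s in shingle_set)
            for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree (estimates Jaccard similarity)."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands=20):
    """Step 3 (LSH, banding variant): documents whose signatures agree on
    every row of at least one band become a candidate pair."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(doc_id)
    candidates = set()
    for ids in buckets.values():
        candidates.update(combinations(sorted(ids), 2))
    return candidates


if __name__ == "__main__":
    # Hypothetical example documents.
    docs = {
        "a": "the quick brown fox jumps over the lazy dog",
        "b": "the quick brown fox jumped over the lazy dog",
        "c": "minhashing converts large sets to short signatures",
    }
    sigs = {name: minhash_signature(shingles(text)) for name, text in docs.items()}
    print(lsh_candidate_pairs(sigs))
    print(estimated_similarity(sigs["a"], sigs["b"]))
```

Only the pairs returned by `lsh_candidate_pairs` would then need an exact similarity check, which is the point of the third step: most dissimilar pairs are never compared at all. The number of bands and rows is a tuning choice that trades false positives against false negatives.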