Similarity Based Retrieval

- Similarity based retrieval: retrieve documents similar to a given document
  - Similarity may be defined on the basis of common words
    - E.g., find the k terms in document A with the highest TF(A, t) / n(t) and use these terms to find the relevance of other documents
- Relevance feedback: similarity can be used to refine the answer set of a keyword query
  - The user selects a few relevant documents from those retrieved by the keyword query, and the system finds other documents similar to these
- Vector space model: define an n-dimensional space, where n is the number of words in the document set
  - The vector for document d goes from the origin to a point whose i-th coordinate is TF(d, t) / n(t), where t is the i-th word
  - The cosine of the angle between the vectors of two documents is used as a measure of their similarity

Database System Concepts - 6th Edition    21.8    ©Silberschatz, Korth and Sudarshan
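The vector space model above can be sketched in a few lines of Python. This is an illustrative sketch, not the textbook's implementation. It assumes TF(d, t) is the raw count of term t in document d, takes n(t) to be the number of documents containing t, and splits text on whitespace. The function names (`build_vectors`, `top_k_terms`, `cosine`, `most_similar`) and the sample documents are hypothetical.

```python
import math
from collections import Counter


def build_vectors(docs):
    """Map each document name to a sparse vector {term: TF(d, t) / n(t)}.

    Assumption: TF(d, t) is the raw count of t in d, and n(t) is the
    number of documents that contain t.
    """
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    n = Counter()
    for term_counts in counts.values():
        n.update(term_counts.keys())  # each document counts once per term
    return {
        name: {t: tf / n[t] for t, tf in term_counts.items()}
        for name, term_counts in counts.items()
    }


def top_k_terms(vector, k):
    """Return the k terms with the highest weight (ties broken alphabetically)."""
    return sorted(vector, key=lambda t: (-vector[t], t))[:k]


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(vectors, name):
    """Rank all other documents by cosine similarity to the named document."""
    others = [(other, cosine(vectors[name], vec))
              for other, vec in vectors.items() if other != name]
    return sorted(others, key=lambda pair: -pair[1])


# Hypothetical sample documents for illustration only.
docs = {
    "A": "database query index database",
    "B": "database index storage",
    "C": "football match score",
}
vectors = build_vectors(docs)
ranking = most_similar(vectors, "A")
```

Here `ranking` lists B ahead of C, because A and B share the words "database" and "index" while A and C share none. For relevance feedback, the same ranking could be computed against each document the user marks as relevant.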