WHIRL
◼ Separate each string into words; each word w is assigned a weight: $v_s(w) = \log(tf_w + 1) \cdot \log(idf_w)$
◼ For example: “AT&T” or “IBM” will have higher weights; “Inc” will have a lower weight, because it appears in many strings.
◼ The cosine similarity of s1 and s2 is defined as: $sim(s_1, s_2) = \dfrac{\sum_j v_{s_1}(j) \cdot v_{s_2}(j)}{\lVert v_{s_1} \rVert_2 \cdot \lVert v_{s_2} \rVert_2}$
◼ For example: “John Smith” and “Mr. John Smith” would have similarity close to one.
◼ Problem: “Compter Science Department” and “Deprtment of Computer Scence” will have zero similarity, because the misspelled strings share no exactly matching word.
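The weighting and cosine similarity above can be sketched in Python as follows. This is only an illustration under stated assumptions, not WHIRL's actual implementation: the tokenizer, the definition of idf as N / df, the handling of words missing from the corpus, and the small `corpus` itself are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def tokenize(s):
    """Lowercase, split on whitespace, strip leading/trailing '.' and ','."""
    tokens = (t.strip(".,") for t in s.lower().split())
    return [t for t in tokens if t]


def build_idf(corpus):
    """Map each word to N / df(word); df counts the strings containing it."""
    df = Counter()
    for s in corpus:
        df.update(set(tokenize(s)))
    n = len(corpus)
    return {w: n / count for w, count in df.items()}


def weights(s, idf, n_docs):
    """v_s(w) = log(tf_w + 1) * log(idf_w) for every word w in s."""
    tf = Counter(tokenize(s))
    # Assumption: a word missing from the corpus is treated as if df = 1.
    return {w: math.log(c + 1) * math.log(idf.get(w, n_docs))
            for w, c in tf.items()}


def similarity(s1, s2, idf, n_docs):
    """Cosine similarity of the two weight vectors."""
    v1, v2 = weights(s1, idf, n_docs), weights(s2, idf, n_docs)
    dot = sum(v1[w] * v2[w] for w in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum(x * x for x in v1.values()))
    norm2 = math.sqrt(sum(x * x for x in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical corpus, used only to estimate idf.
corpus = [
    "Mr. John Smith", "Mr. Alan Jones", "Mr. Peter Brown",
    "Mr. David Clark", "AT&T Inc", "IBM Inc",
]
idf = build_idf(corpus)
n_docs = len(corpus)

print(weights("AT&T Inc", idf, n_docs))
print(similarity("John Smith", "Mr. John Smith", idf, n_docs))
print(similarity("Compter Science Department",
                 "Deprtment of Computer Scence", idf, n_docs))
```

In this toy corpus “Mr.” occurs in most strings, so it contributes little to the vector and the first pair scores close to one. The misspelled pair has no word in common, so the dot product, and therefore the similarity, is zero.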