Filter-and-refine methods
● Basic idea
  o Filter a large number of dissimilar string pairs
  o Verify the remaining potentially similar pairs
● Drawbacks
  o Need to tune parameters
  o Bad for short strings
1/29/2021 PassJoin @ VLDB2012
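The two steps above can be sketched as follows. This is a minimal illustration, not PassJoin itself and not any specific system from the slide: it assumes an edit-distance threshold `tau`, and uses a length filter plus a q-gram count filter as the filtering step. The function names and the default `q=2` are hypothetical choices for the example.

```python
from collections import Counter
from itertools import combinations


def qgrams(s, q):
    """Multiset of all substrings of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def edit_distance(s, t):
    """Standard dynamic-programming Levenshtein distance."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


def filter_and_refine(strings, tau, q=2):
    """Return all pairs with edit distance <= tau (assumed threshold)."""
    results = []
    for s, t in combinations(strings, 2):
        # Filter 1: lengths differing by more than tau cannot match.
        if abs(len(s) - len(t)) > tau:
            continue
        # Filter 2: count filter. Pairs within tau edits share at least
        # max(|s|, |t|) - q + 1 - q * tau q-grams.
        need = max(len(s), len(t)) - q + 1 - q * tau
        if need > 0:
            shared = sum((qgrams(s, q) & qgrams(t, q)).values())
            if shared < need:
                continue
        # Refine: verify the surviving candidate pair exactly.
        if edit_distance(s, t) <= tau:
            results.append((s, t))
    return results


pairs = filter_and_refine(["vldb", "pvldb", "sigmod", "sigir"], tau=1)
```

Both drawbacks are visible in this sketch. The gram length `q` has to be chosen by hand, and for short strings the bound `need` drops to zero or below, so the count filter prunes nothing and every such pair goes to verification.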