Identification of all objects in a dataset whose similarity is not less than a specified threshold is of major importance for management, search, and analysis of data. Set similari...
The problem of combining the ranked preferences of many experts is an old and surprisingly deep problem that has gained renewed importance in many machine learning, data mining, a...
Determining similarities among multimedia objects is a fundamental task in many content-based retrieval, analysis, mining, and exploration applications. Among state-of-the-art sim...
Audio-based music similarity measures can be applied to automatically generate playlists or recommendations. In this paper spectral similarity is combined with complementary infor...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz