Abstract. Existing methods for text plagiarism analysis are mainly based on “chunking”, a process of grouping a text into meaningful units, each of which gets encoded by an integer nu...
Similarity search and data mining often rely on distance or similarity functions in order to provide meaningful results and semantically meaningful patterns. However, standard dist...
Tobias Emrich, Franz Graf, Hans-Peter Kriegel, Mat...
This paper presents two metrics for the Nearest Neighbor Classifier that share the property of being adapted, i.e. learned, on a set of data. Both metrics can be used for similari...
Self-organizing maps (SOMs) are widely used in several fields of application, from neurobiology to multivariate data analysis. In that context, this paper presents variants of the...
High-dimensional indexing has been widely used for performing similarity search over various data types such as multimedia (audio/image/video) databases, document collectio...
Rahul Malik, Sangkyum Kim, Xin Jin, Chandrasekar R...