In the process of knowledge discovery, workers examine available information in order to make sense of it. By sensemaking, we mean interacting with and operating on the informatio...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
The theoretical characterisation of multiword expressions (MWEs) is tightly connected to their actual occurrences in data and to their representation in lexical resources. We pres...
In this paper, we examine a method for feature subset selection based on Information Theory. Initially, a framework for de ning the theoretically optimal, but computationally intr...
Search engines provide a small window to the vast repository of data they index and against which they search. They try their best to return the documents that are of relevance to...