While the vast majority of clustering algorithms are partitional, many real-world datasets have inherently overlapping clusters. Several approaches to finding overlapping clusters...
We present an unusual algorithm involving classification trees, CARTwheels, in which two trees are grown in opposite directions so that they are joined at their leaves. This approach ...
Identifying approximately duplicate records in databases is an essential step in data cleaning and data integration processes. Most existing approaches have relied...
Classification is a well-established operation in text mining. Given a set of labels A and a set D_A of training documents tagged with these labels, a classifier learns to assign l...
Recent HCI research shows a strong interest in task management systems (e.g. [19, 27]) that support the multitasked nature of information work [13]. These systems either require m...