Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
In this paper, we present a new hypergraph partitioning algorithm that is based on the multilevel paradigm. In the multilevel paradigm, a sequence of successively coarser hypergra...
George Karypis, Rajat Aggarwal, Vipin Kumar, Shash...
Data warehouses and recording systems typically have a large continuous stream of incoming data, that must be stored in a manner suitable for future access. Access to stored recor...
H. V. Jagadish, P. P. S. Narayan, S. Seshadri, S. ...
Abstract. The vision of semantic business processes is to enable the integration and inter-operability of business processes across organizational boundaries. Since different orga...
Christoph Kiefer, Abraham Bernstein, Hong Joo Lee,...
Abstract. Latent Semantic Indexing(LSI) has been proved to be effective to capture the semantic structure of document collections. It is widely used in content-based text retrieval...