Language Modeling (LM) has been successfully applied to Information Retrieval (IR). However, most existing LM approaches rely only on term occurrences in documents, queries...
Jing Bai, Dawei Song, Peter Bruza, Jian-Yun Nie, G...
The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identif...
Content-oriented XML retrieval systems support access to XML repositories by retrieving, in response to user queries, XML document components (XML elements) instead of wh...
Spectral co-clustering is a generic method of computing co-clusters of relational data, such as sets of documents and their terms. Latent semantic analysis is a method of ...
Laurence A. F. Park, Christopher Leckie, Kotagiri ...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been heavily exploited in many monolingual...