Structure analysis of table-form documents is important because printed documents and electronic documents alike provide only geometrical layout and lexical information explicitly...
In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semanti...
As the World Wide Web transforms from a vehicle of information dissemination and e-commerce transactions into a writable nexus of human collaboration, the Web 2.0 technologies at ...
This paper extends previous work on document retrieval and document type classification, addressing the problem of ‘typed search’. Specifically, given a query and a designated ...
Jun Xu, Yunbo Cao, Hang Li, Nick Craswell, Yalou H...
We present a user-centred, task-oriented, comparative evaluation of two query-based document skimming tools. ProfileSkim bases within-document retrieval on computing a relevance pr...