Sciweavers

924 search results - page 4 / 185
» Measuring Information Understanding in Large Document Collec...
ECIR
2007
Springer
14 years 11 months ago
Entropy-Based Authorship Search in Large Document Collections
The purpose of authorship search is to identify documents written by a particular author or in a particular style in large document collections. Standard search engines match docum...
Ying Zhao, Justin Zobel
IJDAR
2002
14 years 9 months ago
Document understanding for a broad class of documents
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...
Marco Aiello, Christof Monz, Leon Todoran
CIKM
2000
Springer
15 years 2 months ago
A Semi-Supervised Document Clustering Technique for Information Organization
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organize docum...
Han-joon Kim, Sang-goo Lee
KDD
2007
ACM
15 years 10 months ago
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
WISE
2005
Springer
15 years 3 months ago
Document Re-ranking by Generality in Bio-medical Information Retrieval
Document ranking is well known to be a crucial process in information retrieval (IR). It presents retrieved documents in order of their estimated degree of relevance to the query. ...
Xin Yan, Xue Li, Dawei Song