The purpose of authorship search is to identify documents written by a particular author or in a particular style in large document collections. Standard search engines match docum...
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes docum...
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Document ranking is well known to be a crucial process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. ...