Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
Browsing and searching for documents in large, online enterprise document repositories are common activities. While internet search produces satisfying results for most user queri...
Andreas Girgensohn, Frank M. Shipman III, Francine...
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Regulations require businesses to archive many electronic documents for extended periods of time. Given the sheer volume of documents and the response time requirements, documents...
Soumyadeb Mitra, Marianne Winslett, Windsor W. Hsu
The primary function of current Web search engines is essentially relevance ranking at the document level. However, myriad structured information about real-world objects is embed...
Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen,...