Sciweavers

1715 search results - page 99 / 343
Document Retrieval using a Probabilistic Knowledge Model
SIGMOD
2009
ACM
15 years 4 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
CIKM
2006
Springer
15 years 1 month ago
A document-centric approach to static index pruning in text retrieval systems
We present a static index pruning method, to be used in ad-hoc document retrieval tasks, that follows a document-centric approach to decide whether a posting for a given term shoul...
Stefan Büttcher, Charles L. A. Clarke
JMMA
2010
14 years 4 months ago
Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function
This paper follows a word-document co-clustering model independently introduced in 2001 by several authors such as I.S. Dhillon, H. Zha and C. Ding. This model consists in creatin...
Charles-Edmond Bichot
ECIR
2009
Springer
15 years 7 months ago
Integrating Proximity to Subjective Sentences for Blog Opinion Retrieval
Opinion finding is a challenging retrieval task, where it has been shown that it is especially difficult to improve over a strongly performing topic-relevance baseline. In this pa...
Rodrygo L. T. Santos, Ben He, Craig Macdonald, Iad...
ICDAR
1999
IEEE
15 years 2 months ago
Document Image Layout Comparison and Classification
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual similarity based document...
Jianying Hu, Ramanujan S. Kashi, Gordon T. Wilfong