Sciweavers

» Measuring Generality of Documents
SIGIR
2008
ACM
15 years 4 months ago
Pagerank based clustering of hypertext document collections
Clustering hypertext document collections is an important task in Information Retrieval. Most clustering methods are based on document content and do not take into account the hype...
Konstantin Avrachenkov, Vladimir Dobrynin, Danil N...
KDD
2002
ACM
16 years 5 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho
WWW
2005
ACM
16 years 5 months ago
Multichannel publication of interactive media documents in a news environment
Multichannel publication of multimedia presentations poses a significant challenge for the generic description of the presentation content and the system necessary to convert these...
Tom Beckers, Nico Oorts, Filip Hendrickx, Rik Van ...
SIGIR
2009
ACM
15 years 11 months ago
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce
This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
Jimmy J. Lin
ICIP
2008
IEEE
15 years 11 months ago
MAP-MRF approach for binarization of degraded document image
We propose an algorithm for the binarization of document images degraded by uneven light distribution, based on the Markov Random Field modeling with Maximum A Posteriori probabil...
Jung Gap Kuk, Nam Ik Cho, Kyoung Mu Lee