Sciweavers

» Exploring Digital Libraries with Document Image Retrieval
CIKM
2008
Springer
13 years 7 months ago
Efficient and effective link analysis with precomputed SALSA maps
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...
Marc Najork, Nick Craswell
SIGIR
2003
ACM
13 years 11 months ago
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
Xiang Ji, Hongyuan Zha
JCDL
2005
ACM
13 years 11 months ago
Downloading textual hidden web content through keyword queries
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...
Alexandros Ntoulas, Petros Zerfos, Junghoo Cho
ELPUB
2007
ACM
13 years 9 months ago
Towards an Ontology of ElPub/SciX: A Proposal
A proposal is presented for a standard ontology language defined as ElPub/SciX Ontology, based on the content of a web digital library of conference proceedings. This content, i.e...
Sely Maria de Souza Costa, Cláudio Gottscha...
KDD
2004
ACM
14 years 6 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...