Sciweavers

» Indexing Text Documents Based on Topic Identification
VLDB
2002
ACM
14 years 9 months ago
Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over many s...
Panagiotis G. Ipeirotis, Luis Gravano
SIGIR
2006
ACM
15 years 3 months ago
Distributed query sampling: a quality-conscious approach
We present an adaptive distributed query-sampling framework that is quality-conscious for extracting high-quality text database samples. The framework divides the query-based samp...
James Caverlee, Ling Liu, Joonsoo Bae
WWW
2006
ACM
15 years 10 months ago
Finding advertising keywords on web pages
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...
Wen-tau Yih, Joshua Goodman, Vitor R. Carvalho
EWMF
2005
Springer
15 years 3 months ago
Discovering a Term Taxonomy from Term Similarities Using Principal Component Analysis
We show that eigenvector decomposition can be used to extract a term taxonomy from a given collection of text documents. So far, methods based on eigenvector decompositio...
Holger Bast, Georges Dupret, Debapriyo Majumdar, B...
IR
2010
14 years 8 months ago
Learning to rank with (a lot of) word features
In this article we present Supervised Semantic Indexing (SSI) which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word...
Bing Bai, Jason Weston, David Grangier, Ronan Coll...