Sciweavers

1486 search results - page 241 / 298
» A Document as a Small World
KDD
2005
ACM
118 views, Data Mining
16 years 7 days ago
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process for generating the corpus and utilize a novel,...
Mark Sandler
EDBT
2006
ACM
137 views, Database
15 years 12 months ago
IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking
We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query rou...
Sebastian Michel, Matthias Bender, Peter Triantafi...
WSDM
2010
ACM
215 views, Data Mining
15 years 9 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
ESA
2009
Springer
149 views, Algorithms
15 years 6 months ago
Sparse Cut Projections in Graph Streams
Finding sparse cuts is an important tool for analyzing large graphs that arise in practice, such as the web graph, online social communities, and VLSI circuits. When dealing with s...
Atish Das Sarma, Sreenivas Gollapudi, Rina Panigra...
SIGIR
2009
ACM
15 years 6 months ago
Reducing long queries using query quality predictors
Long queries frequently contain many extraneous terms that hinder retrieval of relevant documents. We present techniques to reduce long queries to more effective shorter ones tha...
Giridhar Kumaran, Vitor R. Carvalho