Sciweavers

» Using the Structure of HTML Documents to Improve Retrieval
WWW
2009
ACM
16 years 4 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
KDD
2008
ACM
16 years 4 months ago
Training structural svms with kernels using sampled cuts
Discriminative training for structured outputs has found increasing applications in areas such as natural language processing, bioinformatics, information retrieval, and computer ...
Chun-Nam John Yu, Thorsten Joachims
WWW
2010
ACM
15 years 10 months ago
Sampling high-quality clicks from noisy click data
Click data captures many users’ document preferences for a query and has been shown to help significantly improve search engine ranking. However, most click data is noisy and of...
Adish Singla, Ryen W. White
CORIA
2010
15 years 5 months ago
Impact de l'information visuelle pour la Recherche d'Images par le contenu et le contexte
Multimedia documents are increasingly used, which calls for developing models suited to that kind of data. In this paper we present a multimedia model which combines textual and visual inform...
Christophe Moulin, Christine Largeron, Mathias G&e...
SIGIR
2008
ACM
15 years 3 months ago
Knowledge transformation from word space to document space
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
Tao Li, Chris H. Q. Ding, Yi Zhang 0005, Bo Shao