Search Sciweavers | Sciweavers

385 search results - page 10 / 77

» A language for manipulating clustered web documents results

EMNLP 2009 (Natural Language Processing)

Multilingual Spectral Clustering Using Document Similarity Propagation

Download www.aclweb.org

We present a novel approach for multilingual document clustering using only comparable corpora to achieve cross-lingual semantic interoperability. The method models document colle...

Dani Yogatama, Kumiko Tanaka-Ishii

TREC 2004 (Information Technology)

Language Models for Searching in Web Corpora

Download trec.nist.gov

We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...

Jaap Kamps, Gilad Mishne, Maarten de Rijke

DOCENG 2009, ACM (Document Analysis)

Web document text and images extraction using DOM analysis and natural language processing

Download www.hpl.hp.com

Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing. Parag Mulendra Joshi, Sam Liu. HP Laboratories, HPL-2009-187. Web page text extraction,...

Parag Mulendra Joshi, Sam Liu

WWW 2010, ACM (Internet Technology)

CETR: content extraction via tag ratios

Download www.cs.illinois.edu

We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...

Tim Weninger, William H. Hsu, Jiawei Han

SPIRE 2010, Springer (Information Technology)

Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval

Download wi.dii.uchile.cl

The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...

Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
