Search Sciweavers | Sciweavers

241 search results - page 28 / 49

» Detecting Co-Derivative Documents in Large Text Collections

AIRWEB
2006
Springer

136 views, Internet Technology

Tracking Web Spam with Hidden Style Similarity

15 years 10 months ago

Download airweb.cse.lehigh.edu

Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...

Tanguy Urvoy, Thomas Lavergne, Pascal Filoche

ICDAR
2003
IEEE

127 views, Document Analysis

Identifying Story and Preview Images in News Web Pages

15 years 11 months ago

Download www.cse.salford.ac.uk

The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving various different purposes. Th...

Jianying Hu, Amit Bagga

CIKM
2008
Springer

138 views, Information Technology

Identifying table boundaries in digital documents via sparse line detection

15 years 8 months ago

Download chemxseer.ist.psu.edu

Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...

Ying Liu, Prasenjit Mitra, C. Lee Giles

WWW
2007
ACM

149 views, Internet Technology

Query-driven indexing for peer-to-peer text retrieval

16 years 6 months ago

Download www2007.org

We describe a query-driven indexing framework for scalable text retrieval over structured P2P networks. To cope with the bandwidth consumption problem that has been identified as ...

Gleb Skobeltsyn, Toan Luu, Karl Aberer, Martin Raj...

SIGIR
2005
ACM

176 views, Information Technology

Indexing emails and email threads for retrieval

15 years 11 months ago

Download terpconnect.umd.edu

Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure,...

Yejun Wu, Douglas W. Oard
