Search Sciweavers | Sciweavers

80 search results - page 11 / 16

» Web Page Segmentation Based on Gestalt Theory


CICLING
2009
Springer

335 views, Natural Language Processing

Language Identification on the Web: Extending the Dictionary Method

15 years 3 months ago

Download www.fi.muni.cz

Abstract. Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...

Radim Rehurek, Milan Kolkus


KDD
1997
ACM

169 views, Data Mining

Learning to Extract Text-Based Information from the World Wide Web

15 years 3 months ago

Download www.aaai.org

There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...

Stephen Soderland


WWW
2007
ACM

144 views, Internet Technology

Towards domain-independent information extraction from web tables

16 years 6 days ago

Download www2007.org

Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...

Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...


WWW
2004
ACM

166 views, Internet Technology

Link fusion: a unified link analysis framework for multi-type interrelated data objects

16 years 6 days ago

Download research.microsoft.com

Web link analysis has proven to be a significant enhancement for quality based web search. Most existing links can be classified into two categories: intra-type links (e.g., web h...

Wensi Xi, Benyu Zhang, Zheng Chen, Yizhou Lu, Shui...


WWW
2004
ACM

134 views, Internet Technology

Continuous web: a new image-based hypermedia and scape-oriented browsing

16 years 6 days ago

Download www.iw3c2.org

Conventionally, Web pages have been recognized as documents described by HTML. Image data, such as photographs, logos, maps, illustrations, and decorated text, have been treated a...

Hiroya Tanaka, Katsumi Tanaka
