Sciweavers

609 search results - page 16 / 122
» Adaptive record extraction from web pages
Sort
View
WWW
2004
ACM
16 years 11 days ago
Testbed for information extraction from deep web
Search results generated by searchable databases are served dynamically and far larger than the static documents on the Web. These results pages have been referred to as the Deep ...
Yasuhiro Yamada, Nick Craswell, Tetsuya Nakatoh, S...
85
Voted
WWW
2007
ACM
16 years 11 days ago
EPCI: extracting potentially copyright infringement texts from the web
In this paper, we propose a new system extracting potentially copyright infringement texts from the Web, called EPCI. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
WWW
2005
ACM
16 years 11 days ago
Extracting semantic structure of web documents using content and visual information
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Rupesh R. Mehta, Pabitra Mitra, Harish Karnick
HT
2003
ACM
15 years 5 months ago
Extracting evolution of web communities from a series of web archives
Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe evolution of the Web. In this pap...
Masashi Toyoda, Masaru Kitsuregawa
WWW
2006
ACM
16 years 11 days ago
Finding advertising keywords on web pages
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...
Wen-tau Yih, Joshua Goodman, Vitor R. Carvalho