Sciweavers

609 search results - page 29 / 122
» Adaptive record extraction from web pages
ICMCS
2005
IEEE
15 years 7 months ago
Semantic Knowledge Building for Image Database by Analyzing Web Page Contents
In this paper, we present a method of semantic knowledge building for image database by extracting semantic meanings from Web page contents. The novelty of our method is that it i...
Yung-Kwang Lai, Song Liu, Liang-Tien Chia, Syin Ch...
WIDM
2003
ACM
15 years 7 months ago
Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomydire...
Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan
WWW
2010
ACM
15 years 9 months ago
Automatic extraction of clickable structured web contents for name entity queries
Today the major web search engines answer queries by showing ten result snippets, which need to be inspected by users for identifying relevant results. In this paper we investigat...
Xiaoxin Yin, Wenzhao Tan, Xiao Li, Yi-Chin Tu
AIRWEB
2007
Springer
15 years 8 months ago
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link based ranking algorithms such ...
Baoning Wu, Kumar Chellapilla
LREC
2008
15 years 3 months ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert