Search Sciweavers | Sciweavers

609 search results - page 29 / 122

» Adaptive record extraction from web pages

ICMCS 2005, IEEE

Semantic Knowledge Building for Image Database by Analyzing Web Page Contents

15 years 7 months ago

Download www.cecs.uci.edu

In this paper, we present a method of semantic knowledge building for image database by extracting semantic meanings from Web page contents. The novelty of our method is that it i...

Yung-Kwang Lai, Song Liu, Liang-Tien Chia, Syin Ch...

WIDM 2003, ACM

Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites

15 years 7 months ago

Download www.public.asu.edu

The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomy-dire...

Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan

WWW 2010, ACM

Automatic extraction of clickable structured web contents for name entity queries

15 years 9 months ago

Download research.microsoft.com

Today the major web search engines answer queries by showing ten result snippets, which need to be inspected by users for identifying relevant results. In this paper we investigat...

Xiaoxin Yin, Wenzhao Tan, Xiao Li, Yi-Chin Tu

AIRWEB 2007, Springer

Extracting Link Spam using Biased Random Walks from Spam Seed Sets

15 years 8 months ago

Download airweb.cse.lehigh.edu

Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link based ranking algorithms such ...

Baoning Wu, Kumar Chellapilla

LREC 2008

A Lightweight and Efficient Tool for Cleaning Web Pages

15 years 3 months ago

Download www.lrec-conf.org

Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...

Stefan Evert
