Sciweavers

1014 search results - page 21 / 203
» Using Keyword Extraction for Web Site Clustering
AIRWEB
2007
Springer
15 years 3 months ago
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link based ranking algorithms such ...
Baoning Wu, Kumar Chellapilla
ISCIS
2009
Springer
15 years 2 months ago
PopulusLog: People information database
Information about individuals on publicly available web sites stands as a valuable, yet unorganized, data source. Turning such an enormous data source into a “database” is h...
Ali Cakmak, Mustafa Kirac, Gultekin Özsoyoglu
WWW
2007
ACM
15 years 10 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
AAAI
2008
14 years 12 months ago
An Unsupervised Approach for Product Record Normalization across Different Web Sites
An unsupervised probabilistic learning framework for normalizing product records across different retailer Web sites is presented. Our framework decomposes the problem into two ta...
Tak-Lam Wong, Tik-Shun Wong, Wai Lam
JUCS
2008
14 years 9 months ago
Exploring Information Extraction Resilience
There are many challenges developers face when attempting to reliably extract data from the Web. One of these challenges is the resilience of the extraction system to changes in ...
Dawn G. Gregg