Search Sciweavers | Sciweavers

311 search results - page 14 / 63

» Cleaning Web Pages for Effective Web Content Mining

WWW
2007
ACM

114views Internet Technology» more WWW 2007»

Homepage live: automatic block tracing for web personalization

16 years 8 days ago

Download www2007.org

The emergence of personalized homepage services, e.g. personalized Google Homepage and Microsoft Windows Live, has enabled Web users to select Web contents of interest and to aggr...

Jie Han, Dingyi Han, Chenxi Lin, Hua-Jun Zeng, Zhe...

SIGIR
2004
ACM

112views Information Technology» more SIGIR 2004»

Web-a-where: geotagging web content

15 years 5 months ago

Download einat.webir.org

We describe Web-a-Where, a system for associating geography with Web pages. Web-a-Where locates mentions of places and determines the place each name refers to. In addition, it as...

Einat Amitay, Nadav Har'El, Ron Sivan, Aya Soffer

WIRI
2005
IEEE

117views Internet Technology» more WIRI 2005»

Extended Link Analysis for Extracting Spatial Information Hubs

15 years 5 months ago

Download www.db.itc.nagoya-u.ac.jp

Recently, web mining that tries to find useful knowledge from the vast amount of web pages has attracted a lot of research interests. Besides, it is becoming an essential task to...

Jianwei Zhang 0002, Yoshiharu Ishikawa, Hiroyuki K...

WWW
2003
ACM

139views Internet Technology» more WWW 2003»

Detecting Near-replicas on the Web by Content and Hyperlink Analysis

16 years 8 days ago

Download nautilus.dii.unisi.it

The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...

Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...

ICDE
2004
IEEE

117views Database» more ICDE 2004»

Probe, Cluster, and Discover: Focused Extraction of QA-Pagelets from the Deep Web

16 years 29 days ago

Download www.cc.gatech.edu

In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...

James Caverlee, Ling Liu, David Buttler
