Sciweavers

APWEB
2005
Springer
13 years 6 months ago
An Empirical Study on the Change of Web Pages
As web pages are created, destroyed, and updated dynamically, web databases should be frequently updated to keep web pages up-to-date. Understanding the change behavior of web page...
Sung Jin Kim, Sang Ho Lee
CIKM
2008
Springer
13 years 6 months ago
Achieving both high precision and high recall in near-duplicate detection
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Lian'en Huang, Lei Wang, Xiaoming Li
APWEB
2008
Springer
13 years 6 months ago
A Method for Web Information Extraction
The World Wide Web has become one of the most important information repositories. However, information in web pages is free of standards in presentation, without being organized i...
Man I. Lam, Zhiguo Gong, Maybin K. Muyeba
AVI
2008
13 years 7 months ago
A haptic rendering engine of web pages for blind users
To overcome the shortcomings posed by audio rendering of web pages for blind users, this paper implements an interaction technique where web pages are parsed so as to automaticall...
Nikolaos Kaklanis, Juan Manuel González-Cal...
AAAI
2007
13 years 7 months ago
Mining Web Query Hierarchies from Clickthrough Data
In this paper, we propose to mine query hierarchies from clickthrough data, which is within the larger area of automatic acquisition of knowledge from the Web. When a user submits...
Dou Shen, Min Qin, Weizhu Chen, Qiang Yang, Zheng ...
VLDB
2000
ACM
13 years 8 months ago
The Evolution of the Web and Implications for an Incremental Crawler
In this paper we study how to build an effective incremental crawler. The crawler selectively and incrementally updates its index and/or local collection of web pages, instead of ...
Junghoo Cho, Hector Garcia-Molina
CBMS
2001
IEEE
13 years 8 months ago
Web Page Downloading and Classification
This paper describes the processes of downloading and classifying Web-based articles in online medical journals as a preliminary step to extracting bibliographic data to populate ...
Loc Q. Tran, Chan W. Moon, Daniel X. Le, George R....
DEXA
2006
Springer
13 years 8 months ago
Personalized Detection of Fresh Content and Temporal Annotation for Improved Page Revisiting
Page revisiting is a popular browsing activity on the Web. In this paper we describe a method for improving page revisiting by detecting and highlighting the information ...
Adam Jatowt, Yukiko Kawai, Katsumi Tanaka
AUSDM
2006
Springer
13 years 8 months ago
Extraction of Flat and Nested Data Records from Web Pages
This paper studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources ...
Siddu P. Algur, P. S. Hiremath
APWEB
2006
Springer
13 years 8 months ago
LocalRank: A Prototype for Ranking Web Pages with Database Considering Geographical Locality
In this demo, we present a method called LocalRank to rank web pages. Our method integrates the web and a local user database with semantic links including geographical ones. We fi...
Jianwei Zhang 0002, Yoshiharu Ishikawa, Sayumi Kur...