Sciweavers

241 search results - page 1 / 49
» Copyright page
WWW
2007
ACM
14 years 5 months ago
EPCI: extracting potentially copyright infringement texts from the web
In this paper, we propose a new system extracting potentially copyright infringement texts from the Web, called EPCI. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
IJCAI
2003
13 years 6 months ago
Web Page Cleaning for Web Mining through Feature Weighting
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
Lan Yi, Bing Liu
AUSDM
2006
Springer
13 years 8 months ago
Extraction of Flat and Nested Data Records from Web Pages
This paper studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources ...
Siddu P. Algur, P. S. Hiremath
DEXA
2006
Springer
13 years 6 months ago
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve the accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife