Sciweavers

311 search results - page 42 / 63
» Cleaning Web Pages for Effective Web Content Mining
VLDB
1999
ACM
Distributed Hypertext Resource Discovery Through Examples
We describe the architecture of a hypertext resource discovery system using a relational database. Such a system can answer questions that combine page contents, metadata, and hyp...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
KDD
2006
ACM
Adaptive Website Design Using Caching Algorithms
Visitors enter a website through a variety of means, including web searches, links from other sites, and personal bookmarks. In some cases the first page loaded satisfies the visi...
Justin Brickell, Inderjit S. Dhillon, Dharmendra S...
CN
1999
Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
WWW
2009
ACM
Mining multilingual topics from Wikipedia
In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages...
Xiaochuan Ni, Jian-Tao Sun, Jian Hu, Zheng Chen
HT
2010
ACM
Assessing users' interactions for clustering web documents: a pragmatic approach
In this paper we are interested in describing Web pages by how users interact within their contents. Thus, an alternate but complementary way of labelling and classifying Web docu...
Luis A. Leiva, Enrique Vidal