Sciweavers

WWW
2005
ACM
14 years 5 months ago
Mining web site's topic hierarchy
Searching and navigating a Web site is a tedious task and the hierarchical models, such as site maps, are frequently used for organizing the Web site's content. In this work,...
Nan Liu, C. Yang
WWW
2005
ACM
14 years 5 months ago
Web data cleansing for information retrieval using key resource page selection
With the page explosion of WWW, how to cover more useful information with limited storage and computation resources becomes more and more important in web IR research. Using web p...
Yiqun Liu, Canhui Wang, Min Zhang, Shaoping Ma
WWW
2005
ACM
14 years 5 months ago
Site abstraction for rare category classification in large-scale web directory
Tie-Yan Liu, Hao Wan, Tao Qin, Zheng Chen, Yong Re...
WWW
2005
ACM
14 years 5 months ago
Using visual cues for extraction of tabular data from arbitrary HTML documents
We describe a method to extract tabular data from web pages. Rather than just analyzing the DOM tree, we also exploit visual cues in the rendered version of the document to extrac...
Bernhard Krüpl, Marcus Herzog, Wolfgang Gatte...
WWW
2005
ACM
14 years 5 months ago
Clustering for probabilistic model estimation for CF
Based on the type of collaborative objects, a collaborative filtering (CF) system falls into one of two categories: item-based CF and user-based CF. Clustering is the basic idea i...
Qing Li, Byeong Man Kim, Sung-Hyon Myaeng
WWW
2005
ACM
14 years 5 months ago
Hubble: an advanced dynamic folder system for XML
Organizing large document collections for finding information easily and quickly has always been an important user requirement. This paper describes a flexible and powerful dynami...
Ning Li, Joshua Hui, Hui-I Hsiao, Kevin S. Beyer
WWW
2005
ACM
14 years 5 months ago
Consistency checking of UML model diagrams using the XML semantics approach
A software design is often modeled as a collection of Unified Modeling Language (UML) diagrams. There are different aspects of the software system that are covered by many differe...
Yasser Kotb, Takuya Katayama
WWW
2005
ACM
14 years 5 months ago
Focused crawling by exploiting anchor text using decision tree
Focused crawlers are considered as a promising way to tackle the scalability problem of topic-oriented or personalized search engines. To design a focused crawler, the choice of s...
Jun Li, Kazutaka Furuse, Kazunori Yamaguchi
WWW
2005
ACM
14 years 5 months ago
SLL: running my web services on your WS platforms
Today, the choice for a particular programming language limits the alternative products that can be used to deploy the program. The purpose of this work is to break the strong tie...
Donald Kossmann, Christian Reichel
WWW
2005
ACM
14 years 5 months ago
Enhancing the privacy of web-based communication
A profiling adversary is an adversary whose goal is to classify a population of users into categories according to messages they exchange. This adversary models the most common pr...
Aleksandra Korolova, Ayman Farahat, Philippe Golle