KDD
2007
ACM
Efficient and effective explanation of change in hierarchical summaries
Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measu...
Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos...
KDD
2007
ACM
XProj: a framework for projected structural clustering of XML documents
XML has become a popular method of data representation both on the web and in databases in recent years. One of the reasons for the popularity of XML has been its ability to encod...
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua F...
KDD
2007
ACM
Mining favorable facets
The importance of dominance and skyline analysis has been well recognized in multi-criteria decision making applications. Most previous studies assume a fixed order on the attribu...
Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, ...
KDD
2007
ACM
Joint optimization of wrapper generation and template detection
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen, Di Wu
KDD
2007
ACM
Mining templates from search result records of search engines
Metasearch engine, comparison-shopping and Deep Web crawling applications need to extract search result records enwrapped in result pages returned from search engines in response ...
Hongkun Zhao, Weiyi Meng, Clement T. Yu
KDD
2007
ACM
Truth discovery with multiple conflicting information providers on the web
The world-wide web has become the most important information source for most of us. Unfortunately, there is no guarantee for the correctness of information on the web. Moreover, d...
Xiaoxin Yin, Jiawei Han, Philip S. Yu
KDD
2007
ACM
Generalized component analysis for text with heterogeneous attributes
We present a class of richly structured, undirected hidden variable models suitable for simultaneously modeling text along with other attributes encoded in different modalities. O...
Xuerui Wang, Chris Pal, Andrew McCallum
KDD
2007
ACM
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Shubin Zhao, Jonathan Betz
KDD
2007
ACM
Scalable look-ahead linear regression trees
Most decision tree algorithms base their splitting decisions on a piecewise constant model. Often these splitting algorithms are extrapolated to trees with non-constant models at ...
David S. Vogel, Ognian Asparouhov, Tobias Scheffer