Sciweavers

KDD
2008
ACM
Constraint programming for itemset mining
The relationship between constraint-based mining and constraint programming is explored by showing how the typical constraints used in pattern mining can be formulated for use in ...
Luc De Raedt, Tias Guns, Siegfried Nijssen
KDD
2008
ACM
Angle-based outlier detection in high-dimensional data
Detecting outliers in a large set of data objects is a major data mining task aiming at finding different mechanisms responsible for different groups of objects in a data set. All...
Hans-Peter Kriegel, Matthias Schubert, Arthur Zime...
KDD
2008
ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2008
ACM
Semi-supervised approach to rapid and reliable labeling of large data sets
Supervised classification methods have been shown to be very effective for a large number of applications. They require a training data set whose instances are labeled to indicate...
György J. Simon, Vipin Kumar, Zhi-Li Zhang
KDD
2008
ACM
A sequential dual method for large scale multi-class linear SVMs
Efficient training of direct multi-class formulations of linear Support Vector Machines is very useful in applications such as text classification with a huge number of examples as w...
S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang...
KDD
2008
ACM
Partial least squares regression for graph mining
Attributed graphs are increasingly common in many application domains such as chemistry, biology and text processing. A central issue in graph mining is how to collect inform...
Hiroto Saigo, Koji Tsuda, Nicole Krämer
KDD
2008
ACM
Anticipating annotations and emerging trends in biomedical literature
The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therap...
Bernd Wachmann, Dmitriy Fradkin, Fabian Mörch...
KDD
2008
ACM
Get another label? Improving data quality and data mining using multiple, noisy labelers
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated la...
Victor S. Sheng, Foster J. Provost, Panagiotis G. ...
KDD
2008
ACM
ArnetMiner: extraction and mining of academic social networks
This paper addresses several key issues in the ArnetMiner system, which aims at extracting and mining academic social networks. Specifically, the system focuses on: 1) Extracting ...
Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zha...
KDD
2008
ACM
Context-aware query suggestion by mining click-through and session data
Query suggestion plays an important role in improving the usability of search engines. Although some recently proposed methods can make meaningful query suggestions by mining quer...
Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen L...