ICDM
2002
IEEE
Phrase-based Document Similarity Based on an Index Graph Model
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
Khaled M. Hammouda, Mohamed S. Kamel
An Algebraic Approach to Data Mining: Some Examples
In this paper, we introduce an algebraic approach to the foundations of data mining. Our approach is based upon two algebras of functions defined over a common state space X and a ...
Robert L. Grossman, Richard G. Larson
Learning from Order Examples
We advocate a new learning task that deals with orders of items, and we call this the Learning from Order Examples (LOE) task. The aim of the task is to acquire the rule that is u...
Toshihiro Kamishima, Shotaro Akaho
Using Text Mining to Infer Semantic Attributes for Retail Data Mining
Current Data Mining techniques usually do not have a mechanism to automatically infer semantic features inherent in the data being “mined”. The semantics are either injected i...
Rayid Ghani, Andrew E. Fano
Progressive Modeling
Presently, inductive learning is still performed in a frustrating batch process. The user has little interaction with the system and no control over the final accuracy and traini...
Wei Fan, Haixun Wang, Philip S. Yu, Shaw-hwa Lo, S...
Mining Molecular Fragments: Finding Relevant Substructures of Molecules
We present an algorithm to find fragments in a set of molecules that help to discriminate between different classes of, for instance, activity in a drug discovery context. Instea...
Christian Borgelt, Michael R. Berthold
High Performance Data Mining Using the Nearest Neighbor Join
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...
Christian Böhm, Florian Krebs
Adaptive dimension reduction for clustering high dimensional data
It is well known that for high dimensional data clustering, standard algorithms such as EM and K-means are often trapped in a local minimum. Many initialization methods were pro...
Chris H. Q. Ding, Xiaofeng He, Hongyuan Zha, Horst...
Iterative Clustering of High Dimensional Text Data Augmented by Local Search
The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means ca...
Inderjit S. Dhillon, Yuqiang Guan, J. Kogan
Neighborgram Clustering: Interactive Exploration of Cluster Neighborhoods
Proceedings of IEEE Data Mining, IEEE Press, pp. 581-584, 2002. We describe an interactive way to generate a set of clusters for a given data set. The clustering is done by constr...
Michael R. Berthold, Bernd Wiswedel, David E. Patt...