KDD
2004
ACM
1-dimensional splines as building blocks for improving accuracy of risk outcomes models
Transformation of both the response variable and the predictors is commonly used in fitting regression models. However, these transformation methods do not always provide the maxi...
David S. Vogel, Morgan C. Wang
Rotation invariant distance measures for trajectories
For the discovery of similar patterns in 1D time-series, it is very typical to perform a normalization of the data (for example a transformation so that the data follow a zero mea...
Michail Vlachos, Dimitrios Gunopulos, Gautam Das
Learning a complex metabolomic dataset using random forests and support vector machines
Metabolomics is the omics science of biochemistry. The associated data include the quantitative measurements of all small molecule metabolites in a biological sample. These datase...
Young Truong, Xiaodong Lin, Chris Beecher
A generative probabilistic approach to visualizing sets of symbolic sequences
There is a notable interest in extending probabilistic generative modeling principles to accommodate more complex structured data types. In this paper we develop a generative ...
Peter Tiño, Ata Kabán, Yi Sun
Ordering patterns by combining opinions from multiple sources
Pattern ordering is an important task in data mining because the number of patterns extracted by standard data mining algorithms often exceeds our capacity to manually analyze the...
Pang-Ning Tan, Rong Jin
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
Generalizing the notion of support
The goal of this paper is to show that generalizing the notion of support can be useful in extending association analysis to non-traditional types of patterns and non-binary data....
Michael Steinbach, Pang-Ning Tan, Hui Xiong, Vipin...
Support envelopes: a technique for exploring the structure of association patterns
This paper introduces support envelopes--a new tool for analyzing association patterns--and illustrates some of their properties, applications, and possible extensions. Specifical...
Michael Steinbach, Pang-Ning Tan, Vipin Kumar
A Bayesian network framework for reject inference
Andrew T. Smith, Charles Elkan
Selection, combination, and evaluation of effective software sensors for detecting abnormal computer usage
We present and empirically analyze a machine-learning approach for detecting intrusions on individual computers. Our Winnow-based algorithm continually monitors user and system beh...
Jude W. Shavlik, Mark Shavlik