Sciweavers

» General Database Statistics Using Entropy Maximization
KDD
2006
ACM
Data Mining
15 years 10 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
SAC
2009
ACM
15 years 4 months ago
Applying latent dirichlet allocation to group discovery in large graphs
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (suc...
Keith Henderson, Tina Eliassi-Rad
SIGMOD
2001
ACM
Database
15 years 9 months ago
Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data
Approximating the joint data distribution of a multi-dimensional data set through a compact and accurate histogram synopsis is a fundamental problem arising in numerous practical ...
Amol Deshpande, Minos N. Garofalakis, Rajeev Rasto...
KDD
2000
ACM
Data Mining
15 years 1 months ago
Incremental quantile estimation for massive tracking
Data--call records, internet packet headers, or other transaction records--are coming down a pipe at a ferocious rate, and we need to monitor statistics of the data. There is no r...
Fei Chen, Diane Lambert, José C. Pinheiro
SIGSOFT
2010
ACM
14 years 7 months ago
The missing links: bugs and bug-fix commits
Empirical studies of software defects rely on links between bug databases and program code repositories. This linkage is typically based on bug-fixes identified in developer-enter...
Adrian Bachmann, Christian Bird, Foyzur Rahman, Pr...