Sciweavers

» A Statistical Clustering Model and Algorithm
KDD
2009
ACM
16 years 2 months ago
Mining social networks for personalized email prioritization
Email is one of the most prevalent communication tools today, and solving the email overload problem is pressingly urgent. A good way to alleviate email overload is to automatical...
Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon
KDD
2008
ACM
16 years 2 months ago
Automatic identification of quasi-experimental designs for discovering causal knowledge
Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large databases. Quasi-experimental designs (QEDs) exploi...
David D. Jensen, Andrew S. Fast, Brian J. Taylor, ...
KDD
2007
ACM
16 years 2 months ago
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
KDD
2005
ACM
16 years 2 months ago
Email data cleaning
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Jie Tang, Hang Li, Yunbo Cao, ZhaoHui Tang
KDD
2001
ACM
16 years 2 months ago
The "DGX" distribution for mining massive, skewed data
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...
Zhiqiang Bi, Christos Faloutsos, Flip Korn