Sciweavers

» Using Text Mining and Link Analysis for Software Mining
KDD
2004
ACM
15 years 10 months ago
A generalized maximum entropy approach to bregman co-clustering and matrix approximation
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic ...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...
MSV
2004
14 years 11 months ago
MABAC - Matrix Based Clustering Algorithm
Clustering is a prominent method in the data mining field. It is a discovery process that groups data such that intra-cluster similarity is maximized and the inter-cluster similar...
Yonghui Chen, Alan P. Sprague, Kevin D. Reilly
SIGIR
2009
ACM
15 years 4 months ago
Identifying the original contribution of a document via language modeling
One major goal of text mining is to provide automatic methods to help humans grasp the key ideas in ever-increasing text corpora. To this effect, we propose a statistica...
Benyah Shaparenko, Thorsten Joachims
WWW
2006
ACM
15 years 3 months ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
ADVIS
2004
Springer
15 years 3 months ago
Multiple Sets of Rules for Text Categorization
An important issue in text mining is how to make use of multiple pieces of knowledge discovered to improve future decisions. In this paper, we propose a new approach to combining mult...
Yaxin Bi, Terry J. Anderson, Sally I. McClean