Abstract: Clustering text data streams is an important issue in the data mining community and has a number of applications such as news group filtering, text crawling, document organiza...
Background: The development of high-throughput technologies such as yeast two-hybrid systems and mass spectrometry technologies has made it possible to generate large protein-prot...
Jianwen Fang, Ryan J. Haasl, Yinghua Dong, Gerald ...
Logistic Regression is a well-known classification method that has been used widely in many applications of data mining, machine learning, computer vision, and bioinformatics. Spa...
Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
This paper presents an interdisciplinary investigation of statistical information retrieval (IR) techniques for protein identification from tandem mass spectra, a challenging probl...