Sciweavers

1013 search results - page 128 / 203
Data Mining in the Bioinformatics Domain
KDD
2006
ACM
16 years 7 days ago
Outlier detection by sampling with accuracy guarantees
An effective approach to detect anomalous points in a data set is distance-based outlier detection. This paper describes a simple sampling algorithm to efficiently detect distance...
Mingxi Wu, Chris Jermaine
DMKD
2004
ACM
15 years 5 months ago
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
KDD
2009
ACM
16 years 11 days ago
An association analysis approach to biclustering
The discovery of biclusters, which denote groups of items that show coherent values across a subset of all the transactions in a data set, is an important type of analysis perform...
Gaurav Pandey, Gowtham Atluri, Michael Steinbach, ...
SIGIR
2008
ACM
14 years 11 months ago
Topic-bridged PLSA for cross-domain text classification
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
Gui-Rong Xue, Wenyuan Dai, Qiang Yang, Yong Yu
BMCBI
2005
14 years 11 months ago
An analysis of extensible modelling for functional genomics data
Background: Several data formats have been developed for large scale biological experiments, using a variety of methodologies. Most data formats contain a mechanism for allowing e...
Andrew R. Jones, Norman W. Paton