Sciweavers

1950 search results - page 176 / 390
» Informative sampling for large unbalanced data sets
BMCBI
2010
15 years 3 months ago
Bayesian integrated modeling of expression data: a case study on RhoG
Background: DNA microarrays provide an efficient method for measuring activity of genes in parallel and even covering all the known transcripts of an organism on a single array. T...
Rashi Gupta, Dario Greco, Petri Auvinen, Elja Arja...
APPROX
2008
Springer
15 years 5 months ago
Streaming Algorithms for k-Center Clustering with Outliers and with Anonymity
Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a cluster...
Richard Matthew McCutchen, Samir Khuller
NIPS
2004
15 years 5 months ago
Using Random Forests in the Structured Language Model
In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), which uses rich syntactic information in predicting the next word based on words ...
Peng Xu, Frederick Jelinek
CIKM
2000
Springer
15 years 8 months ago
Scalable association-based text classification
The Naïve Bayes (NB) classifier has long been considered a core methodology in text classification, mainly due to its simplicity and computational efficiency. There is an increasing n...
Dimitris Meretakis, Dimitris Fragoudis, Hongjun Lu...
SIGIR
2008
ACM
15 years 3 months ago
A user browsing model to predict search engine click data from past observations
Search engine click logs provide an invaluable source of relevance information but this information is biased because we ignore which documents from the result list the users have...
Georges Dupret, Benjamin Piwowarski