Sciweavers

571 search results - page 64 / 115
» Testing homogeneity of a large data set by bootstrapping
KDD
2005
ACM
118 views, Data Mining
16 years 7 days ago
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process for generating the corpus and utilize a novel,...
Mark Sandler
ICDT
2003
ACM
106 views, Database
15 years 5 months ago
Processing XML Streams with Deterministic Automata
We consider the problem of evaluating a large number of XPath expressions on an XML stream. Our main contribution consists in showing that Deterministic Finite Automata (DFA) can b...
Todd J. Green, Gerome Miklau, Makoto Onizuka, Dan ...
JCB
2008
94 views
14 years 11 months ago
Prioritize and Select SNPs for Association Studies with Multi-Stage Designs
Large-scale whole genome association studies are increasingly common, due in large part to recent advances in genotyping technology. With this change in paradigm for genetic studi...
Jing Li
ECIR
2004
Springer
15 years 1 month ago
Performance Analysis of Distributed Architectures to Index One Terabyte of Text
We simulate different architectures of a distributed Information Retrieval system on a very large Web collection, in order to work out the optimal setting for a particular set of r...
Fidel Cacheda, Vassilis Plachouras, Iadh Ounis
EDBT
2009
ACM
132 views, Database
15 years 6 months ago
A novel approach for efficient supergraph query processing on graph databases
In recent years, large amounts of data modeled by graphs, namely graph data, have been collected in various domains. Efficiently processing queries on graph databases has attracted...
Shuo Zhang, Jianzhong Li, Hong Gao, Zhaonian Zou