— Monitoring the traffic in high-speed networks is a data intensive problem. Uniform packet sampling is the most popular technique for reducing the amount of data the network mo...
The goal of distributed information retrieval is to support effective searching over multiple document collections. For efficiency, queries should be routed to only those collectio...
: The generic problem of estimation and inference given a sequence of i.i.d. samples has been extensively studied in the statistics, property testing, and learning communities. A n...
We present the first PAC bounds for learning parameters of Conditional Random Fields [12] with general structures over discrete and real-valued variables. Our bounds apply to com...
Background: In genetic transcription research, gene expression is typically reported in a test sample relative to a reference sample. Laboratory assays that measure gene expressio...