The ongoing revolution in life sciences research is producing vast amounts of genetic and proteomic sequence data. Scientists want to pose increasingly complex queries on this dat...
Sandeep Tata, Jignesh M. Patel, James S. Friedman,...
Clustering time series is a problem that has applications in a wide variety of fields, and has recently attracted a large amount of research. In this paper we focus on clustering...
The problem of privacy-preserving data mining has been studied extensively in recent years because of the growing amount of personal information that is available to corporation...
Background: Integrating data from multiple global assays and curated databases is essential to understand the spatiotemporal interactions within cells. Different experiments measu...
Yuji Zhang, Jianhua Xuan, Benildo de los Reyes, Ro...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...