The tools used to analyze scientific data are often distinct from those used to archive, retrieve, and query data. A scientific workflow environment, however, allows one to seamles...
Chad Berkley, Shawn Bowers, Matthew B. Jones, Bert...
We address the fundamental question: what does it mean for data in a database to be of high quality? We motivate our discussion with examples, where traditional views on data quali...
Most datasets in real applications come from multiple sources. As a result, we often have attribute information about data objects and various pairwise relations (similarity) ...
We present a method to discover robust and interpretable sociolinguistic associations from raw geotagged text data. Using aggregate demographic statistics about the authors’ geo...
Background: Ensemble attribute profile clustering is a novel, text-based strategy for analyzing a user-defined list of genes and/or proteins. The strategy exploits annotation data ...
J. R. Semeiks, A. Rizki, Mina J. Bissell, I. Saira...