Process-centered software engineering environments, such as Arcadia, impose a variety of requirements on database technology that to date have not been well supported by available...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Identifying peptides, which are short polymeric chains of amino acid residues in a protein sequence, is of fundamental importance in systems biology research. The most popular appr...
This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation...
Facebook recently deployed Facebook Messages, its first ever user-facing application built on the Apache Hadoop platform. Apache HBase is a database-like layer built on Hadoop des...
Dhruba Borthakur, Jonathan Gray, Joydeep Sen Sarma...