Life sciences researchers perform scientific literature searches as part of their daily activities. Many such searches are executed against PubMed, a central repository of life sci...
Julia Stoyanovich, Mayur Lodha, William Mee, Kenne...
Modern enterprise, web, and multimedia applications are generating unstructured content at unforeseen volumes in the form of documents, texts, and media files. Such content is gen...
Krishna Kunchithapadam, Wei Zhang, Amit Ganesh, Ni...
Today’s one-pass analytics applications tend to be data-intensive in nature and require the ability to process high volumes of data efficiently. MapReduce is a popular programm...
Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGreg...
Central to a data cleaning system are record matching and data repairing. Matching aims to identify tuples that refer to the same real-world object, and repairing is to make a dat...
Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Weny...
Multitenant data infrastructures for large cloud platforms hosting hundreds of thousands of applications face the challenge of serving applications characterized by small data foo...
Aaron J. Elmore, Sudipto Das, Divyakant Agrawal, A...