Web-based applications are among the most widely used types of software and have become the backbone of many e-commerce and communications businesses. These applications are ofte...
Kinga Dobolyi, Elizabeth Soechting, Westley Weimer
Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches that take advantage of the millions...
Record linkage is the problem of identifying similar records across different data sources. The similarity between two records is defined based on domain-specific similarity functi...
Recent years have seen growing interest in effective algorithms for summarizing and querying massive, high-speed data streams. Randomized sketch synopses provide accurate approxima...
Graham Cormode, Minos N. Garofalakis, Dimitris Sac...
Response time delays caused by I/O are a major problem in many systems and database applications. Prefetching and cache replacement methods are attracting renewed attention because...