Today it is possible to deploy sensor networks in the real world and collect large amounts of raw sensory data. However, it remains a major challenge to make sense of sensor data, ...
This paper presents a systematic approach to mine colocation patterns in Sloan Digital Sky Survey (SDSS) data. SDSS Data Release 5 (DR5) contains 3.6 TB of data. Availability of s...
This paper addresses the problem of discovering conversational group dynamics from nonverbal cues extracted from thin-slices of interaction. We first propose and analyze a novel t...
We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
A substantial subset of the web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents...