Processing and extracting meaningful knowledge from count data is an important problem in data mining. The volume of data is increasing dramatically as the data is generated by da...
We investigate and compare two forms of recursion on sets for querying nested collections. The first one is called sri and it corresponds to sequential processing of data. The second...
Tweets are the most up-to-date and inclusive stream of information and commentary on current events, but they are also fragmented and noisy, motivating the need for systems that c...
Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scal...
In this demo, we will present Tiresias, the first how-to query engine. How-to queries represent fundamental data analysis questions of the form: “How should the input change in...