The rapid popularization of digital cameras and mobile phone cameras has lead to an explosive growth of consumer photo collections. In this paper, we present a (quasi) real-time t...
Background: High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This sc...
Recent research in domain-independent information extraction holds the promise of an automatically-constructed structured database derived from the Web. A query system based on th...
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
Hypothesis generation is a crucial initial step for making scientific discoveries. This paper addresses the problem of automatically discovering interesting hypotheses from the we...