Buffering of intermediate results in dataflow diagrams can significantly reduce latency when a user browses these results or re-executes a diagram with slightly different inputs. ...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
Search engine logs are an emerging new type of data that offers interesting opportunities for data mining. Existing work on mining such data has mostly attempted to discover knowl...
Finding biological entities (such as genes or proteins) that satisfy certain conditions from texts is an important and challenging task in biomedical information retrieval and tex...
We propose a dynamic faceted search system for discoverydriven analysis on data with both textual content and structured attributes. From a keyword query, we want to dynamically s...
Debabrata Dash, Jun Rao, Nimrod Megiddo, Anastasia...