This paper describes a method for asking statistical questions about a large text corpus. We exemplify the method by addressing the question, "What percentage of Federal Regi...
Many researchers have attempted to predict the Enron corporate hierarchy from the data. This work, however, has been hampered by a lack of data. We present a new, large, and freel...
This paper describes a search tool for a huge set of ngrams. The tool supports queries with an arbitrary number of wildcards. It takes a fraction of a second for a sear...
Today, management and tuning questions are approached using if...then... rules of thumb. This reactive approach requires expertise regarding system behavior, making it difficu...
Eno Thereska, Dushyanth Narayanan, Gregory R. Gang...
Case frames are an important knowledge base for a variety of natural language processing (NLP) systems. For the practical use of these systems in the real world, wide-coverage cas...