Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have const...
Yang Liu, Nitesh V. Chawla, Mary P. Harper, Elizab...
Machine learning techniques such as tree induction have become accepted tools for developing generalisations of large data sets, typically for use with production rule systems in p...
In this paper we study the problem of collecting training samples for building enterprise taxonomies. We develop a computer-aided tool named InfoAnalyzer, which can effectively as...
The LinGO Redwoods initiative is a seed activity in the design and development of a new type of treebank. While several medium- to large-scale treebanks exist for English (and for...
Stephan Oepen, Kristina Toutanova, Stuart M. Shieb...
Background: Extracting Protein-Protein Interactions (PPI) from research papers is a way of translating information from English to the language used by the databases that store th...