We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subseque...
Abstract. Spoken audio is an important source of information available to knowledge extraction and management systems. Organization of spoken messages by priority and content can f...
This paper deals with an acronym/definition extraction approach from textual data (corpora) and the disambiguation of these definitions (or expansions). Both steps of our global pr...
An investment of effort over the last two years has begun to produce a wealth of data concerning computational psycholinguistic models of syntax acquisition. The data is generated...
Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form...
Aditya G. Parameswaran, Hector Garcia-Molina, Anan...