While Named Entity extraction is useful in many natural language applications, the coarse categories that most NE extractors work with prove insufficient for complex applications ...
We describe a discovery program, called UNIVAUTO (UNIVersals AUthoring TOol), whose domain of application is the study of language universals, a classic trend in contemporary lingu...
This paper investigates methods to automatically infer structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem ...
This paper presents a supervised learning system for word sense disambiguation, based on conditional maximum entropy models. This system acquires the linguistic knowl...
In some contexts, well-formed natural language cannot be expected as input to information or communication systems. In these contexts, the use of grammar-independent input (sequen...