The performance of n-gram language models depends largely on the amount of training text available for building the models and the degree to which this text matches...
Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A de...
In this paper, we describe a new way to access information by “chatting” with an information source. This involves a chatbot, a program that emulates human conversation...
We introduce a new set of tools for working with web-scale N-gram data. These tools lower the barrier for working with web-scale text, and create a new platform for acquiring larg...
Dekang Lin, Kenneth Ward Church, Heng Ji, Satoshi ...
In this paper, we present a new approach in Natural Language Processing (NLP) aimed at knowledge representation and acquisition using a formal syntactic frame. In practice, we...