This paper describes Movie-DiC a Movie Dialogue Corpus recently collected for research and development purposes. The collected dataset comprises 132,229 dialogues containing a tot...
We describe the use of a hierarchical topic model for automatically identifying syntactic and lexical patterns that explicitly state ontological relations. We leverage distant sup...
We introduce a spectral learning algorithm for latent-variable PCFGs (Petrov et al., 2006). Under a separability (singular value) condition, we prove that the method provides cons...
Shay B. Cohen, Karl Stratos, Michael Collins, Dean...
We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. Based on an extension of the incremental joint model for POS tagging and ...
Jun Hatori, Takuya Matsuzaki, Yusuke Miyao, Jun-ic...
We present a novel approach for building verb subcategorization lexicons using a simple graphical model. In contrast to previous methods, we show how the model can be trained with...
Thomas Lippincott, Anna Korhonen, Diarmuid Ó...
Although much work on relation extraction has aimed at obtaining static facts, many of the target relations are actually fluents, as their validity is naturally anchored to a cer...
In this paper we introduce the novel task of “word epoch disambiguation,” defined as the problem of identifying changes in word usage over time. Through experiments run using...
Syntactic analysis of search queries is important for a variety of information-retrieval tasks; however, the lack of annotated data makes training query analysis models difficult...
Kuzman Ganchev, Keith Hall, Ryan T. McDonald, Slav...
Most information extraction (IE) systems identify facts that are explicitly stated in text. However, in natural language, some facts are implicit, and identifying them requires ...