We study a novel shallow information extraction problem that involves extracting sentences belonging to a given set of topic categories from medical forum data. Given a corpus of medical fo...
We present a novel approach to parsing web search queries for the purpose of tagging them automatically. We define a set of probabilistic context-free rules, which genera...
We describe ParsCit, a freely available, open-source implementation of a reference string parsing package. At the core of ParsCit is a trained conditional random field (CRF) model...
Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treeba...
The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if referenc...