The retrieval performance of an information retrieval system usually increases when it uses the relationships among the terms contained in a given document collection. However, th...
Events and stories can be characterized by a set of descriptive, collocated keywords. Intuitively, documents describing the same event will contain similar sets of keywords, and t...
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify te...
The latent topic model plays an important role in the unsupervised learning from a corpus, which provides a probabilistic interpretation of the corpus in terms of the latent topic...
Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not en...