This paper explores topic aspect (i.e., subtopic or facet) classification for English and Chinese collections. The evaluation model assumes a bilingual user who has found document...
Most topic models, such as latent Dirichlet allocation, rely on the bag-of-words assumption. However, word order and phrases are often critical to capturing the meaning of text in...
Web pages, like people, are often known by others in a variety of contexts. When those contexts are sufficiently distinct, a page's importance may be better represented by mu...
In traditional Information Retrieval (IR), user profiles are often represented by keyword or concept space vectors or by some predefined categories. Unfortunately, this data is often...
Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method of disambiguating surface forms an...
Yiping Zhou, Lan Nie, Omid Rouhani-Kalleh, Flavian...