Word Sense Disambiguation (WSD) often relies on a context model or vector constructed from the words that co-occur with the target word within the same text windows. In most cases...
Bernard Brosseau-Villeneuve, Jian-Yun Nie, Noriko ...
Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method of disambiguating surface forms an...
Yiping Zhou, Lan Nie, Omid Rouhani-Kalleh, Flavian...
— Consider a situation in which the transmission of encrypted message is intercepted by an adversary who can later ask the sender to reveal the random choices (and also the secre...
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian m...
Extracting sentences that contain important information from a document is a form of text summarization. The technique is the key to the automatic generation of summaries similar ...