Latent Dirichlet allocation is a fully generative statistical language model that has proven successful at capturing both the content and the topics of a corpus of docum...
The explosion in the amount of free text materials on the Internet, and the use of this information by people from all walks of life, has made the issue of generalized information...
The problem of resolving lexical ambiguity, commonly referred to as Word Sense Disambiguation (WSD), seems to be stuck because of the knowledge acquisition bottle...
Research on linear text segmentation has been an ongoing focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarizati...
Jingbo Zhu, Na Ye, Xinzhi Chang, Wenliang Chen, Be...
XML is an emerging standard for data representation and exchange on the World-Wide Web. Due to the nature of information on the Web and the inherent flexibility of XML, we expect...