In this paper, we adapt a statistical learning approach, inspired by automated topic segmentation techniques in speech-recognized documents, to the challenging protein segmentation ...
Betty Yee Man Cheng, Jaime G. Carbonell, Judith Kl...
Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
Many web documents refer to specific geographic localities and many people include geographic context in queries to web search engines. Standard web search engines treat the geogra...
Subodh Vaid, Christopher B. Jones, Hideo Joho, Mar...
In order to solve problems of reliability of systems based on lexical repetition and problems of adaptability of language-dependent systems, we present a context-based topic segmen...
We propose an anaphor-resolution-based opinion holder identification method exploiting lexical and syntactic information. We tested our approach on online news documents and obtai...