This paper describes a novel Bayesian approach to unsupervised topic segmentation. Unsupervised systems for this task are driven by lexical cohesion: the tendency of well-formed se...
In this paper, we report our experiments in the TREC-2003 Web Track. We submitted five runs for the topic distillation task. Our goal was to evaluate the standard language modeli...
With the proliferation of user-generated articles on the web, it becomes imperative to develop automated methods that are aware of the ideological bias implicit in a document co...
Both Topic Maps and RDF are popular semantic web standards designed for machine processing of web documents. Since these representations were originally created for different purpo...
This paper studies the problem of discovering and comparing geographical topics from GPS-associated documents. GPS-associated documents have become popular with the pervasiveness of loc...