This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To...
Shafiq R. Joty, Giuseppe Carenini, Gabriel Murray,...
clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a ...
David M. Blei, Thomas L. Griffiths, Michael I. Jor...
In this paper, the task of text segmentation is approached from a topic modeling perspective. We investigate the use of the latent Dirichlet allocation (LDA) topic model to segment a ...
Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment...
Sharon Goldwater, Thomas L. Griffiths, Mark Johnso...
Time-series segmentation in the fully unsupervised scenario, in which the number of segment types is a priori unknown, is a fundamental problem in many applications. We propose a Ba...