The problem addressed in this paper is to segment a given multilingual document into segments for each language and then identify the language of each segment. The problem was mot...
We present two methods for unsupervised segmentation of words into morphemelike units. The model utilized is especially suited for languages with a rich morphology, such as Finnis...
Sequence data are abundant in application areas such as computational biology, environmental sciences, and telecommunications. Many real-life sequences have a strong segmental str...
In this paper we introduce a dynamic programming algorithm to perform linear text segmentation by global minimization of a segmentation cost function which consists of: (a) within...
In this paper, we propose a novel unsupervised approach to query segmentation, an important task in Web search. We use a generative query model to recover a query's underlyin...