We present in this paper a comparison between three segmentation systems for the Vietnamese language. Indeed, the majority of Vietnamese words is built by semantic composition fro...
Quang Thang Dinh, Hong Phuong Le, Thi Minh Huyen N...
An approach for segmenting handwritten text in a Pre-Hospital Care Report (PCR) is presented. Segmentation of lines and words in a PCR is extremely challenging due to the nature o...
Word segmentation is the first and obligatory task for every NLP. For inflectional languages like English, French, Dutch,.. their word boundaries are simply assumed to be whitespa...
In this paper we present a method for detecting the text genre quickly and easily following an approach originally proposed in authorship attribution studies which uses as style m...
Efstathios Stamatatos, Nikos Fakotakis, George K. ...
We propose a simple two-level hierarchical probability model for unsupervised word segmentation. By treating words as strings composed of morphemes/phonemes which are themselves c...