In this paper, we present a hybrid method for word segmentation and POS tagging. The target languages are those in which word boundaries are ambiguous, such as Chinese and Japanes...
This paper examines how one can obtain state of the art Chinese word segmentation using global linear models. We provide experimental comparisons that give a detailed road-map for ...
Abstract. We propose a new unsupervised training method for acquiring probability models that accurately segment Chinese character sequences into words. By constructing a core lexi...
We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary based approaches with cha...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...
We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entr...