
NAACL
1994

Japanese Word Segmentation by Hidden Markov Model

Processing Japanese text is complicated by the absence of word delimiters. To segment Japanese text, systems typically rely on knowledge-based methods and large lexicons. This paper presents a novel approach to Japanese word segmentation that avoids the need for Japanese word lexicons and explicit rule bases. The algorithm uses a hidden Markov model, a stochastic process, to determine word boundaries. The method achieved 91% accuracy in segmenting words in a test corpus.
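The abstract does not describe the model's states or how its parameters are estimated, so the following is only a minimal sketch of how a hidden Markov model can mark word boundaries without a word lexicon. It assumes a two-state model over characters (B: the character starts a word, I: the character continues a word) decoded with the Viterbi algorithm. The names `viterbi`, `segment`, `START`, `TRANS`, and `EMIT`, as well as all probability values, are hypothetical, hand-set for illustration, and not taken from the paper.

```python
import math

STATES = ("B", "I")  # B: character starts a word, I: character continues a word

# Toy, hand-set parameters (assumptions for illustration, not from the paper).
START = {"B": 1.0, "I": 0.0}
TRANS = {"B": {"B": 0.5, "I": 0.5}, "I": {"B": 0.7, "I": 0.3}}
EMIT = {
    "B": {"私": 0.3, "は": 0.3, "学": 0.2, "で": 0.2},
    "I": {"生": 0.5, "す": 0.5},
}


def viterbi(chars, start, trans, emit, floor=1e-6):
    """Return the most likely B/I tag sequence for chars under the given HMM."""
    def lp(p):
        # Log-probability with a floor so unseen events do not produce log(0).
        return math.log(max(p, floor))

    if not chars:
        return []
    # score[s] holds the best log-probability of any path ending in state s.
    score = {s: lp(start[s]) + lp(emit[s].get(chars[0], 0.0)) for s in STATES}
    back = []  # one dict of back-pointers per character after the first
    for ch in chars[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + lp(trans[p][s]))
            new_score[s] = score[prev] + lp(trans[prev][s]) + lp(emit[s].get(ch, 0.0))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    tags = [state]
    for pointers in reversed(back):
        state = pointers[state]
        tags.append(state)
    tags.reverse()
    return tags


def segment(text, start=START, trans=TRANS, emit=EMIT):
    """Split text into words by opening a new word at every B tag."""
    tags = viterbi(list(text), start, trans, emit)
    words = []
    for ch, tag in zip(text, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


print(segment("私は学生です"))
```

With these toy parameters the decoder splits the sentence into 私, は, 学生, and です. In practice the start, transition, and emission probabilities would be estimated from a corpus rather than set by hand.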
Constantine Papageorgiou
Added 02 Nov 2010
Updated 02 Nov 2010
Type Conference
Year 1994
Where NAACL
Authors Constantine Papageorgiou