Sciweavers

EMNLP
2004

Applying Conditional Random Fields to Japanese Morphological Analysis

13 years 5 months ago
Applying Conditional Random Fields to Japanese Morphological Analysis
This paper presents Japanese morphological analysis based on conditional random fields (CRFs). Previous work in CRFs assumed that observation sequence (word) boundaries were fixed. However, word boundaries are not clear in Japanese, and hence a straightforward application of CRFs is not possible. We show how CRFs can be applied to situations where word boundary ambiguity exists. CRFs offer a solution to the long-standing problems in corpus-based or statistical Japanese morphological analysis. First, flexible feature designs for hierarchical tagsets become possible. Second, influences of label and length bias are minimized. We experiment CRFs on the standard testbed corpus used for Japanese morphological analysis, and evaluate our results using the same experimental dataset as the HMMs and MEMMs previously reported in this task. Our results confirm that CRFs not only solve the long-standing problems but also improve the performance over HMMs and MEMMs.
Taku Kudo, Kaoru Yamamoto, Yuji Matsumoto
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where EMNLP
Authors Taku Kudo, Kaoru Yamamoto, Yuji Matsumoto
Comments (0)