ACL
2012

Unsupervized Word Segmentation: the Case for Mandarin Chinese

In this paper, we present an unsupervized segmentation system tested on Mandarin Chinese. Following Harris's Hypothesis in Kempe (1999) and Tanaka-Ishii's (2005) reformulation, we base our work on the Variation of Branching Entropy. We improve on Jin and Tanaka-Ishii (2006) by adding normalization and Viterbi decoding. This enables us to remove most of the thresholds and parameters from their model and to reach near state-of-the-art results (Wang et al., 2011) with a simpler system. We provide evaluation on different corpora available from the Segmentation Bake-off II (Emerson, 2005) and define a more precise topline for the task using cross-trained supervized systems available off-the-shelf (Zhang and Clark, 2010; Zhao and Kit, 2008; Huang and Zhao, 2007).
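
The abstract names the Variation of Branching Entropy as the signal the system is built on. As a rough illustration only, the sketch below computes the entropy of the character that follows each character n-gram in a toy corpus, and the change in that entropy when an n-gram is extended by one character. It is not the authors' implementation: the function names, the `max_n` parameter, and the choice to ignore sentence-final n-grams are assumptions made for this example, and the normalization and Viterbi decoding steps mentioned in the abstract are left out.

```python
import math
from collections import Counter, defaultdict


def right_branching_entropy(corpus, max_n=3):
    """Map each character n-gram (length 1..max_n) to the entropy, in bits,
    of the character that follows it in the corpus.

    Hypothetical helper for illustration; n-grams that only occur at the
    end of a sentence have no follower and are simply not recorded.
    """
    followers = defaultdict(Counter)
    for sentence in corpus:
        for n in range(1, max_n + 1):
            for i in range(len(sentence) - n):
                followers[sentence[i:i + n]][sentence[i + n]] += 1

    entropy = {}
    for gram, counts in followers.items():
        total = sum(counts.values())
        entropy[gram] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropy


def entropy_variation(entropy, gram):
    """Entropy after `gram` minus entropy after `gram` without its last character."""
    if len(gram) < 2:
        raise ValueError("need an n-gram of at least two characters")
    return entropy[gram] - entropy[gram[:-1]]


toy_corpus = ["abc", "abd", "abc"]
h = right_branching_entropy(toy_corpus)
delta = entropy_variation(h, "ab")
```

In this toy corpus "a" is always followed by "b", so the entropy after "a" is zero, while "ab" is followed by either "c" or "d", so the entropy rises when "a" is extended to "ab". Under Harris's hypothesis, a rise of this kind is read as a hint that a boundary may follow the n-gram.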

Pierre Magistry, Benoît Sagot

ACL 2012 | Computational Linguistics | Ishii | Mandarin Chinese | Segmentation System

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Journal
Year	2012
Where	ACL
Authors	Pierre Magistry, Benoît Sagot