Sciweavers

28 search results - page 3 / 6
» Training Global Linear Models for Chinese Word Segmentation
EMNLP
2010
13 years 3 months ago
Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping
Almost all Chinese language processing tasks involve word segmentation of the language input as their first steps, thus robust and reliable segmentation techniques are always requ...
Baobao Chang, Dongxu Han
TALIP
2002
13 years 5 months ago
Toward a unified approach to statistical language modeling for Chinese
This paper presents a unified approach to Chinese statistical language modeling (SLM). Applying SLM techniques like trigram language models to Chinese is challenging because (1) t...
Jianfeng Gao, Joshua Goodman, Mingjing Li, Kai-Fu ...
EMNLP
2004
13 years 6 months ago
A New Approach for English-Chinese Named Entity Alignment
Traditional word alignment approaches cannot come up with satisfactory results for Named Entities. In this paper, we propose a novel approach using a maximum entropy model for nam...
Donghui Feng, Yajuan Lü, Ming Zhou
ACL
2008
13 years 6 months ago
Joint Word Segmentation and POS Tagging Using a Single Perceptron
For Chinese POS tagging, word segmentation is a preliminary step. To avoid error propagation and improve segmentation by utilizing POS information, segmentation and tagging can be...
Yue Zhang 0004, Stephen Clark
LREC
2010
13 years 6 months ago
How Large a Corpus Do We Need: Statistical Method Versus Rule-based Method
We investigate the impact of input data scale in corpus-based learning using a study style of Zipf's law. In our research, Chinese word segmentation is chosen as the study ca...
Hai Zhao, Yan Song, Chunyu Kit