Sciweavers

79 search results - page 14 / 16
» Self-Supervised Chinese Word Segmentation
ACL
1998
14 years 11 months ago
Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model
We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OC...
Masaaki Nagata
COLING
2002
14 years 9 months ago
An Agent-based Approach to Chinese Named Entity Recognition
Chinese NE (Named Entity) recognition is a difficult problem because of the uncertainty in word segmentation and flexibility in language structure. This paper proposes the use of ...
Shiren Ye, Tat-Seng Chua, Jimin Liu
CORR
2002
Springer
14 years 9 months ago
Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation a...
Rie Kubota Ando, Lillian Lee
AI
2001
Springer
15 years 2 months ago
A Statistical Corpus-Based Term Extractor
Term extraction is an important problem in natural language processing. In this paper, we propose a language-independent statistical corpus-based term extraction algorith...
Patrick Pantel, Dekang Lin
ICCPOL
2009
Springer
14 years 7 months ago
A Simple and Efficient Model Pruning Method for Conditional Random Fields
Conditional random fields (CRFs) have been quite successful in various machine learning tasks. However, as larger and larger data become acceptable for the current computational ma...
Hai Zhao, Chunyu Kit