Sciweavers

180 search results - page 2 / 36
» A Study on Multi-word Extraction from Chinese Documents
COLING
2010
13 years 9 days ago
Mining Large-scale Comparable Corpora from Chinese-English News Collections
In this paper, we explore a CLIR-based approach to construct large-scale Chinese-English comparable corpora, which is valuable for translation knowledge mining. The initial source...
Degen Huang, Lian Zhao, Lishuang Li, Haitao Yu
ACL
2006
13 years 6 months ago
An Empirical Study of Chinese Chunking
In this paper, we describe an empirical study of Chinese chunking on a corpus, which is extracted from UPENN Chinese Treebank-4 (CTB4). First, we compare the performance of the st...
Wenliang Chen, Yujie Zhang, Hitoshi Isahara
PAMI
2002
13 years 5 months ago
Imaged Document Text Retrieval Without OCR
We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
Chew Lim Tan, Weihua Huang, Zhaohui Yu, Yi Xu
NLPRS
2001
Springer
13 years 9 months ago
Automatic Corpus-Based Extraction of Chinese Legal Terms
This paper reports on a study involving the automatic extraction of Chinese legal terms. We used a word segmented corpus of Chinese court judgments to extract salient legal expres...
Oi Yee Kwong, Benjamin K. Tsou
ACL
2011
12 years 9 months ago
Rare Word Translation Extraction from Aligned Comparable Documents
We present a first known result of high precision rare word bilingual extraction from comparable corpora, using aligned comparable documents and supervised classification. We in...
Emmanuel Prochasson, Pascale Fung