Search Sciweavers | Sciweavers

70 search results - page 14 / 14

» Using self-supervised word segmentation in Chinese informati...


LREC
2010

179 views, Education

A Context Sensitive Variant Dictionary for Supporting Variant Selection

13 years 7 months ago

Download www.lrec-conf.org

In Japanese, there are a large number of notational variants of words. This is because Japanese words are written in three kinds of characters: kanji (Chinese) characters, hiragar...

Aya Nishikawa, Ryo Nishimura, Yasuhiko Watanabe, Y...


KDD
2006
ACM

179 views, Data Mining

Extracting key-substring-group features for text classification

14 years 5 months ago

Download www.comp.nus.edu.sg

In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...

Dell Zhang, Wee Sun Lee


KDD
2004
ACM

163 views, Data Mining

Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods

14 years 5 months ago

Download www.cs.cmu.edu

We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...

William W. Cohen, Sunita Sarawagi


MT
2007

158 views

Automatic extraction of translations from web-based bilingual materials

13 years 5 months ago

Download www.site.uottawa.ca

This paper describes the framework of the StatCan Daily Translation Extraction System (SDTES), a computer system that maps and compares web-based translation texts of Statistics Can...

Qibo Zhu, Diana Zaiu Inkpen, Ash Asudeh


ICDAR
1997
IEEE

143 views, Document Analysis

Representing OCRed documents in HTML

13 years 9 months ago

Download www.cedar.buffalo.edu

ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...

Tao Hong, Sargur N. Srihari
