Search Sciweavers | Sciweavers

14 search results - page 2 / 3

» Can Chinese web pages be classified with English data source

LREC
2010

172 views · Education

Evaluating Utility of Data Sources in a Large Parallel Czech-English Corpus CzEng 0.9

13 years 6 months ago

Download www.lrec-conf.org

CzEng 0.9 is the third release of a large parallel corpus of Czech and English. For the current release, CzEng was extended by a significant amount of text from various types of so...

Ondrej Bojar, Adam Liska, Zdenek Zabokrtský

LREC
2008

108 views · Education

A Lightweight and Efficient Tool for Cleaning Web Pages

13 years 6 months ago

Download www.lrec-conf.org

Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...

Stefan Evert

ACL
2008

160 views · Computational Linguistics

Mining Parenthetical Translations from the Web by Word Alignment

13 years 6 months ago

Download www.aclweb.org

Documents in languages such as Chinese, Japanese and Korean sometimes annotate terms with their translations in English inside a pair of parentheses. We present a method to extrac...

Dekang Lin, Shaojun Zhao, Benjamin Van Durme, Mari...

IRI
2008
IEEE

168 views · Information Technology

Curate a transliteration corpus from transliteration/translation pairs

13 years 11 months ago

Download wil.csie.cyut.edu.tw

Transliteration of new named entities is important for information retrieval that crosses two or more languages. Rule-based machine transliteration is not satisfactory, since dif...

Shih-Hung Wu, Yu-Te Li

CIKM
2008
Springer

160 views · Information Technology

Cross-lingual query classification: a preliminary study

13 years 7 months ago

Download www.cs.umass.edu

The non-English Web is growing at breakneck speed, but available language processing tools are mostly English based. Taxonomies are a case in point: while there are plenty of comm...

Xuerui Wang, Andrei Z. Broder, Evgeniy Gabrilovich...
