Search Sciweavers | Sciweavers

31 search results - page 1 / 7

LREC
2010

216 views, Education

BlogBuster: A Tool for Extracting Corpora from the Blogosphere

13 years 6 months ago

Download www.lrec-conf.org

This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...

Georgios Petasis, Dimitrios Petasis

ACL
2012

218 views, Computational Linguistics

ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora

11 years 7 months ago

Download aclweb.org

The lack of parallel corpora and linguistic resources for many languages and domains is one of the major obstacles for the further advancement of automated translation. A possible...

Marcis Pinnis, Radu Ion, Dan Stefanescu, Fangzhong...

CICLING
2009
Springer

151 views, Natural Language Processing

Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation

14 years 5 months ago

Download tlt07.uib.no

We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...

John Tinsley, Mary Hearne, Andy Way

WWW
2003
ACM

149 views, Internet Technology

Annotating Web pages for the needs of Web Information Extraction Applications

14 years 5 months ago

Download cgi.di.uoa.gr

This paper outlines our approach to the creation of annotated corpora for the purposes of Web Information Extraction, and presents the Web Annotation tool. This tool enables the a...

Georgios Sigletos, Dimitra Farmakiotou, Konstantin...

COLING
2010

191 views, Computational Linguistics

Mining Large-scale Comparable Corpora from Chinese-English News Collections

13 years ago

Download www.aclweb.org

In this paper, we explore a CLIR-based approach to construct large-scale Chinese-English comparable corpora, which is valuable for translation knowledge mining. The initial source...

Degen Huang, Lian Zhao, Lishuang Li, Haitao Yu
