Sciweavers

384 search results - page 6 / 77
» INTEX: A Corpus Processing System
WWW
2010
ACM
15 years 4 months ago
TWC data-gov corpus: incrementally generating linked government data from data.gov
The Open Government Directive is making US government data available via websites such as Data.gov for public access. In this paper, we present a Semantic Web based approach that ...
Li Ding, Dominic DiFranzo, Alvaro Graves, James Mi...
IAJIT
2011
14 years 4 months ago
Improving the accuracy of English-Arabic statistical sentence alignment
Multilingual natural language processing systems are increasingly relying on parallel corpora to ameliorate their output. Parallel corpora constitute the basic block for training ...
Mohammad Salameh, Rached Zantout, Nashat Mansour
KDD
2007
ACM
15 years 10 months ago
Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra
NLDB
2005
Springer
15 years 3 months ago
Automatic Filtering of Bilingual Corpora for Statistical Machine Translation
For many applications such as machine translation and bilingual information retrieval, the bilingual corpora play an important role in training the system. Because they a...
Shahram Khadivi, Hermann Ney
COLING
1996
14 years 11 months ago
Using a Hybrid System of Corpus- and Knowledge-Based Techniques to Automate the Induction of a Lexical Sublanguage Grammar
Porting a Natural Language Processing (NLP) system to a new domain remains one of the bottlenecks in syntactic parsing, because of the amount of effort required to fix gaps in the...
Geert Jan Wilms