Sciweavers

» Adaptive Parallel Sentences Mining from Web Bilingual News C...
ACL
2009
Mining Bilingual Data from the Web with Adaptively Learnt Patterns
Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross language information retrie...
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...
PAKDD
2009
ACM
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian
COLING
2010
An Empirical Study on Web Mining of Parallel Data
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
WSDM
2009
ACM
Integration of news content into web results
Aggregated search refers to the integration of content from specialized corpora or verticals into web search results. Aggregation improves search when the user has vertical intent...
Fernando Diaz