Sciweavers

25 search results - page 5 / 5
» Effects of Term Segmentation on Chinese English Cross-Langua...
KDD
2005
ACM
14 years 6 months ago
Mining comparable bilingual text corpora for cross-language information integration
Integrating information in multiple natural languages is a challenging task that often requires manually created linguistic resources such as a bilingual dictionary or examples of...
Tao Tao, ChengXiang Zhai
SIGIR
2010
ACM
13 years 1 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
WWW
2011
ACM
13 years 1 months ago
Unsupervised query segmentation using only query logs
We introduce an unsupervised query segmentation scheme that uses query logs as the only resource and can effectively capture the structural units in queries. We believe that Web s...
Nikita Mishra, Rishiraj Saha Roy, Niloy Ganguly, S...
KDD
2007
ACM
14 years 6 months ago
Mining correlated bursty topic patterns from coordinated text streams
Previous work on text mining has almost exclusively focused on a single stream. However, we often have available multiple text streams indexed by the same set of time points (call...
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sp...
KDD
2006
ACM
14 years 6 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee