Sciweavers

15 search results for "Weighted Set-Based String Similarity" - page 2 / 3
SIGMOD
2010
ACM
13 years 10 months ago
Sampling dirty data for matching attributes
We investigate the problem of creating and analyzing samples of relational databases to find relationships between string-valued attributes. Our focus is on identifying attribute...
Henning Köhler, Xiaofang Zhou, Shazia Wasim S...
EMNLP
2004
13 years 6 months ago
Evaluating Information Content by Factoid Analysis: Human annotation and stability
We present a new approach to intrinsic summary evaluation, based on initial experiments in van Halteren and Teufel (2003), which combines two novel aspects: comparison of informat...
Simone Teufel, Hans van Halteren
PAKDD
2009
ACM
13 years 12 months ago
Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval
It is a challenging and important task to retrieve images from a large and highly varied image data set based on their visual contents. Problems like how to fill the semantic gap b...
Xin Chen, Xiaohua Hu, Xiajiong Shen
SIGIR
2005
ACM
13 years 10 months ago
Web-based acquisition of Japanese katakana variants
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
Takeshi Masuyama, Hiroshi Nakagawa
ICML
2004
IEEE
14 years 6 months ago
Distribution kernels based on moments of counts
Many applications in text and speech processing require the analysis of distributions of variable-length sequences. We recently introduced a general kernel framework, rational ker...
Corinna Cortes, Mehryar Mohri