Sciweavers

73 search results - page 12 / 15
» Image-Based Document Vectors for Text Retrieval
KDD
2009
ACM
156 views, Data Mining
14 years 6 months ago
Effective multi-label active learning for text classification
Labeling text data is quite time-consuming but essential for automatic text classification. Especially, manually creating multiple labels for each document may become impractical ...
Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Ch...
CIKM
2008
Springer
13 years 8 months ago
Identifying table boundaries in digital documents via sparse line detection
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Ying Liu, Prasenjit Mitra, C. Lee Giles
SIGIR
2010
ACM
13 years 10 months ago
Self-taught hashing for fast similarity search
The ability of fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
Dell Zhang, Jun Wang, Deng Cai, Jinsong Lu
KDD
2006
ACM
179 views, Data Mining
14 years 6 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
SPIRE
2001
Springer
13 years 10 months ago
Distributed Query Processing Using Partitioned Inverted Files
In this paper, we study query processing in a distributed text database. The novelty is a real distributed architecture implementation that offers concurrent query service. The di...
Claudine Santos Badue, Ricardo A. Baeza-Yates, Ber...