Sciweavers

73 search results - page 12 / 15
» Image-Based Document Vectors for Text Retrieval
Sort
View
85
Voted
KDD
2009
ACM
156views Data Mining» more  KDD 2009»
15 years 10 months ago
Effective multi-label active learning for text classification
Labeling text data is quite time-consuming but essential for automatic text classification. Especially, manually creating multiple labels for each document may become impractical ...
Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Ch...
119
Voted
CIKM
2008
Springer
14 years 11 months ago
Identifying table boundaries in digital documents via sparse line detection
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Ying Liu, Prasenjit Mitra, C. Lee Giles
SIGIR
2010
ACM
15 years 1 months ago
Self-taught hashing for fast similarity search
The ability of fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
Dell Zhang, Jun Wang, Deng Cai, Jinsong Lu
KDD
2006
ACM
179views Data Mining» more  KDD 2006»
15 years 10 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
94
Voted
SPIRE
2001
Springer
15 years 2 months ago
Distributed Query Processing Using Partitioned Inverted Files
In this paper, we study query processing in a distributed text database. The novelty is a real distributed architecture implementation that offers concurrent query service. The di...
Claudine Santos Badue, Ricardo A. Baeza-Yates, Ber...