Sciweavers

10 search results - page 2 / 2
» Constructing Word-Based Text Compression Algorithms
JMLR
2006
13 years 5 months ago
Spam Filtering Using Statistical Data Compression Models
Spam filtering poses a special problem in text categorization, of which the defining characteristic is that filters face an active adversary, which constantly attempts to evade fi...
Andrej Bratko, Gordon V. Cormack, Bogdan Filipic, ...
AAAI
2008
13 years 7 months ago
An Effective and Robust Method for Short Text Classification
Classification of texts potentially containing a complex and specific terminology requires the use of learning methods that do not rely on extensive feature engineering. In this w...
Victoria Bobicev, Marina Sokolova
DCC
2010
IEEE
13 years 8 months ago
A Fast Compact Prefix Encoding for Pattern Matching in Limited Resources Devices
This paper improves the Tagged Suboptimal Codes (TSC) compression scheme in several ways. We show how to process the TSC as a universal code. We introduce the TSCk as a family of ...
S. Harrusi, Amir Averbuch, N. Rabin
CCP
2011
12 years 5 months ago
Backwards Search in Context Bound Text Transformations
The Burrows-Wheeler Transform (BWT) is the basis for many of the most effective compression and self-indexing methods used today. A key to the versatility of the BWT is the abili...
Matthias Petri, Gonzalo Navarro, J. Shane Culpeppe...
SIGIR
2008
ACM
13 years 5 months ago
Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization
Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and mach...
Dingding Wang, Tao Li, Shenghuo Zhu, Chris H. Q. D...