Sciweavers

249 search results - page 4 / 50
» Classification of Documents Based on the Structure of Their ...
WEBDB
2005
Springer
97 views, Database
13 years 10 months ago
Towards a Query Language for Multihierarchical XML: Revisiting XPath
In recent years it has been argued that when XML encodings become complex, DOM trees are no longer adequate for query processing. Alternative representations of XML documents, suc...
Ionut Emil Iacob, Alex Dekhtyar
CPM
2004
Springer
144 views, Combinatorics
13 years 10 months ago
A Simple Optimal Representation for Balanced Parentheses
We consider succinct, or highly space-efficient, representations of a (static) string consisting of n pairs of balanced parentheses, that support natural operations such as findi...
Richard F. Geary, Naila Rahman, Rajeev Raman, Venk...
ML
2006
ACM
132 views, Machine Learning
13 years 5 months ago
A suffix tree approach to anti-spam email filtering
We present an approach to email filtering based on the suffix tree data structure. A method for the scoring of emails using the suffix tree is developed and a number of scoring and...
Rajesh Pampapathi, Boris Mirkin, Mark Levene
JIIS
2000
120 views
13 years 5 months ago
Machine Learning for Intelligent Processing of Printed Documents
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In ...
Floriana Esposito, Donato Malerba, Francesca A. Li...
AUSDM
2008
Springer
243 views, Data Mining
13 years 7 months ago
Structure-Based Document Model with Discrete Wavelet Transforms and Its Application to Document Classification
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...