Abstract. This paper presents an approach to automatic subject indexing of full-text documents with multiple labels, based on binary support vector machines (SVMs). The aim was to test ...
Abstract. Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and...
Abstract. This paper describes and compares the use of methods based on Ngrams (specifically trigrams and pentagrams), together with five features, to recognise the syntactic and s...
Abstract. An approach is presented to guide the benchmarking of invoice analysis systems, a specific, applied subclass of document analysis systems. The state of the art of benchma...
The accurate tracking and retrieval of content pedigree is a quickly growing requirement as our ability to create information assets increases exponentially. Plagiarism detection...