Abstract. Topic models are a discrete analogue to principal component analysis and independent component analysis that model topics at the word level within a document. They have ma...
Web Services (WSs) are the W3C-endorsed realization of the Service-Oriented Architecture (SOA). Since they are supposed to be implementation-neutral, WSs are typically tested blac...
The rapid growth of XML adoption has created a need for a proper representation of semi-structured documents, where the document structural information has to be taken into ac...
A multilevel semantic document classification system based on a Support Vector Machine (SVM) in association with domain ontologies has been developed. The documents related to the s...
We propose a fast and robust skew estimation method for scanned documents that estimates skew angles based on piecewise covering of objects, such as textlines, figures, forms, or...