The Mixed Raster Content (MRC) document compression standard (ITU T.44) specifies a multi-layer multi-resolution representation of a compound document. The model is very efficie...
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
Abstract. In this paper, we present an approach for classifying documents based on the notion of a semantic similarity and the effective representation of the content of the docume...
In this paper, we call the pattern classification problem that consists in assigning a category label to a long audio signal based on its semantic content as Generic Audio Documen...
Many text documents naturally have two kinds of labels. For example, we may label web pages from universities according to their categories, such as "student" or "fa...