Document assembly software is a technology that is fundamental to disrupting law firms. This article uses the framework set out by Clayton Christensen in The Innovator’s Dilemma...
We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of cla...
We address document image classification by visual appearance. An image is represented by a variable-length list of visually salient features. A hierarchical Bayesian network is ...
A staggering number of multimedia applications are being introduced every day. Yet, the inordinate delays encountered in retrieving multimedia documents make it difficult to use t...
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...
Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...