Abstract. To represent and manage data mining patterns, several aspects have to be taken into account: (i) patterns are heterogeneous in nature; (ii) patterns can be extracted from...
Barbara Catania, Anna Maddalena, Maurizio Mazza, E...
Hyperlink recommendation addresses the problem of providing quick and easy access to information in web systems. A method that integrates web usage and content mining was proposed and examin...
Document segmentation is formulated as an optimization problem within a hierarchical framework. The model incorporates the dependencies between various levels of the hierarchy, unlike tr...
K. S. Sesh Kumar, Anoop M. Namboodiri, C. V. Jawah...
This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...
Abstract. In this paper, we propose a novel technique for automatic table detection in document images. Lines and tables are among the most frequent graphic, non-textual entities i...
Basilios Gatos, Dimitrios Danatsas, Ioannis Pratik...