Many data are modeled as tensors, or multi-dimensional arrays. Examples include the predicates (subject, verb, object) in knowledge bases, hyperlinks and anchor texts in the Web g...
U. Kang, Evangelos E. Papalexakis, Abhay Harpale, ...
In this paper we propose to define a measure of visual similarity to compare different pages in a corpus. This measure is based on the analysis of the visual layout saliency of th...
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...
This paper explores several methods for visualizing the thematic content of large document collections. As opposed to traditional query-driven document retrieval, these methods ...
Nancy Miller, Elizabeth G. Hetzler, Grant Nakamura...
Recovering physical and logical structure from electronic documents is still an open issue. In this paper, we propose a flexible and efficient approach for recovering document str...