Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
A novel text extraction method from graphical document images is presented in this paper. Graphical document images containing text and graphics components are considered as two-d...
Fine-grained lock protocols with lock modes and lock granules adjusted to the various XML processing models allow for highly concurrent transaction processing on XML tre...
Perspective distortion always occurs while scanning thick, bound documents. This distortion mainly causes two sources of degradation for the scanned grayscale image: i) shade alo...
We present a new image compression technique called "DjVu" that is specifically geared towards the compression of scanned documents in color at high resolution. With DjVu, a ma...