Two-dimensional and three-dimensional coordinate systems are the basic graphics symbols in many graphical documents. A robust coordinate system detection scheme is needed in order...
Documents and authors can be clustered into “knowledge communities” based on the overlap in the papers they cite. We introduce a new clustering algorithm, Streemer, which fin...
Vasileios Kandylas, S. Phineas Upham, Lyle H. Unga...
—Document images captured by a mobile phone camera often have perspective distortions. In this paper, fast and robust vanishing point detection methods for such perspective docum...
Xu-Cheng Yin, Hong-Wei Hao, Jun Sun 0004, Satoshi ...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
The increasing use of large open-domain document sources is exacerbating the problem of ambiguity in named entities. This paper explores the use of a range of syntactic and semant...