We present a new method for blind document bleed through removal based on separate Markov Random Field (MRF) regularization for the recto and for the verso side, where separate pri...
In this paper, we describe how meta-data of indexation can be extracted from historical document images using an interactive process with a software called AGORA. The algorithms i...
Abstract. Software analysis techniques, and in particular software “design recovery”, have been highly successful at both technical and businesslevel semantic markup of large s...
Nadzeya Kiyavitskaya, Nicola Zeni, James R. Cordy,...
RankBoost is a recently proposed algorithm for learning ranking functions. It is simple to implement and has strong justifications from computational learning theory. We describe...
Raj D. Iyer, David D. Lewis, Robert E. Schapire, Y...
Human-quality text summarization systems are di cult to design, and even more di cult to evaluate, in part because documents can di er along several dimensions, such as length, wri...
Jade Goldstein, Mark Kantrowitz, Vibhu O. Mittal, ...