One of the central problems in building broad-coverage story understanding systems is generating expectations about event sequences, i.e. predicting what happens next given some a...
Background: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high-quality integrated data. In this paper we report the implement...
We present a semi-Markov model for recognizing scene text that integrates character and word segmentation with recognition. Using wavelet features, it requires only approximate lo...
Allen R. Hanson, Erik G. Learned-Miller, Jerod J. ...
Clustering layouts of software systems combine two important aspects: they reveal groups of related artifacts of the software system, and they produce a visualization of the resul...
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...