A major difficulty for designing a document image segmentation methodology is the proper value selection for all involved parameters. This is usually done after experimentations o...
We present a new and efficient semi-supervised training method for parameter estimation and feature selection in conditional random fields (CRFs). In real-world applications suc...
This paper describes our participation in the 2008 TREC Blog track. Our system consists of 3 components: data preprocessing, topic retrieval, and opinion finding. In the topic ret...
This paper is concerned with the problem of definition search. Specifically, given a term, we are to retrieve definitional excerpts of the term and rank the extracted excerpts acc...
This paper presents a fully automatic framework for the restoration of double-sided historical manuscripts which are impaired by ink bleed-through distortions. First, the recto si...