A major problem that arises from integrating different databases is the existence of duplicates. Data cleaning is the process of identifying two or more records within the datab...
Current Data Mining techniques usually do not have a mechanism to automatically infer semantic features inherent in the data being “mined”. The semantics are either injected i...
In this paper we present HyperJournal, an Open Source web application for publishing on-line Open Access scholarly journals. In the first part (sections 1-3) we briefly describe t...
In this paper, we address the problem of extracting data records and their attributes from unstructured biomedical full text. There has been little effort reported on this in the ...
This paper addresses the problem of text extraction from name card images with fanciful designs containing complicated color backgrounds and reverse contrast regions. The proposed m...