A number of projects are creating searchable digital libraries of printed books. These include the Million Book Project, the Google Book project and similar efforts from Yahoo an...
A web-portal providing access to over 250.000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
With the ever-increasing growth of the Internet, numerous copies of documents become serious problem for search engine, opinion mining and many other web applications. Since parti...
This research is about automatic identification and extraction of person names in Chinese text documents. Solutions to this problem have immediate and extensive applications in ma...
In this paper, we present an automatic web image mining system towards building a universal human age estimator based on facial information, which is applicable to all ethnic grou...