Script identification is required for a multilingual OCR system. In this paper, we present a novel and efficient technique for Bangla/English script identification with application...
Organizing Web search results into clusters facilitates users' quick browsing through search results. Traditional clustering techniques are inadequate since they don't g...
We present a perceptually designed hardwareaccelerated algorithm for generating unique background textures for distinguishing documents. To be recognizable, the texture should pro...
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
As huge quantities of documents have become available, services using natural language processing technologies trained by huge corpora have emerged, such as information retrieval ...