Word meaning ambiguity has always been an important problem in information retrieval and extraction, as well as, text mining (documents clustering and classification). Knowledge di...
Henryk Rybinski, Marzena Kryszkiewicz, Grzegorz Pr...
This paper presents a new context-based method for automatic detection and extraction of similar and related words from texts. Finding similar words is a very important task for m...
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
As camera resolution increases, high-speed non-contact text capture through a digital camera is opening up a new channel for document capture and understanding. Unfortunately, per...