In a unified knowledge representation, data, information and knowledge are all represented in a single formalism. A unified knowledge representation based on “items” is describ...
Phrase pattern recognition (phrase chunking) refers to automatic approaches for identifying predefined phrase structures in a stream of text. Support vector machine (SVM)-based ...
This paper describes LINGUA, an architecture for text processing in Bulgarian. First, the pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, par...
Text segmentation, also called text binarization, is usually an essential step for text information extraction from images and videos. However, most existing text segmentation method...
In Japanese dependency parsing, Kudo's relative preference-based method (Kudo and Matsumoto, 2005) outperforms both deterministic and probabilistic CKY-based parsing methods....