In this paper, we focus on domain ontology acquisition from a Chinese corpus by extracting rules designed for Chinese phrases. These rules are noun sequences with part-of-speech...
This paper describes LINGUA - an architecture for text processing in Bulgarian. First, the pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, par...
tion. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness,...
In this correspondence, we prove that the affine invariants proposed recently by Yang and Cohen [1] are algebraically dependent. We show how to select an independent and complete s...
We combine information from a language model and character image pattern matching to iteratively reduce ambiguity in document images. Combining word shape information and lists of...