Abstract. In this paper, I present the class of linear context free languages (LCFLs) with a class of non-deterministic one-way two-head (read only) automata, called non-determinis...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
The paper describes the first version of the TextMOLE (Text Mining Operations Library and Environment) system for textual data mining. Currently TextMOLE acts as an advanced inde...
XML is successful as a machine processable data interchange format, but it is often too verbose for human use. For this reason, many XML languages permit an alternative more legib...
We present an algorithm for learning context free grammars from positive structural examples (unlabeled parse trees). The algorithm receives a parameter in the form of a finite se...