We report on the large-scale acquisition of class attributes with and without the use of lists of representative instances, as well as the discovery of unary attributes, such as t...
This paper introduces the Ariadne Corpus Management System. First, the underlying data model is presented, which enables users to represent and process heterogeneous data sets with...
Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed ...
Statistical models in retrieval have been shown to perform well empirically. The extended Boolean model has been widely used in business systems for its easiness to be complemented and n...
Full-text scanning offers significant advantages over other methods of document retrieval but is normally too slow for use on large collections. The Fujitsu AP1000 parallel distribut...