The Distributed Information Search COmponent (Disco) is a prototype heterogeneous distributed database that accesses underlying data sources. The Disco prototype currently focuses...
World-Wide Web proxy servers that cache documents can potentially reduce three quantities: the number of requests that reach popular servers, the volume of network trac resulting ...
Marc Abrams, Charles R. Standridge, Ghaleb Abdulla...
Most of the Web-based methods for lexicon augmenting consist in capturing global semantic features of the targeted domain in order to collect relevant documents from the Web. We s...
In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web to be used for learning of Named Entity Recognition systems. We use...
Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen ...