This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train diffe...
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Digital libraries are a core information technology. When the stored data is complex, e.g. high-resolution images or molecular protein structures, simple query types like the exac...
We describe a novel family of PAC model algorithms for learning linear threshold functions. The new algorithms work by boosting a simple weak learner and exhibit complexity bounds...
Abstract. In constraint networks, the efficiency of a search algorithm is strongly related to the local consistency maintained during search. For a long time, it has been consider...