This paper reports some experiments in using SVG (Scalable Vector Graphics), rather than the browser default of (X)HTML/CSS, as a potential Web-based rendering technology, in an a...
The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
We demonstrate the usefulness of the uniform resource locator (URL) alone in performing web page classification. This approach is magnitudes faster than typical web page classific...
The problem of document classification considers categorizing or grouping of various document types. Each document can be represented as a bag of words, which has no straightforw...
In this paper we address the problem of organizing hidden-Web databases. Given a heterogeneous set of Web forms that serve as entry points to hidden-Web databases, our goal is to ...