Web-page classification is much more difficult than pure-text classification due to a large variety of noisy information embedded in Web pages. In this paper, we propose a new Web...
Query-oriented summarization aims at extracting an informative summary from a document collection for a given query. It is very useful to help users grasp the main information rel...
We investigate methods of using CRC32 for compressing Web URL strings and sharing of URL lists between servers, caches, and URL switches. Using trace-based evaluation, we compare ...
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particula...
Daniel M. Dunlavy, Dianne P. O'Leary, John M. Conr...
Users prefer to navigate subjects from organized topics in an abundance resources than to list pages retrieved from search engines. We propose a framework to cluster frequent items...