In this paper, we present YAM, a schema matcher factory. YAM (Yet Another Matcher) is not (yet) another schema matching system as it enables the generation of a la carte schema ma...
Well-designed indices can dramatically improve query performance. Including query workload information can produce indices that yield better overall throughput while balancing the...
Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semiautomatic, e.g. an expert must tune some parameters ...
Information retrieval systems conventionally assess document relevance using the bag of words model. Consequently, relevance scores of documents retrieved for different queries a...
Deepak Agarwal, Evgeniy Gabrilovich, Robert Hall, ...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...