Over the past few years, XML (eXtensible Markup Language) has emerged as the standard for information representation and data exchange over the Internet. This paper provides a ...
Within the larger area of automatic acquisition of knowledge from the Web, we introduce a method for extracting relevant attributes, or quantifiable properties, for various class...
This paper presents a novel method for acquiring a set of query patterns to retrieve documents containing important information about an entity. Given an existing Wikipedia catego...
Methods that reduce the amount of labeled data needed for training have focused more on selecting which documents to label than on which queries should be labeled. One exception t...
In this paper, we present two ways to improve the precision of HITS-based algorithms on Web documents. First, by analyzing the limitations of current HITS-based algorithms, we pro...