Accurate topical categorization of user queries allows for increased effectiveness, efficiency, and revenue potential in general-purpose web search systems. Such categorization be...
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, ...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Research shows that comment spamming (comments which are unsolicited, unrelated, abusive, hateful, commercial advertisements etc) in online discussion forums has become a common p...
Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality....