A growing number of data sources appear as online databases hidden behind query forms, forming what is referred to as the deep web. It is desirable to have systems that can pr...
One challenge for relevance ranking in Web search is underspecified queries. For such queries, top-ranked documents may contain information irrelevant to the search goal of the us...
IPv6 brings several security improvements, but these improvements also pose new challenges for web content filtering. This paper presents a new fram...
Web spam detection has become one of the top challenges for the Internet search industry. Instead of relying on heuristic rules, we propose a feature re-extraction strategy to opt...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...