In this paper, we identify and analyze structural properties that reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...
Agents (hardware or software) that act autonomously in an environment have to be able to integrate three basic behaviors: planning, execution, and learning. This integration is man...
Automated detection of the first document reporting each new event in temporally sequenced streams of documents is an open challenge. In this paper, we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
Distributed heterogeneous search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers d...