Web entities, such as documents and hyperlinks, are created for different purposes, or intents. Existing intent-based retrieval methods largely focus on information seekers’ int...
The World Wide Web has become one of the most important information repositories. However, information in web pages is free of standards in presentation, without being organized i...
This paper proposes a method of crawling Web servers connected to the Internet without imposing a high processing load. We are using the crawler for a field survey of the digital ...
Katsuko T. Nakahira, Tetsuya Hoshino, Yoshiki Mika...
Environmental engineers from different organizations work on interdisciplinary projects that require information exchange. In particular, a collaborative environment with pe...
Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...