In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
Semantic heterogeneity of information is a major barrier to information and system interoperability. Defining ontology of data and mapping ontologies among heterogeneous informati...
We propose an agent for exploring and categorizing documents on the World Wide Web based on a user profile. The heart of the agent is an automatic categorization of a set of docume...
Eui-Hong Han, Daniel Boley, Maria L. Gini, Robert ...
In order to artificially boost the rank of commercial pages in search engine results, search engine optimizers pay for links to these pages on other websites. Identifying paid lin...
Today, the search engine is the most commonly used tool for Web information retrieval; however, its current state is still far from satisfactory. This paper focuses on clustering Web...