In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
This paper addresses Named Entity Mining (NEM), in which we mine knowledge about named entities such as movies, games, and books from a huge amount of data. NEM is potentially use...
Duplicate URLs cause serious problems across the whole pipeline of a search engine, from crawling and indexing to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
In recent years, search engine research has grown rapidly in areas such as algorithms, strategies and architecture, increasing both effectiveness and quality of results. However, ...
Patrizia Andronico, Marina Buzzi, Barbara Leporini
We propose to model relative attributes that capture the relationships between images and objects in terms of human-nameable visual properties. For example, the models can captur...