The tremendous growth of the Internet has significantly reduced the cost of obtaining and sharing information about individuals, raising many concerns about user privacy. Spatial...
This paper offers a novel look at using a dimensionalityreduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entites (NEs) such as persons, locations, organiza...
—Traditional approaches for wireless sensor network diagnosis are mainly sink-based. They actively collect global evidences from sensor nodes to the sink so as to conduct central...
The paper is concerned with learning to rank, which is to construct a model or a function for ranking objects. Learning to rank is useful for document retrieval, collaborative fil...
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, Han...