— As person names are non-unique, the same name on different Web pages might or might not refer to the same real-world person. This entity identification problem is one of the m...
Data acquisition is a major concern in text classification. The excessive human efforts required by conventional methods to build up quality training collection might not always b...
Background: Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to iden...
In order to get high-quality web pages, search engines often resort retrieval pages by their ranks. The rank is a kind of measurement of importance of pages. Famous ranking algorit...
Guang Feng, Tie-Yan Liu, Xu-Dong Zhang, Tao Qin, B...
Abstract. Understanding a large schema without the assistance of persons already familiar with it (and its associated applications), is a hard and very time consuming task that occ...