In many data mining applications, online labeling feedback is available only for examples that were predicted to belong to the positive class. Such applications include spam filt...
Web applications typically interact with a back-end database to retrieve persistent data and then present the data to the user as dynamically generated output, such as HTML web pa...
The presence of duplicate documents on the World Wide Web adversely affects crawling, indexing, and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
Text streams are becoming increasingly common in the form of news feeds, weblog archives, and so on, resulting in a large volume of data. An effective way to explore the...
Xiang Wang 0002, Kai Zhang, Xiaoming Jin, Dou Shen
Abstract. This paper presents a statistical framework based on Principal Component Analysis (PCA) for discovering the contextual factors which most strongly influence user behavio...