In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including medicine, agriculture, manufacturing, and adv...
Thomas Crook, Brian Frasca, Ron Kohavi, Roger Long...
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
Wi-Fi clients can obtain much better performance at some commercial hotspots than at others. Unfortunately, there is currently no way for users to determine which hotspot access p...
Jeffrey Pang, Ben Greenstein, Michael Kaminsky, Da...