It is increasingly common for users to interact with the web using a number of different aliases. This trend is a doubleedged sword. On one hand, it is a fundamental building bloc...
To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world...
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, ...
Mining discrete patterns in binary data is important for subsampling, compression, and clustering. We consider rankone binary matrix approximations that identify the dominant patt...
The increasing availability of electronic communication data, such as that arising from e-mail exchange, presents social and information scientists with new possibilities for char...
R. Dean Malmgren, Jake M. Hofman, Luis A. N. Amara...
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...