This paper presents a novel algorithm to cluster emails according to their contents and the sentence styles of their subject lines. In our algorithm, natural language processing t...
Classification is a well-established operation in text mining. Given a set of labels A and a set DA of training documents tagged with these labels, a classifier learns to assign l...
The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper...
Eytan Adar, Daniel S. Weld, Brian N. Bershad, Stev...
We investigate three methods for defining a session on Web search engines. We examine 2,465,145 interactions from 534,507 Web searchers. We compare defining sessions using: 1) Int...
Abstract--We address the problem of detecting characteristic patterns in communication networks. We introduce a scalable approach based on set-system discrepancy. By implicitly lab...