We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
Future scalable, high-throughput, and high-performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with reso...
The Social Informatics Data Grid (SIDGrid) is a new cyberinfrastructure designed to transform how social and behavioral scientists collect and annotate data, collaborate and share...
Identity theft has grown into a major security problem on the Internet, in particular the theft of passwords for web applications through phishing and malware. We present TruWallet, ...
This paper aims at discovering community structure in rich media social networks, through analysis of time-varying, multi-relational data. Community structure represents the laten...
Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi B. Konuru...