We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
We estimate the extent of phishing activity on the Internet via capture-recapture analysis of two major phishing site reports. Capture-recapture analysis is a population estima...
We show that quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face f...
Tamara L. Berg, Alexander C. Berg, Jaety Edwards, ...
Queries issued by casual users or specialists exploring a data set often point us to important subsets of the data, be it clusters, outliers or other features of particular import...
The ImageCLEF Photo Retrieval Task 2009 focused on image retrieval and diversity. A new collection was utilised in this task consisting of approximately half a million images with...
Monica Lestari Paramita, Mark Sanderson, Paul Clou...