Over 200 CVS repositories representing the assignments of students in a second-year undergraduate computer science course have been assembled. This unique data set represents many...
Keir Mierle, Kevin Laven, Sam T. Roweis, Greg Wils...
As online social networking sites become more and more popular, they have also attracted the attention of spammers. In this paper, Twitter, a popular micro-blogging service, i...
We describe experiments with a Naive Bayes text classifier in the context of anti-spam E-mail filtering, using two different statistical event models: a multi-variate Bernoulli ...
We present a corpus-based approach to word-sense disambiguation that only requires information that can be automatically extracted from untagged text. We use unsupervised techniqu...
This paper explores techniques in the pipeline of image description based on visual codebooks suitable for video on-line processing. The pipeline components are (i) extra...