The task in the computer security domain of anomaly detection is to characterize the behaviors of a computer user (the `valid', or `normal' user) so that unusual occurre...
Authorship attribution is the task of identifying the author of a given text. The main concern of this task is to define an appropriate characterization of documents that captures ...
Scalable data mining in large databases is one of today's challenges to database technologies. Thus, substantial effort is dedicated to a tight coupling of database and data ...
We study the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora: Webspam-Uk2006 and Webspam-Uk2007, we make t...
Abstract. Probabilistic models with hidden variables such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA) have recently become popular for so...