Sciweavers

483 search results - page 32 / 97
» Sampling the Web as Training Data for Text Classification
CLEF
2010
Springer
15 years 4 months ago
ZOT! to Wikipedia Vandalism - Lab Report for PAN at CLEF 2010
This vandalism detector uses features primarily derived from a word-preserving differencing of the text for each Wikipedia article from before and after the edit, along wit...
James White, Rebecca Maessen
KDD
2009
ACM
191 views, Data Mining
16 years 3 months ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
LREC
2008
141 views, Education
15 years 4 months ago
New Resources for Document Classification, Analysis and Translation Technologies
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
Stephanie Strassel, Lauren Friedman, Safa Ismael, ...
COMPSAC
2004
IEEE
15 years 6 months ago
N-Gram-Based Detection of New Malicious Code
The current commercial anti-virus software detects a virus only after the virus has appeared and caused damage. Motivated by the standard signature-based technique for detecting v...
Tony Abou-Assaleh, Nick Cercone, Vlado Keselj, Ray...
ICML
2006
ACM
16 years 3 months ago
Locally adaptive classification piloted by uncertainty
Locally adaptive classifiers are usually superior to the use of a single global classifier. However, there are two major problems in designing locally adaptive classifiers. First,...
Juan Dai, Shuicheng Yan, Xiaoou Tang, James T. Kwo...