Sciweavers

587 search results - page 57 / 118
» New Algorithms for Text Fingerprinting
ECIR
2008
Springer
14 years 11 months ago
Semi-supervised Document Classification with a Mislabeling Error Model
This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
Anastasia Krithara, Massih-Reza Amini, Jean-Michel...
WWW
2005
ACM
15 years 10 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
AC
2003
Springer
15 years 3 months ago
Concentration Inequalities
Concentration inequalities deal with deviations of functions of independent random variables from their expectation. In the last decade new tools have been introduced making it pos...
Stéphane Boucheron, Gábor Lugosi, Ol...
ICDAR
2009
IEEE
14 years 7 months ago
Document Image Binarisation Using Markov Field Model
This paper presents a new approach for the binarization of seriously degraded manuscripts. We introduce a new technique based on a Markov Random Field (MRF) model of the document. ...
Thibault Lelore, Frédéric Bouchara
WWW
2002
ACM
15 years 10 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...