Sciweavers

ICDM
2008
IEEE
Clustering Documents with Active Learning Using Wikipedia
Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this pape...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
CVPR
2009
IEEE
Robust unsupervised segmentation of degraded document images with topic models
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult....
Timothy J. Burns, Jason J. Corso
EMNLP
2007
Semi-Markov Models for Sequence Segmentation
In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore t...
Qinfeng Shi, Yasemin Altun, Alex J. Smola, S. V. N...
ICMCS
2005
IEEE
Infolink: Analysis of Dutch Broadcast News and Cross-Media Browsing
In this paper, a cross-media browsing demonstrator named InfoLink is described. InfoLink automatically links the content of Dutch broadcast news videos to related information sour...
Jeroen Morang, Roeland Ordelman, Franciska de Jong...
FSMNLP
2005
Springer
TAGH: A Complete Morphology for German Based on Weighted Finite State Automata
TAGH is a system for automatic recognition of German word forms. It is based on a stem lexicon with allomorphs and a concatenative mechanism for inflection and word formation. Wei...
Alexander Geyken, Thomas Hanneforth