Sciweavers

322 search results - page 9 / 65
» A Software System for Topic Extraction and Document Classifi...
LREC
2008
14 years 11 months ago
New Resources for Document Classification, Analysis and Translation Technologies
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
Stephanie Strassel, Lauren Friedman, Safa Ismael, ...
ICDIM
2009
IEEE
14 years 7 months ago
Text classification based on limited bibliographic metadata
In this paper, we introduce a method for categorizing digital items according to their topic, relying only on the document's metadata, such as author name and title informati...
Kerstin Denecke, Thomas Risse, Thomas Baehr
ICDAR
2009
IEEE
15 years 4 months ago
Metadata Extraction from PDF Papers for Digital Library Ingest
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...
Simone Marinai
SDM
2008
SIAM
14 years 11 months ago
Semantic Smoothing for Bayesian Text Classification with Small Training Data
Bayesian text classifiers face a common issue, referred to as the data sparsity problem, especially when the size of the training data is very small. The frequently used Laplacian...
Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
ITCC
2005
IEEE
15 years 3 months ago
Analyzing Relations among Software Patterns based on Document Similarity
In software development, many kinds of knowledge are shared and reused as software patterns. However, the relation analysis among software by hand is on the large scale. In this w...
Atsuto Kubo, Hironori Washizaki, Atsuhiro Takasu, ...