Sciweavers

160 search results - page 4 / 32
» Exploiting structural information for semi-structured docume...
AIIA
2005
Springer
13 years 11 months ago
A Semantic Kernel to Exploit Linguistic Knowledge
Improving accuracy in Information Retrieval tasks via semantic information is a complex problem characterized by three main aspects: the document representation model, th...
Roberto Basili, Marco Cammisa, Alessandro Moschitt...
ICDAR
2009
IEEE
13 years 3 months ago
Using top n Recognition Candidates to Categorize On-line Handwritten Documents
The traditional weighting schemes used in text categorization for the vector space model (VSM) cannot exploit information intrinsic to texts obtained through on-line handwriting r...
Sebastián Peña Saldarriaga, Emmanuel...
KDD
2008
ACM
14 years 6 months ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
CIKM
2008
Springer
13 years 7 months ago
Semi-supervised text categorization by active search
In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
WEBI
2005
Springer
13 years 11 months ago
A Semi-Supervised Document Clustering Algorithm Based on EM
Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge on the cat...
Leonardo Rigutini, Marco Maggini