Sciweavers

JUCS
2011
12 years 11 months ago
An OCR Free Method for Word Spotting in Printed Documents: the Evaluation of Different Feature Sets
An OCR free word spotting method is developed and evaluated under a strong experimental protocol. Different feature sets are evaluated under the same experimental conditions. In ...
Israel Rios, Alceu de Souza Britto Jr., Alessandro...
JUCS
2011
12 years 11 months ago
Visualizing and Analyzing the Quality of XML Documents
In this paper we introduce eXVisXML, a visual tool to explore documents annotated with the mark-up language XML, in order to easily perform tasks over them such as knowledge extractio...
Daniela Carneiro da Cruz, Pedro Rangel Henriques
COLING
2010
12 years 11 months ago
Towards Automatic Building of Document Keywords
Document keywords are associated with documents as summarized versions of the documents' content. Considering that the number of documents is quickly growing every day, the ava...
Joaquim Silva, José Gabriel Lopes
AAAI
2010
13 years 1 month ago
A Topic Model for Linked Documents and Update Rules for its Estimation
The latent topic model plays an important role in the unsupervised learning from a corpus, which provides a probabilistic interpretation of the corpus in terms of the latent topic...
Zhen Guo, Shenghuo Zhu, Zhongfei Zhang, Yun Chi, Y...
IFIP12
2009
13 years 2 months ago
Preferential Infinitesimals for Information Retrieval
In this paper, we propose a preference framework for information retrieval in which the user and the system administrator are enabled to express preference annotations on search ke...
Maria Chowdhury, Alex Thomo, William W. Wadge
ICTIR
2009
Springer
13 years 2 months ago
An Effective Approach to Verbose Queries Using a Limited Dependencies Language Model
Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies ...
Eduard Hoenkamp, Peter Bruza, Dawei Song, Qiang Hu...
ICMCS
2009
IEEE
13 years 2 months ago
High accuracy and language independent document retrieval with a Fast Invariant Transform
This paper presents a tool and a novel Fast Invariant Transform (FIT) algorithm for language independent e-documents access. The tool enables a person to access an e-document thro...
Qiong Liu, Hironori Yano, Don Kimber, Chunyuan Lia...
ICFCA
2009
Springer
13 years 2 months ago
A Concept Lattice-Based Kernel for SVM Text Classification
Standard Support Vector Machines (SVM) text classification relies on a bag-of-words kernel to express the similarity between documents. We show that a document lattice can ...
Claudio Carpineto, Carla Michini, Raffaele Nicolus...
ICDM
2009
IEEE
13 years 2 months ago
Towards a Universal Text Classifier: Transfer Learning Using Encyclopedic Knowledge
Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifie...
Pu Wang, Carlotta Domeniconi
ICDM
2009
IEEE
13 years 2 months ago
Multirelational Topic Models
In this paper we propose the multirelational topic model (MRTM) for multiple types of link modeling such as citation and coauthor links in document networks. In the citation networ...
Jia Zeng, William K. Cheung, Chun-hung Li, Jiming ...