Sciweavers

4645 search results - page 72 / 929
» Using Information Extraction to Improve Document Retrieval
ICDIM
2008
IEEE
15 years 11 months ago
Unsupervised key-phrases extraction from scientific papers using domain and linguistic knowledge
The domain of Digital Libraries presents specific challenges for unsupervised information extraction to support both the automatic classification of documents and the enhancement ...
Mikalai Krapivin, Maurizio Marchese, Andrei Yadran...
WSDM
2010
ACM
16 years 1 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text is typically not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
ICDAR
2009
IEEE
15 years 11 months ago
Enhanced Text Extraction from Arabic Degraded Document Images Using EM Algorithm
This paper presents a new enhanced algorithm for extracting text from degraded document images, based on probabilistic models. The observed document image is considered as a...
Wafa Boussellaa, Aymen Bougacha, Abderrazak Zahour...
SIGIR
2010
ACM
15 years 8 months ago
Hierarchical Pitman-Yor language model for information retrieval
In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distr...
Saeedeh Momtazi, Dietrich Klakow
CLIN
2001
15 years 5 months ago
Creating a Dutch Information Retrieval Test Corpus
This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch te...
Djoerd Hiemstra, David van Leeuwen