Sciweavers

20 search results - page 4 / 4
» A robust technique for text extraction in mixed-type binary ...
ICDIM
2008
IEEE
13 years 11 months ago
Unsupervised key-phrases extraction from scientific papers using domain and linguistic knowledge
The domain of Digital Libraries presents specific challenges for unsupervised information extraction to support both the automatic classification of documents and the enhancement ...
Mikalai Krapivin, Maurizio Marchese, Andrei Yadran...
CICLING
2009
Springer
14 years 5 months ago
Business Specific Online Information Extraction from German Websites
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...
Yeong Su Lee, Michaela Geierhos
ANLP
1997
13 years 6 months ago
Building Effective Queries In Natural Language Information Retrieval
In this paper we report on our natural language information retrieval (NLIR) project as related to the recently concluded 5th Text Retrieval Conference (TREC-5). The main thrust o...
Tomek Strzalkowski, Fang Lin, Jose Perez Carballo,...
PAMI
2007
13 years 4 months ago
A Thousand Words in a Scene
This paper presents a novel approach for visual scene modeling and classification, investigating the combined use of text modeling methods and local invariant features. Our wo...
Pedro Quelhas, Florent Monay, Jean-Marc Odobez, Da...
WSDM
2010
ACM
14 years 2 days ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...