Sciweavers

1261 search results - page 123 / 253
» Extracting Text from PostScript
SIGMOD
2009
ACM
15 years 10 months ago
Efficient approximate entity extraction with edit distance constraints
Named entity recognition aims at extracting named entities from unstructured text. A recent trend of named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...
EMNLP
2009
14 years 7 months ago
Toward Completeness in Concept Extraction and Classification
Many algorithms extract terms from text together with some kind of taxonomic classification (is-a) link. However, the general approaches used today, and specifically the methods o...
Eduard H. Hovy, Zornitsa Kozareva, Ellen Riloff
IJDAR
2008
14 years 10 months ago
Mobile Retriever: access to digital documents from their physical source
In this paper we describe an image based document retrieval system which runs on camera enabled mobile devices. "Mobile Retriever" aims to seamlessly link physical and di...
Xu Liu, David S. Doermann
COMAD
2009
14 years 11 months ago
Business Insight from Collection of Unstructured Formatted Documents with IBM Content Harvester
In this paper, we report the development and experiments of IBM Content Harvester (CH), a tool to analyze and recover templates and content from word processor created text docume...
Biplav Srivastava, Yuan-Chi Chang
CIMCA
2005
IEEE
15 years 3 months ago
Improving Rule Generation Precision for Domain Knowledge based Wrappers
Wrappers play an important role in extracting specified information from various sources. Wrapper rules by which information is extracted are often created from the domain-specifi...
Chang-Hoo Jeong, Sung-Jin Jhun, Myung-Eun Lim, Sun...