Sciweavers

» Extracting Text from PostScript
PAKDD
2009
ACM
15 years 4 months ago
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian
SEMCO
2007
IEEE
15 years 4 months ago
Intelligent Parsing of Scanned Volumes for Web Based Archives
The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
Xiaonan Lu, James Ze Wang, C. Lee Giles
EWCBR
2004
Springer
15 years 3 months ago
Textual Reuse for Email Response
The case-based reasoning approach to email response consists of reusing past messages to synthesize new responses to incoming requests. This task presents various challenges due to...
Luc Lamontagne, Guy Lapalme
EMNLP
2008
14 years 11 months ago
Scaling Textual Inference to the Web
Most Web-based Q/A systems work by finding pages that contain an explicit answer to a question. These systems are helpless if the answer has to be inferred from multiple sentences...
Stefan Schoenmackers, Oren Etzioni, Daniel S. Weld
LREC
2008
14 years 11 months ago
A Development Environment for Configurable Meta-Annotators in a Pipelined NLP Architecture
Information extraction from large data repositories is critical to Information Management solutions. In addition to prerequisite corpus analysis, to determine domain-specific char...
Youssef Drissi, Branimir Boguraev, David Ferrucci,...