Sciweavers

583 search results - page 32 / 117
» Automatic extraction of titles from general documents using ...
EMNLP
2009
14 years 7 months ago
Generalized Expectation Criteria for Bootstrapping Extractors using Record-Text Alignment
Traditionally, machine learning approaches for information extraction require human annotated data that can be costly and time-consuming to produce. However, in many cases, there ...
Kedar Bellare, Andrew McCallum
HT
2005
ACM
15 years 3 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
MLMI
2004
Springer
15 years 2 months ago
Using Static Documents as Structured and Thematic Interfaces to Multimedia Meeting Archives
Static documents play a central role in multimodal applications such as meeting recording and browsing. They provide a variety of structures, in particular thematic, for ...
Denis Lalanne, Rolf Ingold, Didier von Rotz, Ardhe...
ACL
2008
14 years 11 months ago
Mining Parenthetical Translations from the Web by Word Alignment
Documents in languages such as Chinese, Japanese and Korean sometimes annotate terms with their translations in English inside a pair of parentheses. We present a method to extrac...
Dekang Lin, Shaojun Zhao, Benjamin Van Durme, Mari...
CIKM
2005
Springer
15 years 3 months ago
A hybrid approach to NER by MEMM and manual rules
This paper describes a framework for defining domain specific Feature Functions in a user friendly form to be used in a Maximum Entropy Markov Model (MEMM) for the Named Entity Re...
Moshe Fresko, Binyamin Rosenfeld, Ronen Feldman