We address two problems that technical authors face in structured environments: (1) Structure definitions of the SGML school are limiting: they require one primary hierarchy and do not c...
This paper reports on the development of specific slicing techniques for functional programs and their use for the identification of possible coherent components from monolithic c...
This paper introduces a framework for clarifying and formalizing the duplicate document detection problem. Four distinct models are presented, each with a corresponding algorithm ...
We examine clarification dialogue, a mechanism for refining user questions with follow-up questions, in the context of open-domain Question Answering systems. We develop an algori...
A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, ...