
JCDL
2010
ACM

ProcessTron: efficient semi-automated markup generation for scientific documents

Digitizing legacy documents and marking them up with XML is important for many scientific domains. However, creating comprehensive, high-quality semantic markup is challenging. Markup processes consist of many steps that combine automated markup generation with intermediate manual correction, and these corrections are extremely laborious. To reduce this effort, this paper makes two contributions. First, it proposes ProcessTron, a lightweight markup-process-control mechanism. ProcessTron assists users in two ways: it ensures that the steps are executed in the appropriate order, and it points the user to possible errors during manual correction. Second, ProcessTron has been deployed in real-world projects, and this paper reports on the experience gained. A core observation is that ProcessTron more than halves the time users need to mark up a document. Results from additional laboratory experiments confirm this finding.
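The two kinds of assistance described in the abstract, enforcing step order and pointing to possible errors, can be illustrated with a minimal sketch. This is not ProcessTron's actual design or API, which the abstract does not describe. The document model, the `Step` class, the step names, and the check functions below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical document model: a list of paragraphs, each a dict that
# holds the text plus whatever markup labels have been assigned so far.
Document = List[Dict[str, str]]


@dataclass
class Step:
    name: str                               # human-readable step label
    check: Callable[[Document], List[int]]  # indices of paragraphs that still look wrong


def untyped_paragraphs(doc: Document) -> List[int]:
    """Paragraphs that have no 'type' label yet (illustrative rule)."""
    return [i for i, p in enumerate(doc) if "type" not in p]


def body_without_section(doc: Document) -> List[int]:
    """Body paragraphs not yet assigned to a section (illustrative rule)."""
    return [i for i, p in enumerate(doc)
            if p.get("type") == "body" and "section" not in p]


# Steps are listed in the order they must be completed.
STEPS = [
    Step("assign paragraph types", untyped_paragraphs),
    Step("assign sections", body_without_section),
]


def next_step(steps: List[Step], doc: Document) -> Optional[Step]:
    """Return the earliest step whose check still reports problems.

    Later steps are never offered while an earlier one has open issues,
    which is how this sketch enforces the step order.
    """
    for step in steps:
        if step.check(doc):
            return step
    return None  # every check passes: markup is complete


def suspects(step: Step, doc: Document) -> List[int]:
    """Paragraph indices the user should inspect for the given step."""
    return step.check(doc)
```

In this sketch, a correction loop would call `next_step` after each user edit, show the paragraphs returned by `suspects` for that step, and stop once `next_step` returns `None`.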
Added 10 Jul 2010
Updated 10 Jul 2010
Type Conference
Year 2010
Where JCDL
Authors Guido Sautter, Klemens Böhm, Conny Kühne, Tobias Mathäß