Sciweavers

DOCENG
2005
ACM
Injecting information into atomic units of text
This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common ...
Yannis Haralambous, Gábor Bella