We consider the problem of document conversion from the rendering-oriented HTML markup into a semantic-oriented XML annotation defined by user-specific DTDs or XML Schema descrip...
Current data warehouse and OLAP technologies can be applied to analyze the structured data that companies store in their databases. The circumstances that describe the context ass...
The signal-to-noise ratio is a common concept in radio communications and electronic communication in general. For a radio, the static is the noise. Too much static and the storm ...
Retrieving documents by subject matter is the general goal of information retrieval and other content access systems. There are other aspects of textual content, however, which fo...
This paper presents an objective comparative evaluation of layout analysis methods in realistic circumstances. It describes the Page Segmentation competition (modus operandi, data...
Apostolos Antonacopoulos, Stefan Pletschacher, Dav...