Web spam detection has become one of the top challenges for the Internet search industry. Instead of using some heuristic rules, we propose a feature re-extraction strategy to opt...
The challenge of the semantic web is the provision of distributed information with well-defined meaning, understandable for different parties. Particularly, applications should b...
Byte pair encoding (BPE) is a simple universal text compression scheme. Decompression is very fast and requires little work space. Moreover, it is easy to decompress an ar...
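For readers unfamiliar with the scheme named in the preceding abstract, the following is a minimal sketch of byte pair encoding, not the cited paper's implementation. It assumes fresh symbols are drawn from integers above 255 rather than from unused byte values, and the names `bpe_compress`, `bpe_decompress`, and `max_rules` are hypothetical. It illustrates why decompression is cheap: each substitution rule is simply expanded back into its two parts.

```python
from collections import Counter


def bpe_compress(data: bytes, max_rules: int = 64):
    """Repeatedly replace the most frequent adjacent pair with a fresh symbol."""
    seq = list(data)      # symbols 0..255 are literal bytes
    rules = {}            # new symbol -> (left, right)
    next_symbol = 256     # assumption: fresh symbols live beyond the byte range
    for _ in range(max_rules):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:     # no pair repeats, so nothing left to gain
            break
        rules[next_symbol] = pair
        out, i = [], 0
        while i < len(seq):
            # Greedy left-to-right replacement of the chosen pair.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
    return seq, rules


def bpe_decompress(seq, rules) -> bytes:
    """Expand every rule symbol back into its pair until only bytes remain."""
    out = bytearray()
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)   # pushed first so that left is expanded first
            stack.append(left)
        else:
            out.append(symbol)
    return bytes(out)


compressed, table = bpe_compress(b"abababab")
restored = bpe_decompress(compressed, table)
```

Decompression needs only the rule table and a small stack, which is consistent with the abstract's remark about speed and work space.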
We address the problem of string matching on Ziv-Lempel compressed text. The goal is to search for a pattern in a text without uncompressing it. This is a highly relevant issue to keep...
In this paper, we present a text detection and localization method. Our detection technique is based on a cascade of boosted ensembles, and the localizer uses standard image processing ...
Shehzad Muhammad Hanif, Lionel Prevost, Pablo Negr...