We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Social media such as Web forums have dense interactions between users and content, where network models are often appropriate for analysis. Joint non-negative matrix factorizat...
In this paper we perform a study of the image contents of the Chilean web (.cl domain) using automatic feature extraction, content-based analysis and face detection algorithms. In...
Alejandro Jaimes, Javier Ruiz-del-Solar, Rodrigo V...
During the last decade, national archives, libraries, museums and companies have started to make their records, books and files electronically available. In order to allow efficient ac...
Andreas Stoffel, David Spretke, Henrik Kinnemann, ...
This paper presents a novel visual approach to evaluate, in a fast and effective way, the development of new image feature extraction techniques concerning content-based image ret...