Web sites are designed for a graphical mode of interaction. Sighted users can "cut to the chase" and quickly identify relevant information in Web pages. On the contrary, i...
Popular content in video-sharing web sites (e.g., YouTube) is usually duplicated. Most scholars define near-duplicate video clips (NDVC) based on non-semantic features (e.g., di...
Mauro Cherubini, Rodrigo de Oliveira, Nuria Oliver
In social media such as blogs, content naturally evolves over time, so it is hard, or in many cases impossible, to organize it for effective navigation. Thus, one c...
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set ...
We consider the problem of content extraction from online news webpages. To explore to what extent the syntactic markup and the visual structure of a webpage facilitate the extrac...