The success of the Semantic Web depends on the availability of ontologies as well as on the proliferation of web pages annotated with metadata conforming to these ontologies. Thus...
Philipp Cimiano, Siegfried Handschuh, Steffen Staa...
Could people use tagging to manage day-to-day work in their personal computing environment? Could tagging be sufficiently generic and lightweight to support diverse ways of workin...
Gerard Oleksik, Max L. Wilson, Craig S. Tashman, E...
This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work t...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
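As a rough illustration of the tag-ratio idea in the CETR abstract, the sketch below computes one ratio per line of HTML. The abstract is cut off before it explains the computation, so the definition used here is an assumption rather than the authors' stated method: the number of non-tag characters on a line divided by the number of tags on that line, with the tag count treated as 1 when a line has no tags. The names `tag_ratios` and `TAG_RE` are hypothetical, and the regex-based tag matching is a simplification that a real HTML parser would replace.

```python
import re

# Simplified tag matcher (assumption): anything between '<' and '>' counts as one tag.
TAG_RE = re.compile(r"<[^>]*>")


def tag_ratios(html: str) -> list[float]:
    """Return one text-to-tag ratio per line of the input HTML."""
    ratios = []
    for line in html.splitlines():
        tag_count = len(TAG_RE.findall(line))
        # Characters left after removing tags, ignoring surrounding whitespace.
        text_len = len(TAG_RE.sub("", line).strip())
        # Treat a tag-free line as having one tag to avoid dividing by zero.
        ratios.append(text_len / max(tag_count, 1))
    return ratios


sample = (
    "<div><a href='x'>Home</a></div>\n"
    "<p>This is a long paragraph of article text.</p>\n"
    "<br>"
)
for ratio in tag_ratios(sample):
    print(ratio)
```

Under this assumed definition, the navigation-style first line scores low and the paragraph line scores high, which is the kind of signal a content extractor could threshold or cluster on.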