Sciweavers

3180 search results - page 495 / 636
Knowledge-based Document Analysis
WWW
2004
ACM
16 years 5 months ago
Towards the self-annotating web
The success of the Semantic Web depends on the availability of ontologies as well as on the proliferation of web pages annotated with metadata conforming to these ontologies. Thus...
Philipp Cimiano, Siegfried Handschuh, Steffen Staa...
CHI
2009
ACM
16 years 5 months ago
Lightweight tagging expands information and activity management practices
Could people use tagging to manage day-to-day work in their personal computing environment? Could tagging be sufficiently generic and lightweight to support diverse ways of workin...
Gerard Oleksik, Max L. Wilson, Craig S. Tashman, E...
KDD
2006
ACM
16 years 5 months ago
Topics over time: a non-Markov continuous-time model of topical trends
This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work t...
Xuerui Wang, Andrew McCallum
ECIR
2009
Springer
16 years 2 months ago
A Topic-Based Measure of Resource Description Quality for Distributed Information Retrieval
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Mark Baillie, Mark James Carman, Fabio Crestani
WWW
2010
ACM
16 years 1 day ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han