Sciweavers

» Retrieval of Partial Documents
WWW
2004
ACM
16 years 5 months ago
Towards the self-annotating web
The success of the Semantic Web depends on the availability of ontologies as well as on the proliferation of web pages annotated with metadata conforming to these ontologies. Thus...
Philipp Cimiano, Siegfried Handschuh, Steffen Staa...
KDD
2008
ACM
16 years 5 months ago
Scaling up text classification for large file systems
We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...
George Forman, Shyamsundar Rajaram
KDD
2008
ACM
16 years 5 months ago
Topical query decomposition
We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting...
Francesco Bonchi, Carlos Castillo, Debora Donato, ...
KDD
2001
ACM
16 years 5 months ago
Random projection in dimensionality reduction: applications to image and text data
Random projections have recently emerged as a powerful method for dimensionality reduction. Theoretical results indicate that the method preserves distances quite nicely; however,...
Ella Bingham, Heikki Mannila
WWW
2010
ACM
15 years 12 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han