Sciweavers

469 search results - page 25 / 94
» On Compressing the Textual Web
RIAO
1997
14 years 11 months ago
An Analysis of Statistical and Syntactic Phrases
As the amount of textual information available through the World Wide Web grows, there is a growing need for high-precision IR systems that enable a user to find useful information ...
Mandar Mitra, Chris Buckley, Amit Singhal, Claire ...
MMS
2006
14 years 9 months ago
A probabilistic semantic model for image annotation and multi-modal image retrieval
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilis...
Ruofei Zhang, Zhongfei (Mark) Zhang, Mingjing Li, ...
IADIS
2003
14 years 11 months ago
SPLAT: A System for Self-Plagiarism Detection
This paper presents a system for self-plagiarism detection, SPLAT. The system uses a WebL web spider that crawls through the web sites of the top fifty Computer Science department...
Christian S. Collberg, Stephen G. Kobourov, Joshua...
WWW
2007
ACM
15 years 10 months ago
Organizing and searching the world wide web of facts -- step two: harnessing the wisdom of the crowds
As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Marius Pasca
WWW
2004
ACM
15 years 10 months ago
Web data integration using approximate string join
Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...
Yingping Huang, Gregory R. Madey