Sciweavers

» Robust Text Processing in Automated Information Retrieval
CIKM
2009
Springer
15 years 11 months ago
Automatic retrieval of similar content using search engine query interface
We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
NIPS
2007
15 years 5 months ago
Mining Internet-Scale Software Repositories
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
WWW
2005
ACM
16 years 5 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
CICLING
2005
Springer
15 years 10 months ago
Design and Development of a System for the Detection of Agreement Errors in Basque
This paper presents the design and development of a system for the detection and correction of syntactic errors in free texts. The system is composed of three main modules: a) a ro...
Arantza Díaz de Ilarraza Sánchez, Ko...
GIS
2008
ACM
16 years 5 months ago
Mapping geographic coverage of the web
In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by random...
Robert Pasley, Paul Clough, Ross S. Purves, Floria...