Sciweavers

3961 search results - page 604 / 793
» Algorithmic Statistics
WWW
2005
ACM
16 years 7 months ago
A framework for determining necessary query set sizes to evaluate web search effectiveness
We describe a framework of bootstrapped hypothesis testing for estimating the confidence in one web search engine outperforming another over any randomly sampled query set of a gi...
Eric C. Jensen, Steven M. Beitzel, Ophir Frieder, ...
CAV
2009
Springer
16 years 7 months ago
A Markov Chain Monte Carlo Sampler for Mixed Boolean/Integer Constraints
We describe a Markov chain Monte Carlo (MCMC)-based algorithm for sampling solutions to mixed Boolean/integer constraint problems. The focus of this work differs in two points from...
Nathan Kitchen, Andreas Kuehlmann
KDD
2007
ACM
16 years 6 months ago
Mining templates from search result records of search engines
Metasearch engine, Comparison-shopping and Deep Web crawling applications need to extract search result records enwrapped in result pages returned from search engines in response ...
Hongkun Zhao, Weiyi Meng, Clement T. Yu
KDD
2004
ACM
16 years 6 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
KDD
2003
ACM
16 years 6 months ago
Mining data records in Web pages
A large amount of information on the Web is contained in regularly structured objects, which we call data records. Such data records are important because they often present the e...
Bing Liu, Robert L. Grossman, Yanhong Zhai