Sciweavers

2988 search results - page 531 / 598
» Experiments with a New Boosting Algorithm
SIGIR
2008
ACM
14 years 9 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
VLDB
2002
ACM
14 years 9 months ago
Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over many s...
Panagiotis G. Ipeirotis, Luis Gravano
ISNN
2011
Springer
14 years 20 days ago
Orthogonal Feature Learning for Time Series Clustering
This paper presents a new method that uses orthogonalized features for time series clustering and classification. To cluster or classify time series data, either original data or...
Xiaozhe Wang, Leo Lopes
WWW
2008
ACM
15 years 10 months ago
Wishful search: interactive composition of data mashups
With the emergence of Yahoo Pipes and several similar services, data mashup tools have started to gain the interest of business users. Making these tools simple and accessible to user...
Anton Riabov, Eric Bouillet, Mark Feblowitz, Zhen ...
KDD
2009
ACM
15 years 10 months ago
Mining social networks for personalized email prioritization
Email is one of the most prevalent communication tools today, and solving the email overload problem is pressingly urgent. A good way to alleviate email overload is to automatical...
Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon