Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over many s...
This paper presents a new method that uses orthogonalized features for time series clustering and classification. To cluster or classify time series data, either original data or...
With the emergence of Yahoo Pipes and several similar services, data mashup tools have started to gain interest of business users. Making these tools simple and accessible to user...
Anton Riabov, Eric Bouillet, Mark Feblowitz, Zhen ...
Email is one of the most prevalent communication tools today, and solving the email overload problem is pressingly urgent. A good way to alleviate email overload is to automatical...