Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are w...
In classification tasks, two types of patterns can cause problems: ambiguous patterns and outliers. Furthermore, it is possible to separate classification algorithms in...
Jonathan Milgram, Mohamed Cheriet, Robert Sabourin
Robust clustering of data into overlapping linear subspaces is a common problem. Here we consider one-dimensional subspaces that pass through the origin. This problem arises in blind sour...
Independent Component Analysis is the best-known method for solving blind source separation problems. In general, the number of sources must be known in advance. In many cases, pre...
Andreas Sandmair, Alam Zaib, Fernando Puente Leó...