Eigenvectors of data matrices play an important role in many computational problems, ranging from signal processing to machine learning and control. For instance, algorithms that ...
64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and m...
Abstract--Randomization is a general technique for evaluating the significance of data analysis results. In randomizationbased significance testing, a result is considered to be in...
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarel...
Andrew Fast, Lisa Friedland, Marc Maier, Brian Tay...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...