Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
In this paper, we propose a new transparent built-in self-test ( TBIST ) method to test multiple embedded memory arrays with various sizes in parallel. First, a new transparent tes...
Abstract. We give the first exponential separation between quantum and bounded-error randomized one-way communication complexity. Specifically, we define the Hidden Matching Proble...
This paper presents a novel paradigm for learning languages that consists of mapping strings to an appropriate high-dimensional feature space and learning a separating hyperplane i...
MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. MapReduce and its de facto open source project, called Hadoop...