Tweets are the most up-to-date and inclusive stream of information and commentary on current events, but they are also fragmented and noisy, motivating the need for systems that c...
ions Sebastian Pop (1), Albert Cohen (2), and Georges-André Silber (1). (1) CRI, Mines Paris, Fontainebleau, France; (2) ALCHEMY group, INRIA Futurs, Orsay, France. Abstract. This paper pre...
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
A lift curve, with the true positive rate on the y-axis and the customer pull (or contact) rate on the x-axis, is often used to depict the model performance in many data mining ap...
Abstract. In this paper, we revisit the consensus of computational complexity on exact inference in Bayesian networks. We point out that even in singly connected Bayesian networks,...