Non-parametric Jensen-Shannon Divergence

Quantifying the difference between two distributions is a common problem in many machine learning and data mining tasks. What is also common in many tasks is that we only have empirical data. That is, we know neither the true distributions nor their form, and hence, before we can measure their divergence, we first need to assume a distribution or perform estimation. For exploratory purposes this is unsatisfactory, as we want to explore the data, not our expectations. In this paper we study how to non-parametrically measure the divergence between two distributions. In particular, we formalise the well-known Jensen-Shannon divergence using cumulative distribution functions. This allows us to calculate divergences directly and efficiently from data without the need for estimation. Moreover, empirical evaluation shows that our method performs very well in detecting differences between distributions, outperforming the state of the art in both statistical power and efficiency for a wide...
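As a rough illustration of what measuring a divergence directly on cumulative distribution functions can look like, the sketch below builds empirical CDFs from two univariate samples and integrates a Jensen-Shannon-style term over the pooled sample points. The specific formula used here (a cumulative analogue of KL divergence against the average CDF, plus a correction term that keeps the result non-negative) is an assumption made for illustration and may differ from the definition in the paper. The function names `ecdf_on_grid`, `cdf_divergence`, and `symmetric_cdf_divergence` are hypothetical.

```python
import numpy as np


def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample`, evaluated at each point of `grid`."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size


def cdf_divergence(a, b):
    """Illustrative one-directional CDF-based divergence (assumed form).

    Integrates P*log2(P/M) + (Q - P) / (2 ln 2) over the pooled sample
    range, where P and Q are empirical CDFs and M is their average.
    No density estimation or binning is involved.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    grid = np.unique(np.concatenate([a, b]))  # sorted pooled points
    widths = np.diff(grid)                    # interval lengths

    # Empirical CDFs are step functions, constant on each interval
    # [grid[i], grid[i+1]), so the integral is an exact finite sum.
    P = ecdf_on_grid(a, grid)[:-1]
    Q = ecdf_on_grid(b, grid)[:-1]
    M = 0.5 * (P + Q)

    # Compute P/M only where P > 0; elsewhere use the 0*log(0) = 0 convention.
    ratio = np.ones_like(P)
    np.divide(P, M, out=ratio, where=P > 0)
    log_term = P * np.log2(ratio)

    correction = (Q - P) / (2.0 * np.log(2.0))
    return float(np.sum((log_term + correction) * widths))


def symmetric_cdf_divergence(a, b):
    """Symmetrised version: sum of both directions."""
    return cdf_divergence(a, b) + cdf_divergence(b, a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=500)
    y = rng.normal(1.0, 1.0, size=500)
    print("same sample:   ", symmetric_cdf_divergence(x, x))
    print("shifted sample:", symmetric_cdf_divergence(x, y))
```

Because the empirical CDFs are piecewise constant, the integral reduces to a sum over the intervals between sorted pooled points, which is why no bandwidth or bin count needs to be chosen in this sketch.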
Hoang Vu Nguyen, Jilles Vreeken
Added 16 Apr 2016
Updated 16 Apr 2016
Type Journal
Year 2015
Where PKDD
Authors Hoang Vu Nguyen, Jilles Vreeken