Carnegie Mellon University

Nonparametric Divergence Estimation and its Applications to Machine Learning

Journal contribution, posted on 2011-01-01, authored by Barnabas Poczos, Liang Xiong, Jeff Schneider

Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. Here we consider the setting where each input instance corresponds to a continuous probability distribution. These distributions are unknown to us, but we are given some i.i.d. samples from each of them. While most existing machine learning methods operate on points, i.e., finite-dimensional feature vectors, in our setting we study algorithms that operate on groups, i.e., sets of feature vectors. For this purpose, we propose new nonparametric, consistent estimators for a large family of divergences and describe how to apply them to machine learning problems. As important special cases, the estimators can be used to estimate the Rényi, Tsallis, Kullback-Leibler, Hellinger, Bhattacharyya, and L2 divergences, as well as mutual information. We present empirical results on synthetic data, real-world images, and astronomical data sets.
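The abstract does not spell out the estimators themselves, but as a rough illustration of the kind of nonparametric, sample-based divergence estimation it describes, the sketch below implements a standard k-nearest-neighbor plug-in estimator of the Kullback-Leibler divergence between two "groups" (sets of i.i.d. samples). It assumes NumPy and SciPy; the function name knn_kl_divergence and its parameters are illustrative and not taken from the paper, and this is not necessarily the authors' exact construction.

import numpy as np
from scipy.spatial import cKDTree


def knn_kl_divergence(x, y, k=3):
    """k-NN estimate of KL(p || q) from samples x ~ p and y ~ q.

    Illustrative sketch of a nonparametric divergence estimator;
    not necessarily the exact estimator proposed in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = x.shape
    m = y.shape[0]

    # rho_i: distance from x_i to its k-th nearest neighbor in x \ {x_i}
    # (query k+1 neighbors so the point itself, at distance 0, is skipped).
    rho = cKDTree(x).query(x, k=k + 1)[0][:, -1]

    # nu_i: distance from x_i to its k-th nearest neighbor in y.
    nu = cKDTree(y).query(x, k=k)[0]
    if k > 1:
        nu = nu[:, -1]

    # Plug-in estimate: (d/n) * sum_i log(nu_i / rho_i) + log(m / (n - 1)).
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1.0))


# Usage: two Gaussian groups in 3-D; the true KL(N(0, I) || N(1, I)) is 1.5.
rng = np.random.default_rng(0)
p_samples = rng.normal(0.0, 1.0, size=(2000, 3))
q_samples = rng.normal(1.0, 1.0, size=(2000, 3))
print(knn_kl_divergence(p_samples, q_samples, k=3))  # roughly 1.5

The key point the example makes is that the estimator never fits a parametric density: it works directly from nearest-neighbor distances within and across the two sample sets, which is what allows this style of method to operate on groups of feature vectors rather than single points.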

