Ensemble weighted kernel estimators for multivariate entropy estimation

Kumar Sricharan; Alfred O. Hero

NIPS 2012 (NeurIPS 2012)

Abstract

The problem of estimating entropy functionals of probability densities has received much attention in the information theory, machine learning and statistics communities. Kernel density plug-in estimators are simple, easy to implement and widely used for entropy estimation. However, kernel plug-in estimators suffer from the curse of dimensionality: the MSE rate of convergence is glacially slow, of order $O(T^{-\gamma/d})$, where $T$ is the number of samples, $d$ is the dimension of the data, and $\gamma>0$ is a rate parameter. In this paper, it is shown that for sufficiently smooth densities, an ensemble of kernel plug-in estimators can be combined via a weighted convex combination, such that the resulting weighted estimator has a superior parametric MSE rate of convergence of order $O(T^{-1})$. Furthermore, it is shown that the optimal weights can be determined by solving a convex optimization problem which does not require training data or knowledge of the underlying density, and which can therefore be solved offline. This novel result is remarkable in that, while each of the individual kernel plug-in estimators in the ensemble suffers from the curse of dimensionality, appropriate ensemble averaging achieves parametric convergence rates.
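
The offline weight computation described in the abstract can be illustrated with a small sketch. The abstract does not give the optimization problem, so the formulation below is an assumption made for illustration: each ensemble member is indexed by a bandwidth parameter `l`, its bias is assumed to be a sum of terms proportional to `l**(i/d)` for `i = 1..d`, and the weights are taken as the minimum-norm vector that sums to one and cancels those terms. The function name `ensemble_weights`, the bandwidth grid, and the synthetic "estimates" are all hypothetical; no real kernel density estimator is evaluated here.

```python
import numpy as np


def ensemble_weights(ls, d):
    """Minimum-norm weights w with sum(w) = 1 and sum(w * ls**(i/d)) = 0, i = 1..d.

    ASSUMPTION (not stated in the abstract): the bias of the ensemble member
    with bandwidth parameter l is a sum of terms proportional to l**(i/d).
    The problem depends only on ls and d, never on the data.
    """
    ls = np.asarray(ls, dtype=float)
    if ls.size < d + 1:
        raise ValueError("need at least d + 1 ensemble members")
    # Row 0 (all ones) enforces sum(w) = 1; row i enforces cancellation
    # of the assumed l**(i/d) bias term.
    A = np.vstack([ls ** (i / d) for i in range(d + 1)])
    b = np.zeros(d + 1)
    b[0] = 1.0
    # For an underdetermined, consistent system, lstsq returns the
    # minimum-Euclidean-norm solution, i.e. the minimizer of ||w||_2.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


d = 3                                # hypothetical data dimension
ls = np.linspace(1.0, 3.0, 20)       # hypothetical bandwidth parameters
w = ensemble_weights(ls, d)

# Synthetic stand-ins for the individual plug-in estimates: a made-up true
# value plus made-up bias terms of the assumed form (no sampling noise).
true_entropy = 1.25
coeffs = [0.4, -0.2, 0.1]
estimates = true_entropy + sum(c * ls ** ((i + 1) / d) for i, c in enumerate(coeffs))

combined = float(w @ estimates)      # weighted ensemble estimate
print(combined)
```

Under these assumptions every synthetic estimate is biased, yet the weighted combination recovers the made-up true value up to floating-point rounding, because the constraints cancel each assumed bias term. Note that in this sketch the weights sum to one but are not constrained to be nonnegative.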

Topics

Machine Learning > Core Methods > Regression
Machine Learning > Optimization & Theory > Statistical Learning
Mathematics & Optimization > Mathematics > Information Theory
Mathematics & Optimization > Statistics
Machine Learning > Optimization & Theory > Statistics
Machine Learning > Learning Types > Ensemble Learning
Machine Learning > Core Methods > Kernel Methods
Mathematics & Optimization > Statistics > Statistics

Keywords

ensemble learning, convex optimization, kernel density estimation, curse of dimensionality, mean squared error, entropy estimation, ensemble averaging, convergence rate, ensemble method, ensemble weighted kernel, parametric convergence
