Deep regression learning from dependent observations with minimum error entropy principle

· Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

This paper introduces a deep regression learning approach for nonparametric regression from strongly mixing observations, that is, data points that are dependent rather than independent. The method combines deep neural networks (DNNs) with the minimum error entropy (MEE) principle. Unlike the traditional $L_2$ (least squares) loss, MEE takes all moments of the error variable into account, which makes it robust to non-Gaussian and heavy-tailed noise. Two estimators are analyzed: the non-penalized deep neural network (NPDNN) and the sparse-penalized deep neural network (SPDNN) predictors. The authors establish upper bounds on the expected excess risk of both estimators over Hölder and composition Hölder function classes. For models with Gaussian error, these MEE-based estimators achieve minimax optimal convergence rates, matching existing lower bounds up to a logarithmic factor. The analysis assumes that the error density is known; extending it to unknown error densities remains an open challenge.
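To make the data setting concrete, the following is a minimal sketch of a regression sample with dependent covariates and heavy-tailed noise. Everything in it is an illustrative assumption rather than the paper's experimental design: the covariate follows a stationary Gaussian AR(1) process (a standard example of a process that is strongly mixing under the usual conditions), the regression function is `sin(2x)`, and the noise is Student-t. The names `simulate_dependent_regression` and `lag1_autocorrelation` are hypothetical.

```python
import numpy as np


def simulate_dependent_regression(n, phi=0.8, df=3.0, seed=0):
    """Draw (x_t, y_t) with AR(1) covariates and Student-t noise.

    Assumed model (illustration only):
        x_t = phi * x_{t-1} + sqrt(1 - phi^2) * eta_t,  eta_t ~ N(0, 1)
        y_t = sin(2 * x_t) + eps_t,                     eps_t ~ t(df)
    """
    rng = np.random.default_rng(seed)
    innovations = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = innovations[0]  # start in the stationary N(0, 1) law
    for t in range(1, n):
        x[t] = phi * x[t - 1] + np.sqrt(1.0 - phi**2) * innovations[t]
    noise = rng.standard_t(df, size=n)  # heavy-tailed errors
    y = np.sin(2.0 * x) + noise
    return x, y


def lag1_autocorrelation(z):
    """Sample lag-1 autocorrelation, a quick check that the data are dependent."""
    z = z - z.mean()
    return float(np.dot(z[:-1], z[1:]) / np.dot(z, z))


x, y = simulate_dependent_regression(5000)
print("lag-1 autocorrelation of x:", lag1_autocorrelation(x))
```

With `phi` close to 1 the consecutive covariates are strongly correlated, which is the kind of dependence that an i.i.d. analysis does not cover.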

Key takeaway

For AI Researchers and Research Scientists working on nonparametric regression with dependent data, this work shows that MEE-based deep neural networks offer a robust alternative to the $L_2$ loss, particularly under non-Gaussian or heavy-tailed noise. The NPDNN and SPDNN estimators are worth considering for strongly mixing observations: both come with excess risk bounds, and for models with Gaussian error they achieve minimax optimal convergence rates up to a logarithmic factor. Be aware, however, that the current theoretical framework assumes a known error density, so applications with unknown error distributions call for further research.

Key insights

For models with Gaussian error, MEE-based deep neural networks achieve minimax optimal rates (up to a logarithmic factor) for nonparametric regression with strongly mixing, dependent data.

Principles

Method

The approach trains deep neural networks by minimizing the Shannon entropy of the error variable, whose density is assumed known. It defines the NPDNN and SPDNN estimators and establishes excess risk bounds for both over Hölder and composition Hölder function classes for strongly mixing observations.
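One way to read the method is as empirical risk minimization with a negative log-density loss. The sketch below assumes that the empirical error entropy is estimated as the average of $-\log f(Y_i - h(X_i))$ for a known density $f$, and that the sparse penalty is a clipped $L_1$ penalty; the summary specifies neither, so both are assumptions for illustration. All function names are hypothetical, and the network is a one-hidden-layer ReLU model written in NumPy rather than the paper's architecture.

```python
import numpy as np


def gaussian_logpdf(e, sigma=1.0):
    """Log-density of N(0, sigma^2) evaluated at the errors e."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - e**2 / (2.0 * sigma**2)


def laplace_logpdf(e, scale=1.0):
    """Log-density of a Laplace(0, scale) error, a heavier-tailed choice."""
    return -np.log(2.0 * scale) - np.abs(e) / scale


def mee_loss(y, pred, logpdf=laplace_logpdf):
    """Empirical entropy of the errors under a known density: -mean(log f(y - pred))."""
    return float(-np.mean(logpdf(y - pred)))


def clipped_l1_penalty(theta, lam, tau):
    """Assumed sparse penalty: lam * sum(min(|theta_j| / tau, 1))."""
    return float(lam * np.sum(np.minimum(np.abs(theta) / tau, 1.0)))


def mlp_forward(x, params):
    """One-hidden-layer ReLU network; x has shape (n, d)."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return (hidden @ w2 + b2).ravel()


def objective(params, x, y, lam=0.0, tau=1e-2, logpdf=laplace_logpdf):
    """lam = 0 gives the non-penalized objective, lam > 0 the sparse-penalized one."""
    theta = np.concatenate([p.ravel() for p in params])
    return mee_loss(y, mlp_forward(x, params), logpdf) + clipped_l1_penalty(theta, lam, tau)


# With a Gaussian density the loss reduces to a shifted, scaled mean squared error.
y = np.array([0.5, -1.0, 2.0])
pred = np.array([0.0, 0.0, 1.0])
mse = np.mean((y - pred) ** 2)
assert np.isclose(mee_loss(y, pred, gaussian_logpdf), 0.5 * np.log(2.0 * np.pi) + 0.5 * mse)
```

The final assertion illustrates why the Gaussian case recovers least squares: the two objectives differ only by a constant and a scale factor, so they share the same minimizer. For other densities, such as the Laplace choice above, the loss penalizes large residuals less aggressively than the squared error does.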

Best for: AI Researcher, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.