A mathematical theory of deep convolutional neural networks for feature extraction

Authors

Thomas Wiatowski

Reference

Deep Learning: Theory and Practice, Workshop organized by the Max Planck Institute for Intelligent Systems, Donaueschingen, Germany, July 2016 (invited talk).


Abstract

Deep convolutional neural networks have led to breakthrough results in numerous machine learning tasks that require feature extraction, yet a comprehensive mathematical theory explaining this success seems distant. The mathematical analysis of deep neural networks for feature extraction was initiated by Mallat, who considered so-called scattering networks based on the wavelet transform and modulus non-linearities. In this talk, we show how Mallat’s theory can be developed further by allowing for general semi-discrete shift-invariant frames (including Weyl-Heisenberg, curvelet, shearlet, ridgelet, and wavelet frames) and general Lipschitz-continuous non-linearities (e.g., rectified linear units, shifted logistic sigmoids, hyperbolic tangents, and modulus functions), as well as pooling through subsampling. For the resulting feature extractor, we prove deformation stability for a large class of deformations, establish a new translation invariance result which is of a vertical nature in the sense that the network depth determines the amount of invariance, and show energy conservation under certain technical conditions. On a conceptual level, our results establish that deformation stability, vertical translation invariance, and, to a certain degree, energy conservation are guaranteed by the network structure per se rather than by the specific convolution kernels and non-linearities.
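To make the architecture concrete, the following is a minimal NumPy sketch of a scattering-type feature extractor of the kind described above: each layer convolves the propagated signals with a bank of filters, applies a pointwise Lipschitz-continuous non-linearity, and pools through subsampling, while every layer also emits filtered outputs as features. This is an illustration only, not the construction from the talk: the function names (`layer`, `extract_features`), the restriction to 1-D discrete signals, the choice of the modulus non-linearity and stride-2 subsampling, and the random example filters are all assumptions made here, whereas the theory itself is formulated for general semi-discrete shift-invariant frames in the continuous setting.

```python
import numpy as np


def modulus(x):
    """Modulus non-linearity |x| (1-Lipschitz); a ReLU or tanh could be used instead."""
    return np.abs(x)


def layer(signal, filters, nonlinearity, stride):
    """One layer: convolve with each filter, apply a pointwise Lipschitz
    non-linearity, and pool through subsampling with the given stride."""
    outputs = []
    for g in filters:
        u = np.convolve(signal, g, mode="same")  # convolution with one atom of the filter bank
        u = nonlinearity(u)                      # Lipschitz-continuous non-linearity
        outputs.append(u[::stride])              # pooling through subsampling
    return outputs


def extract_features(signal, filter_banks, output_filters, nonlinearity=modulus, stride=2):
    """Propagate the signal through the layers and collect, at every depth,
    the propagated signals filtered by that depth's output atom."""
    features = [np.convolve(signal, output_filters[0], mode="same")]
    propagated = [signal]
    for depth, filters in enumerate(filter_banks, start=1):
        next_propagated = []
        for u in propagated:
            next_propagated.extend(layer(u, filters, nonlinearity, stride))
        propagated = next_propagated
        features.extend(
            np.convolve(u, output_filters[depth], mode="same") for u in propagated
        )
    return features


# Hypothetical usage with random filters, purely for illustration.
rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
filter_banks = [[rng.standard_normal(9) for _ in range(4)] for _ in range(2)]   # two layers
output_filters = [np.exp(-np.linspace(-3, 3, 9) ** 2) for _ in range(3)]        # low-pass output atoms
features = extract_features(signal, filter_banks, output_filters)
```

In this sketch the feature vector aggregates outputs from every depth, which is what makes the translation invariance "vertical": deeper layers, having been subsampled more often, contribute features that are invariant over larger translations.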
