# Metric entropy limits on recurrent neural network learning of linear dynamical systems

### Authors

Clemens Hutter, Recep Gül, and Helmut Bölcskei

### Reference

Applied and Computational Harmonic Analysis, 2022, to appear (invited paper).


### Abstract

One of the most influential results in neural network theory is the universal approximation theorem [1, 2, 3] which states that continuous functions can be approximated to within arbitrary accuracy by single-hidden-layer feedforward neural networks. The purpose of this paper is to establish a result in this spirit for the approximation of general discrete-time linear dynamical systems—including time-varying systems—by recurrent neural networks (RNNs). For the subclass of linear time-invariant (LTI) systems, we devise a quantitative version of this statement. Specifically, measuring the complexity of the considered class of LTI systems through metric entropy according to [4], we show that RNNs can optimally learn—or identify in system-theory parlance—stable LTI systems. For LTI systems whose input-output relation is characterized through a difference equation, this means that RNNs can learn the difference equation from input-output traces in a metric-entropy optimal manner.
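The abstract's final claim—that a stable LTI system described by a difference equation can be realized by a recurrence operating on input-output traces—can be illustrated with a minimal sketch. The code below is *not* the construction from the paper; the second-order system, its coefficients, and the use of NumPy/SciPy are illustrative assumptions. It only shows that a linear recurrent state update exactly reproduces the output of a stable LTI difference equation on a bounded input trace.

```python
import numpy as np
from scipy.signal import lfilter

# Illustrative (hypothetical) stable LTI system given by the difference equation
#   y[n] = a1*y[n-1] + a2*y[n-2] + b0*x[n] + b1*x[n-1] + b2*x[n-2],
# realized by a linear recurrence in controllable canonical state-space form:
#   s[n+1] = A s[n] + B x[n],   y[n] = C s[n] + D x[n].
a1, a2 = 0.5, -0.3          # recursion coefficients (poles inside the unit circle -> stable)
b0, b1, b2 = 1.0, 0.2, 0.1  # feed-forward coefficients

A = np.array([[a1, a2],
              [1.0, 0.0]])
B = np.array([1.0, 0.0])
C = np.array([b1 + a1 * b0, b2 + a2 * b0])
D = b0

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)   # bounded input trace, ||x||_inf <= 1

# Reference output computed directly from the difference equation
y_ref = lfilter([b0, b1, b2], [1.0, -a1, -a2], x)

# Output of the linear recurrence (a "linear RNN" with state s)
s = np.zeros(2)
y_rnn = np.empty_like(x)
for n, xn in enumerate(x):
    y_rnn[n] = C @ s + D * xn
    s = A @ s + B * xn

print(np.max(np.abs(y_ref - y_rnn)))   # ~1e-16: the recurrence matches the LTI system
```

The controllable canonical form used here is one standard way to turn a scalar difference equation into a first-order recurrence; the paper goes further by quantifying, via metric entropy, how efficiently RNNs can learn such systems from input-output traces.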

### Keywords

Recurrent neural networks, linear dynamical systems, metric entropy, Hardy spaces, universal approximation, system identification

### Errata

In Definition 1.1, $$\mathcal{R}_{\Phi}: \ell_{\infty} \rightarrow \ell_{\infty}$$ can be replaced by the more general $$\mathcal{R}_{\Phi}: \ell_{\infty}(\mathbb{N}_{0}) \rightarrow \mathbb{R}^{\mathbb{N}_0}$$.
The networks constructed in the proofs of Lemma 2.2 and Theorem 2.3 are applicable to input signals $$x$$ with $$\lVert x \rVert_{\ell_\infty} \leq C$$, where $$C \in \mathbb{R}^+$$. Therefore, the sentence above equation (25), "To this end, we first recall that RNNs according to Definition 1.1 accept input signals in $$\ell_\infty(\mathbb{N}_0)$$ and set $$C = \lVert x \rVert_{\ell_\infty}$$.", should be replaced by "To this end, we first recall that RNNs according to Definition 1.1 accept input signals in $$\ell_\infty$$ and choose a $$C \in \mathbb{R}^+$$ such that $$\lVert x \rVert_{\ell_\infty} \leq C$$." Similarly, "$$C = \lVert x \rVert_{\ell_\infty}$$" below equation (37) should be replaced by "$$C \in \mathbb{R}^+$$ is such that $$\lVert x \rVert_{\ell_\infty} \leq C$$".