StrassenNets: Deep learning with a multiplication budget

Authors

Michael Tschannen, Aran Khanna, and Anima Anandkumar

Reference

Proc. of International Conference on Machine Learning (ICML), pp. 4992–5001, July 2018.


Abstract

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in the number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen's matrix multiplication algorithm, learning to multiply 2x2 matrices using only 7 multiplications instead of 8.
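
To make the SPN formulation concrete, the sketch below (not taken from the paper's code release; NumPy is assumed) encodes Strassen's 2x2 algorithm as the ternary matrices Wa, Wb, Wc of a 2-layer SPN and checks that vec(C) = Wc((Wa vec(A)) * (Wb vec(B))) reproduces the ordinary product A B using only 7 multiplications in the elementwise (Hadamard) layer; the paper learns such ternary matrices from data rather than fixing them by hand.

import numpy as np

# Illustrative sketch only: the paper's 2-layer SPN view of matrix multiplication is
#   vec(C) = Wc ((Wa vec(A)) * (Wb vec(B))),
# where Wa, Wb, Wc have entries in {-1, 0, 1} and the number of rows r of Wa/Wb
# is the multiplication budget. The fixed matrices below encode Strassen's
# algorithm for 2x2 matrices (r = 7 instead of 8); vectors use row-major order
# [x11, x12, x21, x22].

Wa = np.array([[ 1, 0, 0, 1],    # a11 + a22
               [ 0, 0, 1, 1],    # a21 + a22
               [ 1, 0, 0, 0],    # a11
               [ 0, 0, 0, 1],    # a22
               [ 1, 1, 0, 0],    # a11 + a12
               [-1, 0, 1, 0],    # a21 - a11
               [ 0, 1, 0,-1]])   # a12 - a22

Wb = np.array([[ 1, 0, 0, 1],    # b11 + b22
               [ 1, 0, 0, 0],    # b11
               [ 0, 1, 0,-1],    # b12 - b22
               [-1, 0, 1, 0],    # b21 - b11
               [ 0, 0, 0, 1],    # b22
               [ 1, 1, 0, 0],    # b11 + b12
               [ 0, 0, 1, 1]])   # b21 + b22

Wc = np.array([[ 1, 0, 0, 1,-1, 0, 1],   # c11 = m1 + m4 - m5 + m7
               [ 0, 0, 1, 0, 1, 0, 0],   # c12 = m3 + m5
               [ 0, 1, 0, 1, 0, 0, 0],   # c21 = m2 + m4
               [ 1,-1, 1, 0, 0, 1, 0]])  # c22 = m1 - m2 + m3 + m6

def spn_matmul(A, B):
    # Hadamard layer: the only 7 scalar multiplications.
    m = (Wa @ A.reshape(4)) * (Wb @ B.reshape(4))
    # Output layer: additions and subtractions only.
    return (Wc @ m).reshape(2, 2)

A = np.random.randn(2, 2)
B = np.random.randn(2, 2)
assert np.allclose(spn_matmul(A, B), A @ B)
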

Keywords

deep neural network, compression, Strassen algorithm


Copyright Notice: © 2018 M. Tschannen, A. Khanna, and A. Anandkumar.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.