IEEE ICASSP 2020 Tutorial
Monday, 04 May 2020


Location information: Virtual Meeting. 
9:00-9:20  Introduction & Motivation 
9:20-10:30  Neural Network Compression + Q&A 
10:30-11:00  Coffee Break 
11:00-12:00  Distributed Learning 
12:00-12:30  Code Demo + Q&A 
Deep neural networks have recently demonstrated their incredible ability to solve complex tasks. Today's models are trained on millions of examples using powerful GPUs and can reliably annotate images, translate text, understand spoken language, or play strategic games such as chess or Go. Furthermore, deep learning will also be an integral part of many future technologies, e.g., autonomous driving, the Internet of Things (IoT) or 5G networks. Especially with the advent of the IoT, the number of intelligent devices has grown rapidly over the last few years. Many of these devices are equipped with sensors that allow them to collect and process data at unprecedented scales. This opens up unique opportunities for deep learning methods.
However, these new applications come with a number of additional constraints and requirements that limit the out-of-the-box use of current models.
1. Embedded devices, IoT gadgets and smartphones have limited memory and storage capacities and restricted energy resources. Deep neural networks such as VGG-16 require over 500 MB for storing the parameters and up to 15 giga-operations for a single forward pass. It is clear that such models in their current (uncompressed) form cannot be used on-device.
2. Training data is often distributed across devices and cannot simply be collected at a central server due to privacy concerns or limited resources (bandwidth). Since training a model locally on only a few data points is usually not promising, new collaborative training schemes are needed to bring the power of deep learning to these distributed applications.
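The storage figure for the first constraint follows directly from the parameter count: VGG-16 has roughly 138 million parameters, each typically stored as a 32-bit float. A quick back-of-the-envelope check (the parameter count is the commonly cited approximate figure, not a value from this tutorial):

```python
# Approximate VGG-16 storage requirement, assuming float32 parameters.
n_params = 138_000_000   # commonly cited approximate VGG-16 parameter count
bytes_per_param = 4      # 32-bit float
size_mb = n_params * bytes_per_param / 1e6
print(f"{size_mb:.0f} MB")  # 552 MB -- consistent with "over 500 MB"
```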
This tutorial will discuss recently proposed techniques to tackle these two problems. We will start with a brief introduction to deep learning, its current use, and the limitations of today's models with respect to computational and memory complexity, energy efficiency, and distributed settings. We will stress the practical need to tackle these problems and discuss recent developments towards this goal, including the emerging standardization activities of the ITU-T Focus Group ML5G and the MPEG AhG CNNMCD.
Then we will move on to the topic of neural network compression. We will start with a brief introduction to the basic concepts from source coding and information theory, including rate-distortion theory, quantization, entropy coding and the minimum description length principle. These concepts are needed to formalize the neural network compression problem. We will then discuss specific techniques for compressing DNNs, distinguishing between the different steps of the compression process, namely pruning & sparsification, quantization and entropy coding. The first two steps are lossy, whereas the last step is lossless. Since size reduction is not the only goal of neural network compression (fast inference and energy efficiency are others), we will also discuss approaches to efficient inference, including recently proposed neural network formats. We will finish this part with a use case, on-device speech recognition, showing how to make use of compression methods in practical applications.
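As a toy illustration of the three compression steps named above, the following sketch applies magnitude pruning, uniform quantization and an entropy estimate to a random weight vector standing in for a trained layer (the 90% pruning rate and 4-bit codebook are arbitrary illustrative choices, not values from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=1000)  # stand-in for a trained layer's weights

# Lossy step 1: magnitude pruning -- zero out the 90% smallest weights.
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Lossy step 2: uniform quantization to a 4-bit (16-level) codebook.
n_levels = 2 ** 4
w_min, w_max = pruned.min(), pruned.max()
step = (w_max - w_min) / (n_levels - 1)
quantized = np.round((pruned - w_min) / step) * step + w_min

# Lossless step 3 (entropy coding) would exploit the skewed distribution
# of quantized values; here we only report the empirical entropy, i.e. a
# lower bound on the achievable bits per weight.
values, counts = np.unique(quantized, return_counts=True)
p = counts / counts.sum()
entropy = -(p * np.log2(p)).sum()
print(f"nonzero fraction after pruning: {(pruned != 0).mean():.2f}")
print(f"empirical entropy: {entropy:.2f} bits/weight (vs. 32 for float32)")
```

Because pruning concentrates almost all probability mass on a single symbol (zero), the entropy drops far below the 4 bits of the raw codebook, which is exactly what an entropy coder exploits.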
After the Q&A and the coffee break we will present recent developments in distributed learning. We present different distributed training scenarios and compare them with respect to their communication characteristics. We then focus on federated learning for the rest of the talk. We enumerate existing challenges in federated learning (communication efficiency, data heterogeneity, privacy, personalization, robustness) and present solutions to these challenges that have been proposed in the literature. We specifically focus on techniques for reducing the communication overhead in distributed learning and discuss clustered FL, a new approach to model-agnostic distributed multi-task optimization. Here we will stress the similarity to concepts introduced in the first part of the tutorial, namely sparsification, quantization and encoding.
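The connection between communication reduction and the compression tools from the first part can be sketched as follows: clients sparsify their updates before sending them to the server, which then averages them. This is a minimal illustration only; the random client updates stand in for local SGD, and top-k sparsification is just one of several reduction schemes covered in the tutorial:

```python
import numpy as np

def top_k_sparsify(update, k):
    """Keep only the k largest-magnitude entries of a client update."""
    sparse = np.zeros_like(update)
    idx = np.argpartition(np.abs(update), -k)[-k:]
    sparse[idx] = update[idx]
    return sparse

rng = np.random.default_rng(0)
dim, n_clients, n_rounds, k = 100, 5, 3, 10

global_model = np.zeros(dim)
for rnd in range(n_rounds):
    updates = []
    for c in range(n_clients):
        # Stand-in for local SGD: each client computes a dense update.
        local_update = rng.normal(size=dim)
        # Communication reduction: upload only the top-k entries (here
        # 10 of 100 values, i.e. a 10x smaller payload, ignoring indices).
        updates.append(top_k_sparsify(local_update, k))
    # Server-side federated averaging of the sparsified updates.
    global_model += np.mean(updates, axis=0)

print(f"model norm after {n_rounds} rounds: {np.linalg.norm(global_model):.3f}")
```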
We will conclude the tutorial with a Q&A session.
For background material on the topic, see our reading list.
1. Introduction
 - Current use of deep learning
 - Practical limitations of current models and new applications
 - Recent developments in research, industry & standardization
2. Neural Network Compression
 - Background: Source Coding, Information Theory
 - Pruning & Sparsification Methods
 - Quantization & Fixed-Point Inference
 - Neural Network Formats
 - Use Case Study: On-Device Speech Recognition
3. Questions
4. Coffee Break
5. Distributed Learning
 - Background: SGD, Learning Theory
 - Basic Concepts of Federated and Distributed Learning
 - Reducing Communication Overhead & Connection to NN Compression
 - Federated Learning & Differential Privacy
 - Clustered Federated Learning
6. Questions
Wojciech Samek, Fraunhofer Heinrich Hertz Institute
Felix Sattler, Fraunhofer Heinrich Hertz Institute