Hello and welcome to the course Data Compression with Deep Probabilistic Models. My name is Robert Bamler. I'm a professor of data science and machine learning at the University of Tübingen in Germany, and one of the focus areas of my research is developing and understanding new machine-learning-based compression methods. So I'm very excited about the opportunity to teach a course on precisely this topic. This video is part of a YouTube playlist that will follow along with the entire course; you'll find a link to the playlist in the video description. If you're taking the course as a student at the University of Tübingen, then I hope that these videos will help you recap what you've learned in the lecture. If you're not a student at the University of Tübingen, you can still watch these videos to get an idea of the lecture contents. I'll try to keep the videos self-contained, but you'll obviously miss out on the interactive parts of the classroom sessions. Alongside these videos, I'll provide additional course materials at the URL that you can see at the bottom of the screen; you can also find a clickable link to this website in the video description. These materials include problem sets with solutions and with code examples, and they are all publicly accessible, so you don't need to sign up for the course to download them. In this first video, which you're currently watching, I'll give a very brief overview of the things that we'll learn in this lecture; the subsequent videos will then dive deeper into the individual topics. Before we get into the course topics, let's briefly ask ourselves why data compression is so important. The amount of data that the world produces and communicates grows very rapidly. Back in 2015, Cisco projected that internet traffic would grow by about 27% each year. This was a projection, and we can now see that internet traffic grew even faster than projected.
Part of the reason was likely the COVID-19 pandemic, when much of the world very suddenly adopted video conferencing as a standard way of conducting business. Here's some data published by Cloudflare, who run one of the biggest content delivery networks: they saw about 40% growth in their traffic over the first four months of the year 2020 alone. Now, if we want to cope with this enormous growth of internet traffic, we have to use effective data compression, and it's becoming increasingly evident that machine learning methods will lead the way to a new generation of compression codecs with significantly better performance. More concretely, I think there are two areas where machine-learning-based compression methods will have a real impact in the next decade or so. The first area is compressing common types of data, and in particular video data. Video makes up about 70% of consumer internet traffic, so even a tiny improvement in compression performance in this area can take a significant amount of load off the global communication networks. And we're at a very exciting inflection point in this area: machine-learning-based methods are currently just at the brink of outperforming classical handcrafted compression codecs, and while progress on classical compression codecs is running into a regime of diminishing returns, I think that machine-learning-based compression methods have a lot more room to improve. So you may not have heard a lot about machine-learning-based compression methods in the past, but I think this is going to change soon. Just as an example of a product that's already out there: Nvidia announced last year that they are now offering a machine-learning-based compression codec for real-time video communication, and they saw that they can save a lot of bandwidth with it.
Now to the second area where I think machine learning can lead the way to a new generation of compression codecs, and that's highly specialized data types. This area may be of particular interest to natural scientists like particle physicists or neurobiologists. When you think about it, some natural scientists generate huge amounts of data that have to be stored somewhere, and storage is often one of the limiting factors, so scientists frequently have to throw away measurements simply because storing them would exceed any reasonable storage capacity. Now, when you compare these kinds of scientific data types with more common data types like video, you'll see that very different kinds of economic incentives are at play. Effective video compression is so important to so many big companies that they have invested a lot of money into decades of research on highly optimized, hand-tuned classical compression methods. And this is something that we cannot afford for specialized scientific data types like particle collider data. So for these large scientific data sets, we often don't have a good compression method at all. I believe that machine-learning-based compression methods can play a really transformative role in this area, because they allow us to develop new specialized compression methods much faster by automating parts of the process. So I hope I've convinced you that machine-learning-based data compression is an exciting and impactful field of research. I'll now give you a brief overview of the topics that you'll learn in this course. If you're watching this video on YouTube, you've probably noticed that there's already a lot of high-quality video material out there, both on data compression and on machine learning.
What sets this course apart is that we set ourselves a very specific goal: using machine learning for data compression. We'll cover the necessary knowledge for this goal along the entire spectrum from data compression to statistical machine learning, and also along the entire spectrum from fundamental theory to practical applications. So on the vertical axis here, you'll learn about both compression mechanisms and deep learning methods that allow you to model very complex data sources, and then, importantly, you'll learn how the two interact with each other. For example, you'll learn why certain compression mechanisms are compatible or incompatible with certain types of machine learning models. And on the horizontal axis from theory to applications, you'll learn about the foundations of communication, about theoretical bounds for the bit rate of lossless and lossy data compression, and about the theory of probabilistic models and of what we call Bayesian inference, and we will derive mathematical proofs of rigorous theorems in all of these areas. But on the other end of the spectrum, we'll also take the knowledge gained from these theoretical considerations and apply it to develop practical compression algorithms. So in these videos, I will present various compression algorithms, and I will provide problem sets with solutions and code examples where you can learn how to implement highly effective compression algorithms in practice. And finally, you'll also learn how you can combine these compression methods with machine learning techniques to come up with new compression codecs, and you'll learn what some of the important open research questions are. To make this more concrete, this slide locates some of the topics that we'll discuss in the upcoming videos on these two spectrums, from data compression to statistical machine learning and from fundamental theory to practical applications.
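To make one of those theoretical bounds a bit more tangible already: the Shannon entropy of a source lower-bounds the expected bit rate of any lossless code. Here's a tiny preview sketch in Python (the four-symbol source and the function name are made up for illustration; they're not taken from the course materials):

```python
# Shannon entropy: a lower bound on the expected bits per symbol
# achievable by any lossless code for a given source distribution.
import math

def entropy(probs):
    """Shannon entropy in bits per symbol of a distribution {symbol: p}."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# A hypothetical four-symbol source, chosen for illustration.
source = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
print(entropy(source))  # 1.75 bits per symbol
```

The course will prove this bound rigorously; this snippet only shows how cheap the quantity is to compute once you have a probabilistic model of the source.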
I'm not going to go into details for each of these topics right now, but you may pause the video here to have a closer look, or you may want to refer back to this slide once you've watched some of the subsequent videos, to get a better understanding of how the things you've just learned fit into the bigger picture. So who is this course for? If you want to take something away from this course, you should have a solid understanding of multivariate calculus. Knowledge of probability theory will certainly be helpful, but I'll provide a brief, pragmatic introduction as part of the course for those of you who are new to this field. As I've mentioned already, part of the course will be about proving important mathematical theorems, so you should have some interest in understanding such proofs. On the applied side, we will implement most of our examples in Python. I'll discuss some code examples in the videos, and there will also be at least one programming exercise on every problem set. As I've mentioned, you'll find a link to the course website with all the problem sets and code examples in the video description. I really can't stress enough how important it is that you follow along with the code examples and that you actually do the programming exercises, because compression algorithms can be very subtle: you'll probably often think that you've understood an algorithm, but once you try to implement it, you'll realize that reality is quite different. The programming exercises shouldn't be too much work, because they will mostly be fill-in-the-blank types of problems. You'll get a Jupyter notebook where most of the algorithm is already implemented, and you only have to fill in some key steps that were left out.
Aside from Python, it will also be helpful if you have some experience with a compiled, more systems-level programming language, because you'll quickly see that scripting languages are generally not a good fit for the more fundamental parts of compression algorithms. Most of our examples will be in Python, though. Coming to the machine learning part: if you want to take full advantage of this course, you should have some experience with training deep learning models in one of the common frameworks like TensorFlow, PyTorch, or JAX. This is something that I can't really teach you in a single session of the course, because it requires hands-on practice. Finally, on the data compression side, I don't expect any prior knowledge from you. We will discuss everything you need to know in class, and I'll summarize it in these videos. So how is this course structured? I've divided the content into 15 sessions. We're currently in session zero, which is more of an overview and doesn't cover any specific content yet. In the next session, we'll dive directly into our first class of compression methods: so-called symbol codes. This discussion will allow us both to prove theoretical bounds for lossless compression and to introduce and understand a first practical lossless compression algorithm, called Huffman coding, which is still widely used today. We'll later see, however, that Huffman coding is generally not a good fit for machine-learning-based compression methods, so you'll also learn about more advanced compression algorithms, so-called stream codes. When we discuss these compression algorithms, you'll see that they all rely on a probabilistic model of the data source. Therefore, in sessions two to five, we'll take a step back from compression algorithms and discuss machine learning techniques for describing complex probabilistic models.
In these sessions, I'll provide a pragmatic introduction to topics like scalable Bayesian inference and deep probabilistic models such as variational autoencoders. If you don't know what these terms mean, don't worry; you'll learn about them in the sessions. In the last third of the course, we'll then put it all together and discuss the interplay between compression algorithms and deep probabilistic machine learning models. We'll also discuss open, unresolved research questions in this area. To give you a better idea of current research in the field of machine-learning-based compression, I'll then invite two pioneers in this field to give guest lectures about their latest research. We'll have Dr. Christopher Schroers from Disney Research Zurich representing industrial research and Professor Stephan Mandt from the University of California, Irvine representing academic research. If you're following this course on YouTube, I should warn you that the industry talk will not be recorded, for legal reasons. Finally, we'll conclude the course with presentations by you, the students in this course. You'll all work on individual research questions in small groups throughout the term, and you'll present your results in the last week of the term. These presentations will also not be recorded. Before I wrap up the summary, let me recommend some additional learning resources. I'm not aware of any textbooks that cover machine-learning-based data compression in particular, but I can recommend the two books listed on the slide for the separate topics of data compression and probabilistic machine learning, respectively. The first book is actually lecture notes from a lecture by David MacKay, and that lecture is also available on YouTube. I can also recommend the videos on compression algorithms by the YouTube user who calls themselves mathematicalmonk.
Finally, if you're a student at the University of Tübingen, I highly recommend the lecture Probabilistic Machine Learning by Professor Hennig, which is also available on YouTube. You can find clickable links to all of the URLs on the slide in the video description. Alright, I hope I've made you curious to learn about machine-learning-based data compression, so please click on the link for the next video in the playlist to dive right in.