So that was fun, wasn't it? You all know from previous lectures that a linear system cannot solve the XOR problem, and you just made it do it. That is of course only possible because of a violation of the abstraction, relative to what is really going on underneath. Now you might say: does that matter? Isn't this just a technicality of the abstraction? Yes, it does matter. Many deep learning failure modes are failure modes of the abstraction. For example, a small gradient may round to exactly zero, and your network will never learn. Also, in floating point, e to the power of 1000 overflows to infinity. So if your neural network contains, say, an exponential, then all of a sudden you have infinity, and all of your gradients break; in fact, they become undefined. (There is a small sketch of this below.)

A lot of the functions we will use during this course involve the softmax. The softmax of z is

\[ \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \]

Now, what happens if z_i is large? Mathematically, nothing should go wrong here: the outputs stay between zero and one, and there is no reason why anything should break. But in practice the gradient will be infinity. Sorry, it will be not-a-number, NaN. The reason it will be NaN is that we end up computing infinity divided by infinity, plus a couple of sums, and that is entirely undefined. It is not undefined mathematically; on paper this is a perfectly well-defined gradient. It is just that in the abstraction we use, where numbers are represented with a fixed number of digits, it looks like infinity divided by infinity. (See the softmax sketch below for both the failure and the standard fix.)

I have seen students spend weeks debugging this very problem: "I'm getting NaNs. I don't know where they are coming from. NaNs are impossible, because the gradient, when we write it out by hand, is always well defined." Well, it is not well defined in the abstractions that we have. When you are debugging artificial neural networks, it is often useful to be mindful that the abstraction we write on paper is not what is actually happening on the machine.

Now, let us get back to linear neural networks and do something completely different. What we will do now is deep linear artificial neural networks, just to see how well those work. We take a 10-layer artificial neural network. Unlike the previous exercise, we will not make it non-linear, not even by accident; we will stay in the domain where the abstraction is good. And we know that it can implement the same functions as the one-layer neural network that we used before. Okay, now it is your job to show that it can implement the same function; there is a small sketch of the idea at the end. Please come back in 10 minutes, max.
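Here is a minimal sketch of the two abstraction failures mentioned above, underflow to exactly zero and overflow to infinity. NumPy is an assumption here; the lecture names no library, and any floating-point environment shows the same behavior.

```python
# Sketch (NumPy assumed): floating-point underflow, overflow, and NaN.
import numpy as np

np.seterr(all="ignore")  # silence warnings so we can inspect the raw results

tiny = np.float64(1e-200)
print(tiny * tiny)               # 0.0 -- a nonzero product underflows to exactly zero
print(np.exp(np.float64(1000)))  # inf -- e^1000 overflows in float64
print(np.exp(1000) / np.exp(1000))  # nan -- infinity divided by infinity is undefined
```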
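And here is a sketch of the softmax failure itself, together with the standard fix: subtract max(z) before exponentiating. Since softmax(z) = softmax(z - c) for any constant c, this changes nothing mathematically, but it keeps every exponent at most zero, so nothing overflows. Again, NumPy is an assumption, not something the lecture prescribes.

```python
# Sketch (NumPy assumed): naive softmax produces NaN for large inputs;
# the max-subtraction trick keeps it well defined.
import numpy as np

def softmax_naive(z):
    e = np.exp(z)
    return e / e.sum()          # inf / inf -> nan for large z

def softmax_stable(z):
    e = np.exp(z - z.max())     # largest exponent is exactly 0
    return e / e.sum()

z = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(z))   # [nan nan nan]
print(softmax_stable(z))  # approximately [0.090, 0.245, 0.665]
```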
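For the exercise, a minimal sketch of the claim you are asked to verify (NumPy assumed, biases omitted for brevity): a stack of linear layers computes W10 ... W2 W1 x, and that product is itself just one matrix, so the deep linear network implements exactly the same functions as a single linear layer.

```python
# Sketch (NumPy assumed): a 10-layer linear network collapses to one matrix.
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.standard_normal((5, 5)) for _ in range(10)]  # ten 5x5 weight matrices
x = rng.standard_normal(5)

# Forward pass through the 10-layer linear network.
h = x
for W in layers:
    h = W @ h

# Collapse all ten layers into one equivalent weight matrix.
W_single = layers[0]
for W in layers[1:]:
    W_single = W @ W_single

print(np.allclose(h, W_single @ x))  # True
```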