So what does the ReLU network do with its inputs? Unless one of the ReLU units switches the sign of its input, the network is linear. To the left of 0, ReLU is linear; to the right of 0, ReLU is linear. So as long as no unit switches, the whole composition is actually linear. What does that mean? It means that within a region where none of the ReLUs switch, the partial derivative of the output with respect to the input has to be constant, because everything in between is just linear. So what is the shape of these constant regions? It turns out they are polytopes: the set of inputs where a unit keeps its sign is the preimage of a half-space under an affine map, which is again a half-space, and an intersection of half-spaces is a polytope. So let us see that. It's really exciting. Instead of having this weird function that we cannot understand, what we really have is a piecewise-linear function, where each of these linear regions is actually a polytope, which is something that we can meaningfully understand. So why don't you visualize it and have a look at what the constant-gradient regions look like in this case?
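To make the visualization concrete, here is a minimal NumPy sketch, assuming a small random two-hidden-layer ReLU network on 2D inputs; the layer sizes, grid range, and the helper name `activation_pattern` are illustrative choices, not anything fixed by the discussion above. It colors each grid point by which ReLUs are on or off: points sharing the same on/off pattern lie in the same linear region, so the colored patches are exactly the constant-gradient polytopes.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Small random ReLU network: 2 -> 8 -> 8 (the linear output layer
# does not affect the region boundaries, so it is omitted here).
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def activation_pattern(x):
    """Return the on/off pattern of all ReLU units for input x."""
    h1 = W1 @ x + b1
    a1 = np.maximum(h1, 0)
    h2 = W2 @ a1 + b2
    return np.concatenate([h1 > 0, h2 > 0])

# Evaluate the pattern on a 2D grid; points with the same pattern
# belong to the same linear region (a polytope), so give them one color.
xs = np.linspace(-3, 3, 400)
ys = np.linspace(-3, 3, 400)
region = np.zeros((len(ys), len(xs)))
seen = {}
for i, y in enumerate(ys):
    for j, x in enumerate(xs):
        key = activation_pattern(np.array([x, y])).tobytes()
        region[i, j] = seen.setdefault(key, len(seen))

print(f"distinct linear regions found on this grid: {len(seen)}")
plt.imshow(region, extent=[-3, 3, -3, 3], origin="lower", cmap="tab20")
plt.title("Constant-gradient regions of a random ReLU network")
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()
```

Running this shows the input plane tiled by convex polygonal patches; within each patch the network is one affine function, so its gradient is constant there.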