In the introduction, I noticed a pattern that I said we would explore this week, and we will actually read a whole paper on it later. But for now, I just want to point it out and remind you that deep learning often uses more parameters than observations. It should massively overfit, but deep learning experiments actually show that you can get zero training error and small test error, and that you can even get zero training error with randomized labels. For now, I want you just to explore that and see what happens. We'll think more later about why that's the case.
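To see the "more parameters than observations" point in miniature, here is a hypothetical toy sketch (not the deep-learning experiments themselves): when a plain linear model has more parameters than data points, it can already drive training error to zero on completely randomized labels. The dimensions, random seed, and minimum-norm fit below are illustrative choices, not from the source.

```python
import numpy as np

# Toy setup: n = 20 observations, d = 100 parameters (d > n),
# so the linear system X w = y is underdetermined.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.normal(size=(n, d))
# Randomized labels: no relationship to X at all.
y_random = rng.integers(0, 2, size=n).astype(float)

# Minimum-norm interpolating solution via the pseudoinverse.
w = np.linalg.pinv(X) @ y_random

# Training error is (numerically) zero even though the labels are noise.
train_error = np.max(np.abs(X @ w - y_random))
print(f"max training error on random labels: {train_error:.2e}")
```

This only shows that interpolation of arbitrary labels is easy in the overparameterized regime; the surprising part for deep networks, which we will return to, is that the same models that can fit noise still achieve small test error on real labels.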