The Simple and Infinite Joy of Mathematical Statistics. So when I saw this on Amazon, I was very curious and thought, hey, let's just give this a shot. I wonder if this brings anything new to the table. I was pleasantly surprised. I can summarize the value of this book in one word: context. This book does so much to help the reader understand what is going on in the math stats world. In the introduction, it talks about how this book has been developed over many years for a diverse student body. It is very polished and does such a great job of talking about probability and math stats and why it's useful. This book has more color than your average math stats book. The little extra boxes and highlighting of words draw your attention. But don't let the colorfulness fool you. Just because it's colorful and has a nice font and big boxes doesn't mean that it's sacrificing rigor. This is a very thick book, and it takes a lot of pages to explain some of the difficult concepts. If this is not the math stats book that you're using in your graduate program, I highly recommend it as a companion. I've talked about in previous videos how Casella and Berger is the standard. There's even a release date for a new edition of Casella and Berger. I credit a lot of the reason I was able to get through Casella and Berger to my professor. If I had had to do it on my own, I would have really appreciated having a book like this to help me get through it. Here's an example of a context box that just gives you the lay of the land a little bit. The author talks about how this is a PDF-centric book versus a CDF-centric book. This is the kind of thing that you just miss in a short video or blog post. You might get the impression that there is only one way to think about things. Even in my own videos, I have to condense the content to deliver something bite-sized. But in a book like this, there is space to get into those crevices.
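To make that PDF-centric versus CDF-centric distinction concrete, here's a small sketch of my own (not from the book), using a toy exponential distribution. The same probability can be found either by integrating the density (the PDF view) or by taking a difference of the distribution function (the CDF view):

```python
import math

# Toy example: Exponential distribution with rate lam (my own choice of numbers).
lam = 0.5

def pdf(x):
    # PDF-centric view: density f(x) = lam * exp(-lam * x)
    return lam * math.exp(-lam * x)

def cdf(x):
    # CDF-centric view: F(x) = P(X <= x) = 1 - exp(-lam * x)
    return 1.0 - math.exp(-lam * x)

# P(1 <= X <= 2) the PDF-centric way: numerically integrate the density.
n = 10_000
width = (2.0 - 1.0) / n
p_pdf = sum(pdf(1.0 + (i + 0.5) * width) for i in range(n)) * width

# P(1 <= X <= 2) the CDF-centric way: a difference of the CDF, no integral needed.
p_cdf = cdf(2.0) - cdf(1.0)

print(round(p_pdf, 5), round(p_cdf, 5))  # the two approaches agree
```

Same answer either way; the two "centrisms" are just different starting points for the same object.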
Just having the knowledge that there is a PDF-centric and a CDF-centric approach to math stats is very useful. You start thinking about things from a broader perspective. Maybe you start asking yourself questions. What would a CDF-centric approach even look like? And why is that something that makes sense in advanced math stats versus an introductory textbook? This box on page 32 is an example of how this book helps the reader. It defines what expectation is and how mu is commonly used as the symbol for expectation. But when you're talking about the expectation of different random variables, there's a subscript. If you look at the expectation of X, you'll see mu sub X. If it's the expectation of Y, you'll see mu sub Y. This idea is in every math stats book. It's just nice to have it called out intentionally. Here's another thing I really like in this book, and it's something I haven't seen in other math stats books. The author puts the indicator function front and center. I feel like it's a good idea to get readers comfortable with the indicator function early on, because indicator functions come up often. In other books, I had to see a capital I dropped without much context on what that capital I means. So this is just one of those things that you can be unfamiliar with. Sometimes an indicator looks different in different books. This book points that out and shows some alternatives. I like that the author does a great job talking about joint PDFs and some of the misconceptions students get wrapped up in when they're talking about a joint PDF. I think the visuals that are given are great. I wouldn't say that they're anything new. You might see these in a good probability book. I like that there are a lot of pages dedicated to figures, pictures, and formulas. It makes it so you can absorb the information through multiple media. So far, everything I've shown has just been preliminary chapter content, just to set the stage for chapter one.
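Since the indicator function gets such early billing in the book, here's a minimal sketch of the idea (my own illustration, not taken from the book). Other texts may write it as 1_A(x) or I(x in A), but it's the same object:

```python
# Indicator function: I_A(x) = 1 if x is in the set A, else 0.
# Here A is the interval [lo, hi]; [0, 1] by default.
def indicator(x, lo=0.0, hi=1.0):
    return 1.0 if lo <= x <= hi else 0.0

# A common use: folding a density's support into its formula.
# The Uniform(0, 1) density is f(x) = 1 * I_[0,1](x) -- zero outside [0, 1].
def uniform_pdf(x):
    return 1.0 * indicator(x)

print(uniform_pdf(0.5), uniform_pdf(2.0))  # 1.0 0.0
```

That one trick, writing the support as an indicator, is why the capital I shows up so often once you get to densities.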
And I'll say it again: this book is all about context. In chapter one, the heading is, "Wait, where are we going?" It's such a good title for this chapter. It gets your mind primed for what's going to be covered. You start asking questions like, what kind of tools are going to be helpful to solve problems? What is the purpose of math stats anyway? This is something very different from other books. Now, this book isn't perfect. Some may want to have some code in their math stats books, and in this book, there is zero code. Even I sometimes think code is helpful; there's comfort in code. When you see a program and you can see the output, it makes you feel a little better. This book just doesn't have it. There is no code, just mathematical ideas. And that's okay. There are other, more advanced books that cover statistics and also have code. Something I liked in chapter one was an early discussion of order statistics. This came up a lot later in the Casella and Berger book. I also remember another math stats book I really liked called In All Likelihood, and I seem to remember order statistics being pushed a lot further back in that book as well. But I think it's super interesting to put order statistics in the first chapter. Lots of times I feel like there's so much focus on means and medians. And we've all been hearing about means and medians since elementary school. But the idea that you can also model minimums and maximums, that's something different. It's just a different flavor. I appreciate that this was in there early on. Sure, eventually you're going to hit order statistics in any math stats book. But why wait so long to get to something interesting? The author also talks about moment generating functions. This is something everyone who's had a math stats class is aware of. Moment generating functions are a way to represent probability distributions. And this isn't anything new.
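For anyone rusty on what an MGF actually buys you, here's a quick sketch of my own (not the book's): differentiating the MGF at t = 0 spits out the moments of the distribution. Below I use the known MGF of a normal and recover its first two moments with numerical derivatives:

```python
import math

# Moment generating function of X ~ Normal(mu, sigma^2):
# M(t) = E[exp(t*X)] = exp(mu*t + sigma^2 * t^2 / 2).
mu, sigma = 3.0, 2.0

def mgf(t):
    return math.exp(mu * t + sigma**2 * t**2 / 2)

# Derivatives of M at t = 0 recover the moments:
# M'(0) = E[X], M''(0) = E[X^2]. Central differences approximate them.
h = 1e-5
m1 = (mgf(h) - mgf(-h)) / (2 * h)                # ~ E[X]   = mu            = 3
m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2    # ~ E[X^2] = sigma^2 + mu^2 = 13

print(round(m1, 3), round(m2, 3))
```

So the MGF packages every moment of the distribution into a single function, which is exactly why it's handy to have in your toolkit before you start collecting distributions.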
Maybe the only thing that's new is that it's introduced pretty early on, which I like, because it serves as a tool that you can use as you learn more and start playing with more distributions. I think it's really useful to start thinking about as you build your repertoire of probability distributions. At the end of each chapter, there are some exercises. The handful I picked weren't too difficult. I think part of the reason they weren't so difficult is that so much effort was put into talking about the notation and the meaning of the mathematical symbols. You're not greeted with brand-new symbols when you get to the exercises; the author spent a lot of time talking about the notation and doing a lot of hand-holding, so by the time you get to the exercises, it all makes sense. Chapter two goes into estimators, and I like the thought that was put into the title. In some books, you might be greeted with the bare title "Estimators." And so you might be thinking, I'm not even sure what an estimator is or why we have a whole chapter dedicated to it. But sure, let's talk about estimators. In this book, though, just having this colon and this little bit of context after the title, "defining good, better, and best," that's a fantastic title to get your mind going. So already you're thinking, oh, there's this thing called an estimator, and some of them are better than others. What does it mean to have a better or worse estimator? Are there ways to make sure I always get the best estimator? Those little things make a big difference in framing the discussion for the upcoming chapter. I'll point out that this note on page 128 is another note I really like. It says: note that the X in X bar is capitalized, and at this point we are considering it as a random variable. We are going, in the future, to sample values X1, X2, ..., Xn from this normal distribution, and then we are going to average them.
Once we do, we will have observations, lowercase x1, x2, ..., xn, which have sample mean lowercase x bar. While mu is an unknown constant, mu hat is a random variable. It does not make sense to say that mu equals capital X bar, but it does make sense to say mu hat equals capital X bar. The word estimator is used to refer to the random variable X bar, while the word estimate is used to refer to the actual observation, lowercase x bar. So what the author is doing is making a distinction between all of these symbols. These symbols are all very different things, even though in English they might seem similar, where one is just a lowercase version and one is an uppercase version. This is not the case in math stats. That is information that is sometimes missed. Lots of times you would rely on a professor to point this out, so I'm grateful that the author wrote a full paragraph just highlighting this idea. All right, so that was just a taste of this book. I'm going to do a little skipping and just talk about a couple more ideas here. Chapter six is on general hypothesis testing. At the very beginning is a section on language and notation. I still remember being in my second semester of math stats and opening up the book and seeing a lot of fat sigmas in the hypothesis testing section. I was still not too comfortable with the notation, especially set notation. Even though it was covered in the first semester, it just wasn't second nature yet. I mean, look at this formula on page 403. It starts off with a mix of mathematical symbols and English language: "type one error," "reject H-naught when the parameter is theta," "H-naught: theta," and "max sub theta in Theta-naught." People look at this and they think, well, this looks like Greek to me. And it is literally Greek. I just think that sometimes it's easy to forget, when you're a professor and this is your world, that students are looking at a page full of Greek.
So it's nice to see that she takes some time to talk about what the backslash means, and that the backslash here is set-minus notation. With some dedicated sentences focusing on set notation, it can help you get mentally ready to start following along with general hypothesis testing. Now, on the topic of hypothesis testing, I'm going to get sidetracked here for just a second. I remember reading a forum post, probably 10 years ago, by a professor who was critical of how statistics was being taught at the graduate level. He was saying something to the effect of, why in the world are we spending so much of our time teaching graduate students ideas about best estimators and significance testing? We're always spending a large chunk of time teaching things like most powerful tests of size alpha when we could be teaching them advanced learning algorithms. And as a student, I remember thinking that makes a lot of sense. I was on the same page. Why aren't we learning some of the machine learning models everyone else is learning in computer science? But my perspective has changed after being in industry for some time now. This is a skill set that is missing. At this point in time, there are so many learning algorithms: pattern recognition algorithms, machine learning algorithms, L-layer neural networks. All the algorithms that were hot 10 years ago are libraries now and may even be old at this point. Maybe there are better alternatives. You just import the library, pop your data in, and after some preparation get something out. This part of the work is like pushing the pedal on a car. You just push the pedal and it goes. There is a lot of technical detail in how everything works under the hood, but that's been worked out pretty well. What's important is critical thinking skills. And this is where the hypothesis test comes into play. Hypotheses drive decisions.
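And those decisions ultimately rest on controlling error rates, which is what all that Greek on page 403 is about. As a quick sketch of my own (not from the book) of what "a test of size alpha" means in practice, here's a simulation of the type one error rate of a simple z-test when the null hypothesis is true:

```python
import math
import random

random.seed(1)

# Simulate the type one error rate of a two-sided z-test of H0: mu = 0,
# generating data with H0 actually true. With alpha = 0.05, the test
# should reject in roughly 5% of repetitions.
alpha, z_crit = 0.05, 1.96          # 1.96 is the two-sided 5% critical value
n, reps = 30, 20_000
rejections = 0
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]   # H0 is true here
    z = (sum(sample) / n) * math.sqrt(n)                  # known sigma = 1
    if abs(z) > z_crit:
        rejections += 1

print(round(rejections / reps, 3))  # near alpha = 0.05
```

The "max sub theta in Theta-naught" in the formula is the worst case of exactly this rejection rate over the whole null set; the simulation just shows it at one point of the null.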
Why wouldn't any scientist want the tools necessary to think about the world in terms of hypotheses? Hypotheses are what drive discovery and decision making. The hypotheses could be tests of whether a medication is really having an impact on someone's overall health, or whether a diet, an exercise plan, or a certain policy is having an impact. It's fundamental to progress, and hypotheses only become more and more important. So why not master the craft of the hypothesis test? Whether you're aware of it or not, central to making decisions is a hypothesis. In my mind, mastering the hypothesis test is mastering decision making, real-world decision making. So that's my soapbox. I don't want to give too much more away about this book, so I'll end it here. To recap, this book is for two groups. One group is the self-learner, someone who may be struggling to understand math stats and doesn't have a professor to help dig into the content or expound on it. The second group is the student who just needs a companion to a more rigorous statistics book. I hope you enjoy the book. Thanks for watching.