Being a data scientist is much more than simply using plug-and-play machine learning libraries. You have to really understand what the algorithm is doing, first and foremost, and the way you do that is by gaining an understanding of the underlying maths. To be a high-caliber data scientist, you need to understand the fundamental maths. That's just a brutal truth. However, the maths you need is not PhD or master's level. It's typically the maths you learn in your final years of high school or in the first few years of certain undergrad degrees. So in this video, I'm going to explain the maths you actually need as a data scientist, and some useful resources for how to learn it. Let's get into it.

Data science itself is quite a big field, and it's still not clearly defined exactly what a data scientist is. So if you're a data scientist in one company, you may be doing completely different work to a data scientist in another company. What this means is that the maths required for different roles and different companies will ultimately vary. However, there are a few key concepts and core ideas in mathematics that I think every data scientist should know, and they will likely cover anything you'll get in job descriptions or any interviews you may have.

Now, it's important to mention that in this video I'll cover things that are mainly used for junior or entry-level positions. It's not for people who want to be a machine learning researcher at OpenAI, for example. That's a completely different kettle of fish, and personally I have no experience with that, so I can't really speak on it.

So in general, there are three topics you want to cover: statistics and probability, linear algebra, and calculus. Those are the three main mathematical fields that are used frequently by data scientists, and the ones I use pretty much day to day. These fields individually are ginormous, and people dedicate their whole careers, and basically their lives, to studying them.
The goal of this video is not to learn everything in these areas, but just the things you need to know and the core principles that you will use frequently as a data scientist. Anyway, let's dive into these key concepts, and I'll let you know what you need to know about each. Each section will be broken up into what to learn and how to learn it, so you can easily tailor your learning to your data science roadmap, if you have one.

In my opinion, probability and statistics is probably the most important of the three, as it's the one you use most frequently and it's also the most applicable to the field. Like I just said, probability and statistics is a very large field with a lot of active research going on, but we can probably break down what you need to know in this area into five or six key principles.

The first one is descriptive statistics, and this is all about summarizing and getting some key information from the data. So the things you should really learn are mean, median, mode, variance, standard deviation, quantiles: just anything that summarizes data. It's also very useful to have some knowledge of how you visualize it, so things like box-and-whisker plots, bar charts, line graphs, pie charts. The list goes on, but you get the picture: just things that you would frequently use to show stakeholders or other people what your data really means under the hood.

The second one is to have good knowledge of some of the most common probability distributions, so things like normal, Poisson, gamma and binomial. These distributions come up a lot in data science and are also very important for any EDA project you do, and for any modeling, because in modeling you've got to know what distribution the data has so you can fit the correct algorithm to it.

The third area is probability theory. Machine learning, even though it's called machine learning, a lot of it comes from statistical learning theory. And so if you know the probability theory very well, you will also understand machine learning
very well. The areas you should learn for this are basically things like maximum likelihood estimation and Bayesian statistics. This is just all-encompassing statistical knowledge that you should have as a data scientist.

The next one is hypothesis testing and confidence intervals. These come in handy for A/B tests, and A/B tests are used pretty much everywhere in data science, analysis and marketing, so it's a very useful skill to know. Hypothesis testing is a way of testing whether your result is statistically significant. Like I said, it's used everywhere, so it's something you should really develop and understand. The things you should learn are basically the z-test, the t-test, the chi-square test, and what confidence intervals are.

The final one is the idea of modeling and inference. Like I said, machine learning is pretty much based on statistics, and so with that you get things such as linear regression and generalized linear models. These are two quite famous modeling techniques that pretty much lay the bedrock for most machine learning algorithms, so learning those will give you a really good foundation for anything else you do from there.

There are of course many other areas to explore within those sub-domains, and if I listed out everything you need to know in statistics, it would be pretty exhaustive and, to be honest, it would be a very long video.
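As a rough illustration of the hypothesis testing and confidence interval ideas mentioned above, here is a minimal Python sketch of a two-proportion z-test for an A/B test, using only the standard library. The function names and the conversion numbers are made up for illustration and are not from the video; it also assumes samples large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Normal-approximation confidence interval for p_b - p_a."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Unpooled standard error, since we no longer assume the rates are equal.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical A/B test: 200 of 2000 users convert on A, 250 of 2000 on B.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
low, high = diff_confidence_interval(200, 2000, 250, 2000)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% CI for the uplift: ({low:.4f}, {high:.4f})")
```

With these made-up numbers the p-value comes out below 0.05 and the interval sits entirely above zero, which is the kind of result you would usually describe as statistically significant at the 5% level.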
However, Wikipedia has got a great article that gives you an outline of the whole field, so I really recommend you check that out if you want a really big overview of what statistics and probability looks like across the whole spectrum of the domain.

Probability and statistics is such a big field, so there are so many resources out there for you to learn from. However, there are a couple that I really recommend, and they're basically a one-stop shop for everything you'd probably need to know in the field for data science. The first one is the textbook Practical Statistics for Data Scientists. I can't recommend this book enough, because it's basically what we're after: a statistics book designed exactly for data scientists. So it will cover all the things we need to know on a day-to-day basis and all the things I just listed in this video. If you'd rather learn from a video format, then I really recommend freeCodeCamp's video on statistics. It's about eight to ten hours, but it will give you a rundown of everything I just mentioned in this video that you need to learn about statistics and probability.

Calculus is at the heart of how machine learning algorithms actually learn. The optimization process of machine learning is done through calculus, so it's really essential you understand the fundamentals and what calculus is really trying to do under the hood. There are two main areas of calculus, integration and differentiation, but let's break them down a bit further.

Differentiation is all about breaking something down into small pieces and seeing how it reacts to little changes, or rates of change. Now, I appreciate that sentence may seem a bit abstract to you right now, but I promise that once you start studying and basically understand what differentiation is, it will make a lot more sense to you. The things you should learn are, obviously, what differentiation is, what a derivative is, and what they mean. Learn the derivatives of common functions, things like sine, cosine, tanh, the
exponential, and basically why these derivatives are the way they are. What are turning points, why are they important, and how do maxima and minima relate to them? Learn differentiation operations such as the product rule and chain rule, which are used a lot, particularly for gradient descent and backpropagation, which are used in pretty much every machine learning algorithm. Understand partial derivatives and their role in multivariate calculus. Again, this is the lifeblood of gradient descent, so it's something you should really be familiar with and really understand. Be able to understand the difference between a convex and a non-convex function. Again, this is really important for optimization problems and for being able to understand whether the solution you find is indeed the best one. And finally, make sure you learn about Hessian and Jacobian matrices. These are used throughout deep learning, so I really recommend you learn them.

Now, let's move on to integration. Integration is arguably used less in data science than differentiation, but it's still equally as important. So, like with differentiation, you should begin learning integration by basically understanding what it is, what its purpose is, and how it works. Learn the integrals of common functions; again, sine, cosine and the exponential are the common ones. You should know different integration operations such as integration by substitution and integration by parts. How do you do integration calculations for volumes or areas? And finally,
there's a concept that's a bit more advanced and not really a requirement: understanding what Fourier series are and their applications.

Now, the way I recommend you learn this required calculus is through a textbook, Mathematics for Machine Learning. Arguably this book might be a bit overkill, and it has a lot of more in-depth and quite advanced topics in there, but if you really learn everything in that calculus section, then your knowledge of calculus will be excellent. Likewise, freeCodeCamp has a Calculus 101 course, so feel free to watch that if you prefer a video format for your learning.

And so the last topic is linear algebra, which is all about how matrices, vectors and scalars work in linear spaces. The things you should learn in linear algebra are, first, vectors. So what are vectors? How do you calculate the magnitude, dot product, angle, orientation, things like that? Just basically how vectors work and what they represent. Then learn about matrices. Matrices are basically just big array structures, if you come from a computer science context, and they hold data. So a dataset with n features and m rows is basically an m by n matrix, so they're really common and used a lot every day, even if you don't think about it.
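To make the vector ideas above a bit more concrete, here is a small Python sketch of the dot product, magnitude and angle between two vectors, plus a toy dataset stored as a matrix. It uses plain lists rather than any particular library, and the helper names and numbers are assumptions chosen for illustration, not something from the video.

```python
import math


def dot(u, v):
    """Dot product: multiply matching components and add them up."""
    return sum(a * b for a, b in zip(u, v))


def magnitude(v):
    """Length (Euclidean norm) of a vector."""
    return math.sqrt(dot(v, v))


def angle_between(u, v):
    """Angle between two non-zero vectors, in radians."""
    cos_theta = dot(u, v) / (magnitude(u) * magnitude(v))
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


# Illustrative vectors (made up for this example).
u = [3.0, 4.0]
v = [4.0, 3.0]
u_dot_v = dot(u, v)
u_length = magnitude(u)
theta = angle_between(u, v)

# A toy dataset as a matrix: each inner list is a row (one observation),
# and each position within a row is a feature.
data = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
n_rows, n_features = len(data), len(data[0])

print(u_dot_v, u_length, math.degrees(theta))
print(n_rows, n_features)
```

In practice you would normally reach for an array library for this kind of work, but writing the operations out by hand once is a reasonable way to see what they actually compute.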
I'd then move on to matrix operations and transformations, so things such as the trace, transpose, determinant and inverse: just how matrices work, and some of the baseline transformations that you come across frequently as a data scientist. For example, finding a matrix's eigenvalues and eigenvectors is the bedrock understanding you need to really grasp what's happening in principal components analysis, which is a very common algorithm used in industry. And finally, the last topic you should cover in linear algebra is systems of linear equations. This is used primarily in optimization problems where you have multiple objectives and constraints, and your goal is to find a solution to those multiple equations. You should learn techniques such as row reduction, elimination and Cramer's rule.

To learn this linear algebra, I again recommend the Mathematics for Machine Learning textbook. It's amazing and, like I said, probably goes into more detail than you need, but it gives you a really good and solid understanding of the mathematical concepts in a really detailed way. There is also an excellent Coursera course called linear algebra for data science and machine learning. It basically does what it says on the tin, and I really recommend you check that out if you like a course structure for your learning. And finally, like with all the other sections, freeCodeCamp have got an excellent video on linear algebra, if you prefer a video format for learning.

I know maths can be scary and some people may not be naturally good at it, but like I said, the maths is not at an extraordinarily high level. Yes, it can be kind of the entry level for most university courses, but I truly believe most people can learn it, given enough time and enough effort. The three key areas you need to know are statistics, linear algebra and calculus. Having a really good understanding of these three fields will set you up for success later on in your career, so I really recommend you invest time right now to learn them thoroughly and really
understand what's happening under the hood of all of these areas.

If you enjoyed this video and want to learn how to become a great data scientist, then I highly recommend you check out my newsletter, Dishing the Data. Every week I give tips and advice on how to become a better data scientist, and learnings I'm having in my own career that may benefit you. I will link it in the description below in case you want to check it out. Make sure you like, comment and subscribe, and I'll see you in the next one.