In this video, I want to go over six key mistakes that aspiring data scientists often make when trying to break into the field. Avoiding even one of these pitfalls is a real success, and it means you're doing a lot better than I did at the start. Let's get into it.
Nowadays, most people think you don't need to know all the underlying maths behind data science and machine learning, because modern packages abstract it away for us. You'll practically never have to implement backpropagation from scratch or build your own decision tree. It's easy to take this for granted and not learn any of the background theory behind these algorithms. However, this can be quite dangerous. Sure, you can build a neural network with a few lines of PyTorch, but what happens when the predictions are wrong and you have to debug it? Or what happens if someone asks you for the prediction intervals around the output of a linear regression model? These questions and scenarios come up a lot more than you think, and the way you answer them is by having a solid grasp of the underpinning mathematics. I can sympathise that maths may not be everyone's strong suit and that it can be quite scary, but the maths needed for entry-level data science or machine learning roles is not PhD or master's level. It's more the maths you get taught in your final years of school or even the first year of university. In my previous video, I detailed the maths you need and some recommended courses, so I highly recommend you check out that video if you want to learn the maths you need for data science.
I often get asked what the best course to take on a topic is. In reality, for a complete beginner, the best course is the one that you choose and complete. Many intro courses on Python, data science or statistics will teach you the exact same things. Sure, there are different teachers and different teaching styles, but just find one you like and stick to it.
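To make the earlier prediction-interval question concrete, here is a minimal sketch of how one could be computed for a simple linear regression using only the Python standard library. The function names and the data are made up for illustration. It also assumes a normal quantile in place of the exact Student-t quantile with n - 2 degrees of freedom, so the interval comes out slightly too narrow on small samples.

```python
import math
from statistics import NormalDist, mean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def prediction_interval(xs, ys, x_new, confidence=0.95):
    """Approximate prediction interval for one new observation at x_new.

    Assumption: a normal quantile stands in for the Student-t quantile,
    which is only a good approximation when the sample is reasonably large.
    """
    n = len(xs)
    intercept, slope = fit_simple_linear_regression(xs, ys)
    x_bar = mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)

    # Residual standard error with n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

    # Standard error of a new observation: noise plus uncertainty in the fit.
    se_pred = s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    y_hat = intercept + slope * x_new
    return y_hat - z * se_pred, y_hat, y_hat + z * se_pred


# Made-up illustrative data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

low, y_hat, high = prediction_interval(xs, ys, x_new=5.0)
print(f"prediction {y_hat:.2f}, approx. 95% interval [{low:.2f}, {high:.2f}]")
```

The term under the square root is where the maths matters: the interval widens as `x_new` moves away from the mean of the training inputs, which is hard to explain if you have only ever called a library function.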
In reality, you will learn the exact same things as anyone else doing an intro-level course. As a complete beginner, it's much more important to get started and not overthink it. Have a bias towards action and get going at the beginning; you can always adjust your direction later on if you find you're not heading where you want to go. As the famous saying goes, the best time to plant a tree was 20 years ago, and the second-best time is today.
Along the same theme of courses, the other common pitfall I've seen data scientists fall into, and one I hit myself when I was trying to learn data science, is getting stuck in tutorial hell. Now, if you're not sure what tutorial hell is, it's basically when you're just doing tutorial after tutorial and course after course, but you're not really branching out and trying to implement anything on your own. To learn concepts, you need to practise and implement them independently. If you're always taking courses to guide you, you're not really learning properly, because you always have some sort of step-by-step process to follow, and that's very different from trying to solve a problem completely by yourself. Imagine that you've only ever fitted an XGBoost model by following an online tutorial. Now, let's say you get a take-home case study from a job interview and you're trying to fit an XGBoost model to this data. Well, in reality, you've never actually fitted an XGBoost model without guidance. You've always followed a step-by-step process, so this will limit how well you can answer the problem and therefore affect your chances of getting the job. Another major issue with tutorial hell is its opportunity cost. In the time you spend stuck in tutorial hell, you could have learned so much more by trying your own projects and your own implementations than by doing yet another course on the same subject. The point is: learn just enough, and then branch out on your own.
Don't do multiple TensorFlow tutorials; just do one, then try to build something. That's where the real learning is done.
Like most people, I got into data science to learn about neural networks and deep learning. To be honest, it makes sense, because a lot of the hype around AI revolves around these models. However, being a good data scientist is about far more than simply being able to implement a neural network. In fact, I'll go as far as saying that for most data science and machine learning jobs, at least at the beginning, neural networks are probably one of the least important skills to have. Deep learning is actually a fairly small subset of machine learning, and generative AI, which is responsible for all the hype around ChatGPT, is an even smaller subset within the deep learning space. All of these things, machine learning, deep learning and gen AI, sit under the AI umbrella. AI is a general term for decision-making algorithms, although that definition is still quite controversial and not properly settled. In industrial applications, you'll find neural networks are typically the last choice, mainly because classical ML algorithms often perform better and are also far more interpretable than neural networks. Additionally, most machine learning algorithms are very different from neural networks. So by studying only deep learning, you won't get an overall picture of what AI or ML algorithms really look like. A prime example of this is gradient-boosted trees, which are the gold standard for tabular data. However, they couldn't be more different from neural networks. Finally, chances are that for most entry-level data science roles, interviewers are going to ask you questions about linear regression, logistic regression or decision trees. So you might as well learn these basics before you move on to the cutting-edge stuff.
While doing projects is a great way to learn, don't oversaturate your GitHub with a lot of easy projects.
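For a sense of how gradient-boosted trees differ from neural networks, and as an example of a from-scratch exercise that goes beyond a library call, here is a minimal sketch of gradient boosting for regression in plain Python. It assumes a single numeric feature, squared-error loss and depth-one trees (stumps). The function names, the toy data and the hyperparameter values are all illustrative choices, and this is not how XGBoost or any other library is actually implemented.

```python
def fit_stump(xs, residuals):
    """Find the one-threshold split on x that minimises squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current residuals and adds a
    shrunken copy of it to the ensemble."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data: a curved relationship that a single straight line fits poorly.
xs = list(range(10))
ys = [x ** 2 for x in xs]

model = fit_gradient_boosting(xs, ys)
mse_baseline = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_model = mse(ys, [predict(model, x) for x in xs])
print(f"baseline MSE {mse_baseline:.1f}, boosted MSE {mse_model:.1f}")
```

There are no weights, layers or backpropagation here: the model is a sum of many small trees, each one correcting the errors left by the ones before it.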
If all your projects revolve around a pre-made dataset and calling scikit-learn's .fit and .predict methods, then it's time to try something a bit harder. Now, I'm not slating these entry-level projects, as they are a great way to learn data science and get your hands dirty in the field. However, over time, I think it's important to prioritise quality over quantity, particularly once you've got a few projects under your belt. On the screen now is a list of some ideas I recommend you try. These are obviously just a couple of examples, and there are so many more out there. The point is to pick something that you think sounds interesting, but also quite difficult. So I really encourage you to stretch beyond your comfort zone. And like I said, this will look really good, because having one hard project that you can discuss and talk about is a lot better than having 10 very simple projects that all look the same.
And finally, the last mistake I see data scientists make is that they work solely in notebooks. Now, Google Colab and Jupyter Notebook are great tools that are particularly useful for visualization and are very simple for beginners to use. However, in industry, data scientists work closely with software engineers and deploy their algorithms to production. Productionizing an algorithm requires steps such as linting, unit testing and environment management. Unfortunately, a lot of these things are very hard to do inside a Jupyter Notebook. That's why most data scientists in industry use IDEs such as PyCharm or VS Code. I highly recommend you try to implement your own algorithm using one of those IDEs. Try adding your own package requirements to the project, use Git for version control, and even practise some bash or zsh scripting as well. These are some of the most commonly used tools among professional data scientists and machine learning engineers.
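As a small taste of what unit testing looks like outside a notebook, here is a minimal sketch using Python's built-in unittest module. The metric function and the test names are made up for illustration, and in a real project the function and its tests would normally live in separate files (for example a hypothetical `metrics.py` and `test_metrics.py`).

```python
import unittest


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


class TestMeanAbsoluteError(unittest.TestCase):
    def test_perfect_predictions_give_zero(self):
        self.assertEqual(mean_absolute_error([1, 2, 3], [1, 2, 3]), 0)

    def test_known_value(self):
        # Absolute errors are 1, 0 and 2, so the mean is 1.0.
        self.assertAlmostEqual(mean_absolute_error([1, 2, 3], [2, 2, 5]), 1.0)

    def test_length_mismatch_raises(self):
        with self.assertRaises(ValueError):
            mean_absolute_error([1, 2], [1])


if __name__ == "__main__":
    # Load and run only this test case, without exiting the interpreter.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMeanAbsoluteError)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

Once code like this sits in plain .py files under Git, a linter and a test runner can check it automatically on every change, which is much harder to set up for cells in a notebook.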
So it's well worth getting some practice in early. Not to mention, it looks really good to potential employers.
Learning data science is very fun, but there are many bumps along the way and pitfalls to fall into. In this video, I discussed six of the most common ones, which are also the ones I fell victim to. I hope this video helps you avoid at least one of these errors; if it does, you'll be doing a lot better than I did at the start. If you enjoyed this video and want to see more content like this on this channel, then make sure you click the like and subscribe buttons, and I'll see you in the next one.