Hey guys, in this video I'm gonna give you five tips for data science noobs. Now, five is just a great number: it's enough for me to explain stuff, but it's not so many points that I'd just be listing off a few lines each, and you have the attention span of a squirrel. So I think five is a great number. Just a disclaimer: I am not a complete data science pro. I still learn things every day, and I just wanted to share whatever I know with you.

Number one: don't be afraid to explore your data. Before even starting your analysis, you need to have a question or a set of questions in mind that you want your analysis to answer. But don't let this define exactly how you understand your data. In order to perform an analysis, you need to know the kind of data you have and, in the case of data-frame-type data, what every field represents. Playing around with some parts of the data may give you new ideas on how to actually solve a problem. Just throw some summary statistics at it, get some ballpark numbers, and don't worry that much about the question you're trying to answer.

Number two: know your problem and how to approach it. You're heading into a mountain of data, and you have a specific goal in mind. Don't lose sight of this goal. I said before that when you're exploring your data to get an understanding of it, it's fine not to really consider your problem. But once you actually have an understanding of your data, make sure that you don't stray away from your question. Straying could lead to a number of random, shallow analyses that really won't be useful to anyone.

Number three: machine learning is not always the answer. First off, even before thinking of modeling your data, you need to have a clear understanding of it. Creating a model without understanding your data is just blasphemous. And even after analyzing your data, there is still no guarantee that you really need to throw a model at it. I see this a lot in some Kaggle kernels, where you would notice a response variable and say, "Oh, it's a categorical variable, so I might as well throw some logistic regression or a support vector machine classifier at it and see how it works." Or, in the case of regression, "Let me throw a neural network regressor at it and see how that works." If you are thinking of machine learning, then you need to answer questions like: what are you trying to model, and what do you wish the model to achieve in the end? If you don't have a purpose for the model, then don't bother modeling your data.

Number four: statistics versus programming. Which one is better? Which one is more important? They're both important. In stats, I find myself using a lot of hypothesis testing, especially when I want to compare two groups and determine if the difference is statistically significant. On the programming front, I feel like there needs to be a basic understanding of how to program in general. Many libraries used in analysis are built for specific languages. For example, Python has pandas for data frame manipulation, matplotlib and seaborn for data visualization, and even scikit-learn and TensorFlow for machine learning. Also, knowing how to program allows you to play with under-documented libraries, too.

Number five: value presentation. This is probably one of the most important points. When you perform an analysis, about 70% of what you do will be nothing out of the ordinary. Some of it may not even be worth presenting.
Furthermore, when you include every little detail of everything that you've analyzed, it's really hard to see what the key takeaways are. So once you're actually done with your analysis, you should go back and try to see the importance of everything that you've analyzed, highlighting the things that are more important or that should stand out. I find it particularly difficult to read Jupyter notebooks on the fly, because Jupyter notebooks have code along with figures and some explanations just jutted in between, so it becomes really hard to see what's important and what's not. I would recommend creating a set of presentation slides that only have the key graphs along with their takeaways. Any other explanatory information that's required, I would put in the appendix section so that it could be pulled up whenever you need some more information.

And that's my five points. So here's a brief recap. First, understand the kind of data you're dealing with by throwing up some basic stats, and don't be afraid to explore. Second, have a problem to solve and focus your analysis on solving that problem. Don't do a bunch of random analyses that aren't going to be useful to anyone. Third, after analysis, if you feel the need to model the data and that will help with your problem, then go for it; or if you're trying to solve another problem, then do that. Just make sure that the model has a purpose. Fourth, have an understanding of stats and programming; analysis and its implementation in code will be a lot easier. And finally, you should be able to communicate your findings effectively and succinctly, highlighting key points in little time.

And that's all I have for you now. So if you liked the video, hit that like button. If you're new here, welcome, and hit that subscribe button. Ring that bell for notifications when I upload. Check out my other links in the description down below. Still haven't had your daily dose of AI? Then click or tap one of the videos right here for an awesome video, and I will see you in the next one. Bye!
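
As a rough illustration of two ideas from the tips (ballpark summary statistics in tip one, and comparing two groups with a hypothesis test in tip four), here is a minimal sketch. It is not from the video: the data is made up, the function names `summarize` and `permutation_test` are hypothetical, and it uses a permutation test from the Python standard library as a stand-in for whatever test the speaker has in mind. In practice you would more likely reach for pandas and something like `scipy.stats.ttest_ind`.

```python
import random
import statistics


def summarize(values):
    """Return a few ballpark summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)

    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the labels: if the groups do not really differ,
        # any split of the pooled data is equally plausible.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up measurements for two groups (illustrative only).
control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
treatment = [12.9, 13.1, 12.7, 13.4, 12.8, 13.0, 13.3, 12.6]

control_stats = summarize(control)
treatment_stats = summarize(treatment)
diff, p_value = permutation_test(control, treatment)

print("control:  ", control_stats)
print("treatment:", treatment_stats)
print(f"difference in means: {diff:.4f}, approximate p-value: {p_value:.4f}")
```

A small p-value here only says the gap between the two made-up groups would be unusual if the labels were interchangeable. Whether that gap matters is still a question for the problem you set out to solve.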