Hey, so some time ago I made a video on bootstrapping, bagging, and random forests, but I'm not sure my explanation was good enough at the time. I did give you some code examples, but this time I'm going to go deeper and try to actually understand what random forests do. So let's get to that.

Here's a small prerequisite list that I put together. To understand random forests, we need an understanding of bagged decision trees. This in turn requires working knowledge of decision trees and the bootstrap technique. Let's walk through this pipeline, starting with the prerequisites.

Bootstrap in one line: it's subsampling with replacement. Consider 100 sample integers. You have one goal: find the mean. Normally we would just sum these 100 samples and divide by the total number of samples, 100. Easy enough. With bootstrapping, however, we pick a subsample size, say 30 samples. We find the mean of these 30 samples and call it mu1. Now we put these 30 samples back into the pile, pick another random 30 samples, and find that mean; let's say it's mu2. We keep repeating this sampling with replacement, say 20 times, and we end up with 20 means. In the end we take the mean of these means, and that is the final answer: the bootstrapped mean. (I'll show short code sketches of this and everything else near the end of the video.) This technique can be used for any other statistic, like the median: we can find 20 medians by subsampling and finally take the mean of those medians as the bootstrapped median.

So here's a question: why use the bootstrapped mean over the normal mean? Because averaging over many resampled estimates decreases the variance of the estimated mean. This technique can be used in machine learning algorithms, which I'll talk about during bagging. From the perspective of machine learning, decreasing the variance of a prediction reduces overfitting, and hence it is a very desirable property.

Now let's talk about decision trees. These are tree data structures that take an input sample at the root, ask the sample a series of questions at the non-leaf nodes, and output a value at the leaf nodes. Depending on the nature of this value, we have different types of decision trees. In the case of categorical outputs we have a classification tree; for example, given a person as a sample, determine whether he or she is sick or not. In the case of continuous-valued outputs we have regression trees; an example is, given a person as an input sample, determine the probability that this person is sick, the probability value being continuous. Classification and regression trees, or CART models, are machine learning methods for constructing predictive models from data. I know this is brief, but I think I'll cover decision trees in more depth in a separate video. This knowledge of decision trees is enough to understand random forests.

Now, moving on to bagging. Bagging is bootstrap aggregation. It involves applying the bootstrap to high-variance machine learning algorithms. High variance: a model exhibits high variance if it yields very different results when trained on different samples. Clearly that isn't a desired property. A common example of a high-variance model is the decision tree: different samples fed in during training can change the structure of the tree drastically. A way to decrease the variance of this model is through bagging, and hence we use bagged decision trees.

Here's how it works. Consider a thousand samples, and say we want to train B classifiers. For each of these B decision trees, we sample the dataset with replacement; say each classifier gets 300 samples. Now we construct the B decision trees using their corresponding samples. Don't worry about the depth of the trees; let each one go as deep as it needs to classify its samples correctly. When testing a sample, we pass it to all B decision trees and take a majority vote, or, in the case of regressed values, the mean of all the decision tree outputs.

So here are some questions. First: why are we not pruning the individual decision trees? Won't they exhibit high variance? Well, yes, they will. But we don't care, because the results are going to be aggregated anyway, and this aggregation leads to the decrease in variance.

Here's the second question: why use bagging only with high-variance machine learning algorithms? Well, for high-variance algorithms, the B decision trees will have different structures and hence will predict differently, so an aggregate prediction decreases the variance of the prediction. However, if we apply this method to a low-variance machine learning algorithm like logistic regression, then all B classifiers would look roughly the same, and hence all of them would predict more or less the same output. And if that's the case, then why bother with the computational overhead of training multiple classifiers when one classifier does the trick, right?

Now let's get to random forests. The problem with bagged decision trees is that every tree node can be split based on the same P features. This can lead to structural similarities between the trees, and hence correlated predictions. However, we want our models to be as different and decorrelated as possible. Random forests are thus like bagged decision trees, but each split can only consider a random subset of M of the features, and this subset is drawn anew at every split. For classification trees with P features, each split should consider about the square root of P features; for regression trees, about P divided by 3 features. These aren't hard-coded values; it's just a heuristic.

While training the decision trees, there are samples that get left out. For example, if the population has 1,000 samples and we train each decision tree with 600 of them picked with replacement, then each tree still has at least 400 samples that it has never seen. These are called the out-of-bag (OOB) samples, and they can be used to validate the random forest classifier.

Before wrapping up, let me make all of this concrete with the code sketches I promised.
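First, the bootstrapped mean. This is a minimal sketch in Python, not code from the original video; the 100 random integers, the subsample size of 30, and the 20 rounds are just the toy numbers from earlier.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Toy population: 100 sample integers (arbitrary illustrative data).
samples = rng.integers(low=0, high=100, size=100)

def bootstrapped_statistic(data, stat=np.mean, subsample_size=30, n_rounds=20, rng=rng):
    """Draw `n_rounds` subsamples with replacement, compute `stat` on each,
    and return the mean of those per-round statistics."""
    estimates = []
    for _ in range(n_rounds):
        subsample = rng.choice(data, size=subsample_size, replace=True)
        estimates.append(stat(subsample))
    return np.mean(estimates)

print("plain mean:         ", samples.mean())
print("bootstrapped mean:  ", bootstrapped_statistic(samples, np.mean))
print("bootstrapped median:", bootstrapped_statistic(samples, np.median))
```

Notice that swapping `np.mean` for `np.median` gives the bootstrapped median with no other changes, which is exactly the "works for any statistic" point from above.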
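Next, bagged decision trees, following the recipe above: 1,000 samples, 300 samples per tree drawn with replacement, unpruned trees, and a majority vote at prediction time. This is a sketch assuming scikit-learn is available; the synthetic dataset and the choice of B = 25 trees are arbitrary, and scikit-learn's BaggingClassifier packages the same idea if you'd rather not roll it by hand.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(seed=0)

# Synthetic stand-in for the 1,000-sample population in the example.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

B = 25                # number of trees (arbitrary choice)
subsample_size = 300  # samples per tree, drawn with replacement

trees = []
for _ in range(B):
    idx = rng.choice(len(X_train), size=subsample_size, replace=True)
    # Unpruned tree: grown as deep as it needs (max_depth=None is the default).
    trees.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Majority vote across the B trees (labels are 0/1 here).
all_preds = np.stack([t.predict(X_test) for t in trees])  # shape (B, n_test)
majority = (all_preds.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (majority == y_test).mean())
```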
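Finally, a random forest with the per-split feature subsetting and out-of-bag validation we just talked about, again assuming scikit-learn; the dataset is synthetic. max_features="sqrt" is the square-root-of-P heuristic for classification, and oob_score=True reuses each tree's out-of-bag samples as a built-in validation set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=16, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # B trees
    max_features="sqrt",   # each split considers only sqrt(P) random features
    bootstrap=True,        # each tree is trained on a bootstrap sample
    oob_score=True,        # validate on each tree's out-of-bag samples
    random_state=0,
).fit(X, y)

print("out-of-bag accuracy:", forest.oob_score_)
```

Restricting every split to a random subset of the features is what decorrelates the trees relative to plain bagging, which is the whole point of the forest.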
Now, here's a quick definition to end the video: what is an ensemble method? It's a technique that combines the predictions of multiple machine learning algorithms to make predictions that are more accurate than those of any individual classifier. We've seen this throughout the video, for example in bagged decision trees and the random forest classifier, and even boosting is an ensemble method.

And that's all I have for you for now. If you liked the video, hit that like button. If you like what I do on this channel, if you like AI, machine learning, deep learning, or data science, then hit that subscribe button and ring that bell for notifications when I upload. I also write stuff on Quora, so you can follow me there; the link is down in the description below. Still haven't gotten your daily dose of AI? Click or tap one of the videos right here for an awesome video, and I will see you in the next one. Bye!