Hello everyone, this is Alice Gao. In this video, I'm going to do a somewhat meta analysis. In our short machine learning unit, we spent two weeks mainly talking about two kinds of models or algorithms: one is decision trees and the other is neural networks. A really interesting but also difficult question to answer in machine learning is usually this: I have this particular data set, and I think the true function should look something like this. Which model should I use? Which algorithm should I use? Let's think a little bit about this question. First of all, I'll talk about some pros and cons of using a neural network. Then we'll compare and contrast decision trees and neural networks. Here are some situations where it's good to use a neural network. First of all, when we have really complex data, where the data has a lot of dimensions, where we have a lot of real-valued inputs, or where the data is very noisy, for example because it comes from sensors that are really noisy. Some examples of this kind of data you should be aware of from the latest success stories of neural networks: images, videos, audio, things like that. Another good scenario for using a neural network is when we don't know what the target function should look like. The reason this is a good scenario is that a neural network can theoretically represent any arbitrary function. If you make the network complicated enough, if you have enough layers and enough nodes in each layer, then there is a theorem, the universal approximation theorem, which says this kind of network can approximate any continuous function as closely as we like. So it's very, very powerful. And in the case where you don't know what the function looks like, maybe you want to have a very flexible model so you can indeed discover the true function. Now, the third scenario where it's good to use a neural network is when it's not important for the model to be interpretable.
The reason is, as you can guess, that once we learn a neural network, although it usually has pretty high prediction accuracy, the model tends not to be interpretable. We look at the neural network, we look at the weights and the nodes, and we might wonder, what is it really doing? This is definitely really hard to explain to a non-technical person. But sometimes even a CS person can stare at the neural network and really cannot figure out what it is trying to do. Next, let's think about some reasons not to use a neural network. Maybe more precisely, I should rephrase these slides as disadvantages, or problems, of using a neural network. The first one is regarding the architecture. With a neural network, it's often very difficult to determine the network structure. In order to determine this, you need to figure out how many layers we need and how many neurons we need in each layer. There are a lot of parameters you need to fix in order to fix the network structure. Secondly, once we learn the model, as I talked about in the previous slide, it's difficult to interpret what it is trying to do, especially when we have a lot of layers and a lot of nodes in each layer. And finally, one big problem with a neural network is that it tends to overfit in practice. You can sort of understand this intuitively. A neural network has a lot of parameters. It's a very flexible model. Remember how we talked about the bias-variance trade-off at the beginning of this unit? A neural network is a model that has really high variance, and this kind of model tends to overfit. Because it's so flexible, it can learn anything you throw at it, and it doesn't really distinguish things that are useful to learn from noise that is not useful to learn.
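To get a feel for how quickly the number of parameters grows with the network structure, here is a minimal Python sketch that counts the weights and biases in a fully connected feedforward network. The layer sizes are hypothetical and chosen only for illustration; they are not from the lecture.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected feedforward network.

    layer_sizes lists the number of units per layer, input layer first.
    Each non-input unit has one weight per unit in the previous layer,
    plus one bias.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += (n_in + 1) * n_out  # n_in weights + 1 bias, per output unit
    return total


# Hypothetical architectures (illustrative sizes, not from the lecture).
small_net = [4, 3, 1]            # 4 inputs, one hidden layer of 3 units, 1 output
larger_net = [784, 128, 64, 10]  # e.g. a small image classifier

print(count_parameters(small_net))   # 19
print(count_parameters(larger_net))  # 109386
```

Each of these parameters is a degree of freedom that training can use to fit noise as well as signal, which is one way to see why such a flexible model has high variance.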
So far, I've talked about some scenarios where we should use a neural network and also some problems with using a neural network. Now let's think about this in contrast with decision trees. These two are the only models we've talked about in this course. So if we have a choice between these two models, which one should we use for a particular problem? Which of the two models we should use is going to depend on quite a few factors, and I've listed some of them right here. Let's think about them. First of all, what kind of data do we have? Neural networks, as I mentioned earlier, are good for complex, high-dimensional data. Examples are images, audio, and even text. Decision trees, on the other hand, are good for the kind of data I will call tabular data. Think of our Jeeves data set. If you can represent the data set nicely in a table, that's tabular data. So in short, decision trees are good for simpler data, and neural networks are good for more complex data. The second factor is how many data points we have. How big a data set do we have? A decision tree can work very well even if you have very little data. Again, remember our Jeeves training set example. We only had 14 data points. This is a ridiculously small data set. Still, the decision tree could give us something reasonable. Whereas to train a neural network that works well, we are going to need a lot more data. Think about an orders-of-magnitude difference in terms of the size of the data set. Also, because the neural network has so many parameters, it can easily overfit. That's also intuitively why we need so much more data to be able to train it properly. It also depends on what kind of target function we're looking for. Do we know what kind of function might fit the data well?
A neural network is much more powerful in the sense that it can model any arbitrary function, whereas a decision tree is much more limited: it can essentially model any function that can be expressed as a nested if-then-else statement. So if the true function is not in this if-then-else form, then a decision tree probably cannot capture it very well. So depending on how much you know about the target function, you can choose whether you want to be flexible or you want to use a more restricted form of the target function. Next, we have to decide on the architecture, so essentially, the hyperparameters of the model. For a decision tree, this is easy. If we're just growing a full tree, we don't have any hyperparameters. But if we want to prevent overfitting, then we can add hyperparameters such as the maximum depth or the maximum number of nodes, things like that. A neural network, on the other hand, just naturally has many hyperparameters. The number of layers, the number of neurons per layer, which activation function do we use? How do we initialize the weights? And if we use gradient descent to train the neural network, what learning rate do we use? We have to decide on all of these, and all of them can critically influence the performance of the neural network on our data. The next question we need to think about is: do we need to interpret the model that we've learned, the function that we've learned? Now, why is this important? Well, you might think, shouldn't we just care about whether the learned function performs well, whether it has high prediction accuracy? Whether interpretability is important depends on our application area.
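As a concrete picture of the nested if-then-else form mentioned above, here is a minimal Python sketch of a small decision tree written out by hand. The feature names, values, and splits are hypothetical, made up for illustration rather than taken from a tree learned in the course.

```python
def play_tennis(outlook, humidity, wind):
    """A hand-written decision tree as nested if-then-else.

    Hypothetical features and splits, for illustration only.
    Each path from the top-level test to a return is one root-to-leaf path.
    """
    if outlook == "sunny":
        if humidity == "high":
            return False
        else:
            return True
    elif outlook == "overcast":
        return True
    else:  # any other outlook value, e.g. "rain"
        if wind == "strong":
            return False
        else:
            return True


print(play_tennis("sunny", "high", "weak"))       # False
print(play_tennis("overcast", "high", "strong"))  # True
print(play_tennis("rain", "normal", "strong"))    # False
```

Reading the code from top to bottom gives the explanation for any single prediction, which is the sense in which a tree is easy to interpret. Limiting the maximum depth corresponds to limiting how deeply the if statements can nest.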
Imagine you're using a machine learning model for some financial application, such as approving mortgages or approving credit cards, or maybe you're using it to make, I don't know, admission decisions at schools, or even political decisions in the government. For a lot of these decisions, we have to be able to take the learned model and explain to people why the model is doing what it is doing. Sometimes we need this transparency, we need the model to be explainable, to justify what it is doing. Now, a decision tree is a good choice for this because it's very natural. It's very easy to explain to people. People can easily visualize and understand what the decision tree is doing. On the other hand, a neural network is really bad at this. Think of it as essentially a black box. It's often difficult, if not impossible, to interpret what a neural network is doing. So depending on your application, if interpretability is important, then a decision tree is probably a better choice than a neural network. Finally, the two models also differ in how much time they require for training and, once the model is trained, how much time it takes to classify an example using the model. As you can probably expect, a decision tree is very fast for both. It's fast to train a decision tree, and it's also fast to classify an example using a decision tree. On the other hand, a neural network is much more complex, so it's much slower to train. You must have heard about difficult applications where it takes days to train a neural network. And even when you have a trained neural network, it might be time-consuming to classify each example, because you have to do a lot of calculations to get the output values from the input values. All right, so these are all the things I can think of. Now, what is the most important thing that you should get out of this slide, this discussion?
I think the most important message you should get is that a neural network is not the magic solution to all problems in the world, right? There is such huge hype around neural networks right now. Even when I was reading your project topic requests, I got the feeling that nowadays when people think of a problem, they automatically think, okay, I'm going to apply some sort of neural network, feedforward or recurrent or convolutional, and that's going to solve all my problems, and done. So this discussion is for you to realize that a neural network is not always the best choice. When you have a data set, you want to think about all of these questions first, and depending on your answers to some of these questions, you might decide to try a simpler model first. In fact, that is what I would always try to do first, because a neural network is complex, it's difficult to implement, and it's also difficult to train, right? You have to tune a lot of parameters to find the best combination. And sometimes a simpler model might do as well as such a complicated neural network, without requiring so much more time to implement and to test. All right, that's everything I want to say in this video. After watching this video, you should be able to do the following. Explain situations in which we want to use a neural network. Explain drawbacks of using a neural network. Identify situations where it's better to use a decision tree rather than a neural network. And identify situations where it's better to use a neural network rather than a decision tree. Thank you for watching. I will see you in the next video. Bye for now.