Hello everybody. In this lesson, we're going to be taking a look at Beautiful Soup and Requests. Now these packages in Python are really useful. These are the two main ones that I used when I was first starting out with web scraping. They can get a lot of what you want done in order to get that information out. Now of course, there are other packages that you can use that may be a little bit more advanced. But again, this is just the beginner series. In a future series, we'll look at other packages as well that have some more advanced functionality.
So what we're going to be doing is importing these packages. Then we're going to get all of the HTML from our website and make sure that it's in a usable state. And then in the next lesson, we're going to query around in the HTML and pick and choose exactly what we want. We'll look at things like tags, variable strings, classes, attributes, and more.
So let's get started by importing our packages. What we're going to say is from bs4, which is the module that we're taking it from, import BeautifulSoup. Then we're going to come down and say import requests. Now let's go ahead and run this by hitting Shift+Enter. And it works well for me. Now if this does not work for you, you may need to actually install bs4, so you may have to go to your terminal window and say pip install bs4. I'll just let you Google how to do that if you need to, because it's pretty easy. But if you're using Jupyter notebooks through Anaconda, like how we set it up at the beginning of this Python series, then you should be totally fine. It should be there for you.
The next thing that we need to do is specify where we're taking this HTML from. So what we need to do is come right over here to our webpage and get the URL. So we're going to go here, we're going to copy this URL, and I'm just going to put it right here for a second.
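If you're not sure whether the import step above will work on your machine, here is a minimal sketch of one way to check first. The helper name missing_packages is made up for this example and is not part of either library. It only uses the standard library to see whether bs4 and requests can be found, and it assumes beautifulsoup4 is the package name you would install to get the bs4 module.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two modules this lesson imports.
missing = missing_packages(["bs4", "requests"])

if missing:
    # The bs4 module is provided by the beautifulsoup4 package on PyPI.
    pip_names = ["beautifulsoup4" if name == "bs4" else name for name in missing]
    print("Missing. Try: pip install " + " ".join(pip_names))
else:
    # Same imports as in the lesson.
    from bs4 import BeautifulSoup
    import requests
    print("Both packages imported fine.")
```

If both packages are already there, like they should be with Anaconda, this just confirms it and runs the same two imports from the lesson.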
And what we're going to do is we're going to be using this URL quite a bit. So we just want to assign it to a variable. So we'll say url is equal to, and then we'll put it right in here. Now we can get rid of that. So now this is our URL going forward. This is where we'll be pulling data from. Let's go ahead and run this.
Now we're going to use Requests. What we're going to say is requests.get, and then we're going to put in url. Now this get function is from the requests library. It's going to send a GET request to that URL and it's going to return a response object. Let's go ahead and run this. As you can see here, I got a response of 200, which means the request succeeded. If you got something like a 204 or a 400 or 401 or 404, all of these things are potentially bad. Something like a 204 would mean the request went through but there was no content in the response. 400 means a bad request, so it was invalid and the server couldn't process it, and you don't get the page back. If you got a 404, that might be one that you're familiar with. That's an error that means the page could not be found on the server.
The next thing that we're going to do is take the HTML. Now if you remember, we can come right back here and inspect this. We have all this HTML right here. Now on this webpage specifically, right now it's completely static. It's not a bunch of moving stuff or anything like that. If you're looking at something like Amazon, those web pages can update, but when you actually pull the HTML into Python, you're basically getting a snapshot of the HTML at that time. So what we're going to do is bring in all of this HTML, which is our snapshot of our website, and then we can take a look at it.
So we're going to come right down here and now we're going to say BeautifulSoup. So now we'll use the Beautiful Soup library. We need to say BeautifulSoup, and we're going to do an open parenthesis. We're going to do two things.
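Before going further, here is a rough sketch of the request step and the status codes mentioned above. The helper names describe_status and fetch_page are made up for this example, and the timeout value is my own assumption, not something from the lesson. The lesson's actual URL isn't written out here, so any URL you pass in is your own. The status descriptions come from Python's built-in http module, so that part runs without any network access.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code with its standard reason phrase, e.g. '404 Not Found'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not a status code the standard library knows about.
        phrase = "Unknown"
    return f"{code} {phrase}"


def fetch_page(url):
    """Send a GET request and report the status, as done in the lesson."""
    import requests  # imported here so describe_status works without it

    page = requests.get(url, timeout=10)  # timeout is an added assumption
    print(describe_status(page.status_code))
    return page


# The codes mentioned in the lesson.
for code in (200, 204, 400, 401, 404):
    print(describe_status(code))
# prints:
# 200 OK
# 204 No Content
# 400 Bad Request
# 401 Unauthorized
# 404 Not Found
```

In a notebook you would call something like page = fetch_page(url) with the URL you copied, and anything other than 200 is worth a second look before you try to parse the HTML.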
There are two parameters that we need to put in here. First, we need to put in this get request. We actually need to name this, so we'll call it page. We'll say page is equal to requests.get(url), and let's run this. And now we're going to put that page in here, and what we're going to say is .text. So page holds the response from that request, and .text is what retrieves the actual raw HTML that we're going to be using. Then we're going to put a comma here, and what we need to specify is how we're going to parse this information. Now this is HTML, so what we're going to pass is html, as a string, just like this. This is a standard that's already built in to this library, so we don't need to go any further, but it's basically going to parse the information in an HTML format.
Let's go ahead and run this and see what we get. As you can see, we have a lot of information. And as we scroll down, I'll try to point out some things that we've already looked at in previous lessons. Something like this th tag should look very familiar. That's the title. Then we have these td tags. And then of course, if we scroll down even further, we'll have things like tr tags. So these are all things that we looked at in that first lesson when learning about HTML.
Now again, we want to assign this to a variable. So we're going to say soup, and that's going to be equal to this information right here. Now I'm not going to go into all the history behind Beautiful Soup. What I will say is that the guy who created this Beautiful Soup library said that it takes this really messy HTML, or XML, which you can also use it for, and makes it into this kind of beautiful soup. So I just thought that was kind of funny. But that's why we're calling it soup right here.
We're going to go ahead and run this. And we'll come right down here and say print(soup). Let's run it. And now we have everything in here. So we have our HTML, our head, and we have some hrefs and some links in here.
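To make the parsing step concrete without hitting a real website, here is a small sketch that feeds Beautiful Soup a made-up HTML string instead of page.text. The sample_html content is invented purely for illustration and is not the page from the lesson. I'm also naming the parser explicitly as "html.parser", which is the one that ships with Python, so that is a choice in this sketch and may not be exactly what you see typed in the video.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for page.text, just so this runs without a network call.
sample_html = """
<html>
<head><title>Sample Page</title></head>
<body>
<table>
<tr><th>Name</th><th>Score</th></tr>
<tr><td>Ada</td><td>90</td></tr>
</table>
</body>
</html>
"""

# First argument: the raw HTML text. Second argument: which parser to use.
soup = BeautifulSoup(sample_html, "html.parser")

# soup.<tag> returns the first matching tag in the document.
print(soup.title.string)  # Sample Page
print(soup.th.string)     # Name
print(soup.td.string)     # Ada
```

With a real response you would swap sample_html for page.text, and print(soup) would show the whole document the same way it does in the lesson.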
And let's scroll down a little bit more. And then we have our body right there. And of course we have a bunch of information in here. Now in the next lesson, what we're going to be doing is learning how to query all of this to take specific information out, and basically understand a lot of what's going on in this HTML to make sure we can actually get what we need.
Now if this looks really messy to you and it just doesn't make a lot of sense, there is one more thing that I'm going to show you. We'll come right down here and say soup.prettify(). If you've ever used a different programming language, prettify is very common in a lot of them. It'll just make it a little bit easier to visualize and see. You'll notice that it has this hierarchy built in, whereas if we scroll up, there's no hierarchy built in. It's all just down this left-hand side. So if you want to view it and visually see the differences, this does help a lot. But it doesn't actually help a lot when you're querying it or using find and find_all, which is what we're going to look at in the next lesson.
So that is our lesson on Beautiful Soup and Requests. In the next two lessons, we're going to be looking at find and find_all, as well as really diving into things like variable strings and tags and classes and all those things. And then in the last lesson, we're going to do a mini project where we try to get all the data from the table on this webpage that we've been using and put it into a pandas DataFrame. So thank you guys so much for watching. I really appreciate it. If you liked this video, be sure to like and subscribe below and I will see you in the next lesson.
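For reference, the prettify step from this lesson can be sketched on a tiny made-up snippet. The HTML string here is invented for illustration, and "html.parser" is the parser I'm assuming for this sketch.

```python
from bs4 import BeautifulSoup

# A tiny made-up document, all on one line.
soup = BeautifulSoup("<ul><li>one</li><li>two</li></ul>", "html.parser")

flat = str(soup)          # what print(soup) shows: no added indentation
pretty = soup.prettify()  # one tag per line, nested tags indented

print(flat)
print(pretty)
```

The prettified version is only for reading. The soup object you query is the same either way.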