 Now, I love YouTube and I'm very excited to bring this course to you here on YouTube. And what it's going to be about is about business statistics or economic data analysis. Now data is really everywhere and if you want to take your own company anywhere, if you work for a company and you want to get involved in something new, you want to extend yourself, perhaps get involved with some of the teams, work with data, this is really the course for you. Now you can really learn about data analysis everywhere, there are courses everywhere and I urge you to look around because somewhere you're going to find a course or someone explaining this that you are really going to understand. Maybe this course is for you, I hope so, otherwise look for a course that will help because really statistical analysis, data analysis and business is becoming so important. Now we're going to use Python in this course. Now I absolutely love Python, I think there are two main open source programming languages to use are and Python but I really think Python is on the up, it really is such an exciting community and if you read anything online, Python is really where it's at. You can use Python for anything, I can teach you how to do data analysis, statistics with Python but you can use Python to write games if you want to develop websites, write applications, it is such an enormous language and it's so easy to learn. So we're really going to stick to Python, I'm going to show you how to install it, where to get it free of charge and that's the beauty of this open source movement that we are seeing, data analysis done free of charge. A little bit more about myself, I am a surgeon at Khorotyskir Hospital but I work a lot with data analysis and research as well. Before I was in academic practice as a surgeon I worked a lot in private practice and private practice is really a business and you've got to run it as a business. Statistical analysis and business is so important. One thing that I want to do in this course is just connect with you in some way. This is not a classroom, I'm not coming on a stage and you write down what I say and you write an exam about this. I want us through this medium to at least communicate even if most of the communication is going to be one way, but I truly believe as I do when I teach at the university is that you have to know something about your lecturer, you have to connect with that person, you have to know something about them. It cannot just be someone who just comes along and then goes away and there's no connection between the two of you. So I want you to learn a little bit something about myself and I want you to learn something about this environment called Cape Town. Cape Town is a fantastic place. If you ever get the opportunity, come and visit us here in Cape Town. So for many of these lectures, you are going to see just a little bit about Cape Town. I'm going to show you some of the businesses. I'm not connected to any of these businesses. I'm not getting paid by any of these businesses, but I'm just going to show you around the beauty of Cape Town. Now the first place I want to show you is called Spear. Spear is a winery, a wine farm, quite an old established wine farm. It's not about the wine. I'm not going to show you anything really about the wine and it's not about drinking wine or alcohol and anything. Really, it is this wonderful farm that you can take, someone that you know yourself. You can take your whole family and you can have lovely picnics there. There's so many restaurants. You can go grab a basket, buy a few things, sit outside. You can even book a space for yourself beforehand. It is well worth it, well worth place to visit. After you come on the strip with me around Spear, I want you to think about the business. Imagine how much data a big wine farm like that might collect. And the way that we are going to look at data analysis from any company's point of view is for that data to be in tabular form. You've seen this before, Google Sheets, Microsoft Excel, some open source version, LibreOfficeCalc, a spreadsheet of data, rows and rows of data, columns of data. Data is what it's all about and that's what this first lecture is going to be about. Now I want you to jot down a few things as you're trying to remember because these are the things that we're going to talk about. The four main types of data. The first two are categorical data types and the second two are numerical. Our first one is going to be nominal data. And then I'm going to move on to ordinal data. So those are the two categorical types. The two numerical types that you must be aware of is interval data and ratio type data. There you go, ratio type data. Both numerical data types, but there's a slight difference between the two. So any of the data that's collected in the business will be one of these four major data types. And you've got to know that column in the spreadsheet, that tabular data, what that is all about. Now there's another way to classify at least numerical data and there's something we will also have a look at and that is called discrete data and continuous data types. So those are the things that we are going to look at in this first lecture. Before we go there, come on over to My World. Let's take a short trip through SPEAR. With a very familiar looking program, it is a spreadsheet. Irrespective of what the program is, as I say Microsoft Excel, you can go Google Sheets, LibreOfficeCalc doesn't matter, it's a spreadsheet. What we're looking at here is tabular data. Tabular data because it is a table. A table with rows and rows. Here's a single row of data, a second row of data, a third row of data. And each row of data represents one object, one sale, one customer. Everything about that row should be about some specific thing. Now seeing we were just on a wine farm, let's just stick to the topic of wine. So it's got nothing to do with the wine, nothing to do with alcohol. Let's just look at the data and the data types. Look down these columns. There's a column, there's a column, another column, et cetera. And it is this column, the whole column that is, contains data that is of a specific type. I like to call each of these data points a data point value for a specific variable. Grape here is a variable. Grade is a variable. Let's assume I know nothing about wine making, by the way. Let's just assume that there's five grades, grade one to grade five. Grade one being the grape is excellent. Grade five being it's not a very good grape. Let's make up something that is the temperature that it was outside when the grapes were harvested. It's in Celsius, it's in Fahrenheit there and the quantity, kilograms, pounds, doesn't matter. It is a quantity. So if we look down these variables with these names, the data point values are of a certain type. So the data point values of a variable is of a certain type. Let's start with the grape. Let's imagine there are these Pinnatars, Pinnatars, Shannon, Cabernet, Chardonnay, Molo, all these different types of grapes that wines are going to be made of. As I say, I don't know much about these. They are all in different parts of this wine farm and each row tells us that there was some Pinnatars. The grade from that harvesting was grade four. It was 20 degrees Celsius or 68 Fahrenheit and 166 kilograms or pounds or whatever was harvested. So that's one harvest. So if we look at these data point values for this grape variable, we notice that they're words. We can't put them in any particular order. We can make up some human order to them. We can say, well, let's do them alphabetically. We could say, let's put them in some form of the way, you know, order in which we like them. But that's not a natural order. I can come up with any order. There's no reason to say a Molo must be higher than a Chardonnay. Somehow one might be more expensive. As I say, these are all just human factors that we can attach to this variable, this data type. So it is a categorical variable type and the specific type is nominal. So it's a nominal categorical data type. We cannot put it in any kind of natural order. The only thing we can do that is really natural is we can count. So if I look down this list, I can count how many times Chardonnay appears, how many times Molo appears in this column. Let's move on to grade. It's a bit of a trick. It is not numerical. You might see a number there, but it is most definitely not a numerical variable. It is still a categorical, still from a category. This time though, it is ordinal. We can put some order to it, grade one to grade five. There's a natural order there. We can still attach something, human to it, one being very good, five being bad, or one being bad and five being good, doesn't matter. There is an order to this, a natural order. It is not a number though, in the sense that there is no specific set defined difference between these. The difference between grade one and two is in no way similar to the difference between two and three, even though the numerical difference is one. How do you define what the difference is between these? You've got to just come up with some arbitrary thing. Think about it this way. You can also not say that a grape, say one is bad and five is very good grapes. Let's go in that direction. You cannot say that because four is twice two, that the grape with a grade four is twice as good as a grape of grade two. It's not normal. You can't see numbers in that way. And it's very important when you look at some business data and it's numbers, but they do not represent numbers. They represent categories. Once again, the only thing that you can really do here is to count, count how many ones there are, count how many twos. You can't really express some mathematical operation on these, they are categories, they are not numbers. Now let's move on to the two numerical types. Remember we had interval and we had a ratio. Here we have temperature. We see temperature in Celsius, temperature in Fahrenheit, whichever one you prefer, whichever part of the world you come from. The thing about these are, they are numerical, but there is no two zero. Zero degrees Celsius as where water freezes under normal circumstances at sea level, 32 degrees for Fahrenheit scale. That is not a two zero. It does not mean the absence of temperature. Even if you go down to the zero Fahrenheit, that zero does not mean an absence of temperature because you can go drop below zero. The absence of temperature is actually on the Kelvin scale and that's zero Kelvin. So these are numbers, but there's only an interval type here, in other words the difference between say 30 degrees and 31 degrees is the same as the difference between 50 degrees and 51 degrees. That one degree is really a set-define thing, but the zero here is not the absence of something. So be very careful, this is an interval, the numerical data type. Last one here is quantity. If there was a zero here, that is really the absence of anything. Nothing was harvested on that specific occasion. Zero really exists. And that's why it's called the ratio type because I can now say that if I get 40 pounds or 40 kilograms and I take 20 pounds or 20 kilograms, the 40 is twice more than the 20. But I cannot say 60 degrees is twice as much as 30 degrees because it really isn't on the greatest scheme of things on the Kelvin scale, 16, 30, very close to each other, one is not double the other on the Celsius and Fahrenheit scale because we don't have this absence of a quantity at zero. So be very careful when you do these, especially when it comes to this ordinal categorical type. So remember this, we have on the left-hand side here, we have nominal categorical, ordinal categorical, interval type numerical, ratio type numerical. The four data types that we're going to see in business analysis come up again and again and again. Okay, I hope to see you in the next lecture.