Hello, welcome to SSUnitex. This is a continuation of the PySpark tutorial series. In the last video of this series, we looked at the data types available in PySpark. In this video, we are going to see how we can use those data types to define the columns of a table. First, let's see how to create a table with column definitions. As we have already seen in SQL Server, we specify CREATE TABLE, then the table name, and then for each column the column name, the data type and any constraints. That is the syntax we have to follow for creating a table in SQL Server. In PySpark, things are a little different. The first thing we need to do is import the required data types. Here, we are going to import StructType, StructField, IntegerType, StringType and DecimalType. Without importing them, we cannot use them; they are predefined classes available in PySpark's libraries. Next, we need to define the schema. How do we define the schema? We start with StructType. What is StructType? It is the type used for defining a schema in PySpark, and it takes the column definitions as its input. Looking at the input parameters, we can see the first one, the second, the third and the fourth; we have four columns in total, which is why there are four entries. Next, we can see StructField. A StructField is used for defining the column name and the data type of that column. In the first one, we can see ID, and its data type is IntegerType. The last parameter indicates whether the column is nullable or not, which we set with True or False.
The next ones are name, then age, then salary, so we have these four columns in the schema. This is the syntax we have to follow for defining the schema. Finally, if we want to create a DataFrame, we simply use spark.createDataFrame; inside the brackets, the first argument is the data and the second is the schema. I hope that is clear. Let me quickly go to the browser and we will try to implement this in practice. First, let me import all the required data types. How do we import them? We use from pyspark.sql.types import, and after that we can either specify the required types one by one or use an asterisk. What does the asterisk do? It imports all the types. But instead of using the asterisk, it is better to specify them one by one. The first is StructType, the second is StructField, then IntegerType, then StringType, then DecimalType, and then DateType. All of these we can simply import. After importing, we have to specify the schema. But before specifying the schema, let me quickly create some data. For that, let me use a variable called data, and inside this variable let me put four columns. The first is the ID, the second is the name, the third is the age, which could be 30, and the fourth is the salary, which I will set to 4000. So that is one row of data. Let me add another row here: this one has ID 2, the name could be Abu, the age could be 32 and the salary could be 5000. So now we have the actual data. But as we can see, this data is not complete on its own; it does not have any headers. We need to specify a schema and bind that schema to this data. How do we do that? Let me create a variable called schema.
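The two rows described above can be written as a plain Python list of tuples; the first row's name is not stated in the video, so "Abi" here is a placeholder:

```python
# Each tuple is one row: (ID, name, age, salary)
data = [
    (1, "Abi", 30, 4000),   # "Abi" is a placeholder name
    (2, "Abu", 32, 5000),
]
```

On its own, this list has no column names or types attached; that is exactly what binding it to a schema fixes.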
Now here, as I told you, we are required to use StructType, and inside it we specify all the required columns. For each column we use a StructField, which takes three parameters. The first parameter is the name, so that is ID. The second parameter is the type, so I am going to use IntegerType. The last parameter says whether the column is nullable or not, so I will pass True, meaning this column could hold null values. For the second one, we can simply copy the first row, add another column, call it name, and make its type StringType. Let me add another one for the age, and the last one for the salary. Now we are good with the schema: we have defined it as ID, name, age and salary. Next, let me try to bind this schema with the actual data. How do we do that? Let me create a new DataFrame with spark.createDataFrame. The first argument is the actual data, and the second is the schema. Now let me check whether we can successfully see the data from this DataFrame. Here it is showing an error; this is because we have not used brackets after the type names, so let me add the parentheses in all those places and execute it again. It shows an error once more, because we put the small round bracket first and then the big square one; the square-bracketed list of fields needs to go inside StructType's round brackets, so we have to swap them. This is the syntax you have to remember, or you can also check it on Google. And here, as we can see, the DataFrame has four columns, ID, name, age and salary, and the data is there as well.
In the next video, we will see how we can read a particular CSV file whose column names come out as SOID, SO date, item code and so on. Instead of those, the columns should be named SO_ID, SO_date and so on. Whatever column names we want, we can specify them, and use that schema along with the data types while reading the file. Thank you so much for watching this video. See you in the next video.