 Hello, welcome to SSUnitech. So, still decide and this is continuation of SSIS tutorial. So, today we gonna discuss about term extraction transformation. So, what is term extraction transformation? So, it is used to extract noun or noun phrases in transformation input columns and provide occurrence in output. Next is, we can use term extraction transformation to discover the content of a data set. For example, we have a text file that contains the information of email messages that may provide useful information about product or depend on your business type. So, here we need to take care one thing. It can only work with either dt underscore wstr or dt underscore ntext data type and it will accept these data type in input and provide in output as well. If we have any column with text data and that column doesn't contain dt underscore wstr or dt underscore ntext data types and we want to use this transformation. So, data conversion transformation will help us. So, let's move to our scenario. So, this is our scenario. As we can see, we have a database for training history and we have a table that contains the information of emails and we want to analyze demanding courses. So, here as we can see, we have a OLDB source because we have a SQL server table that contains the information of emails. That table contains wire cat data type. That's why we have to use data conversion to convert wire cat data type to n wire cat data type. Next is we have to use term extraction to count the occurrence and then we want to load in our OLDB destination. So, let's have a look of SQL server table. Go to on SSMS. As we can see, we have term extraction table. Now, we can execute and we can see we have total 14 rows and we can check the data type for this. So, we can find it has wire cat data type instead of n wire cat. So, this is wire cat. That's why we need to use data conversion to convert this wire cat data type to n wire cat data type. So, go to on ssdt to implement the package. So, here I want to add one more package. Then we need to rename this package with term extraction. So, this is our term extraction. Now, we have to use data flow task. So, we can drag and drop this data flow task. After that, we need to double click on this data flow task. So, here we need to configure source, transformation and destination. So, our source is SQL server table. So, we have to use OLDB source. Now, double click on this OLDB source. Then we need to select the connection manager. So, click on new. As we have already made the connection. So, I want to use that one. Click on OK. Then we need to select the table. We have term extraction table. So, we can select and go to on columns. So, we have mail description. Now, click on OK. So, we have done our source. So, it would return wire cat data type, but we want unicode character. So, we have to use data conversion. So, we can drag and drop data conversion and connect with source. Then double click on this data conversion. Then we can select. So, as we can see, it's going to return dt underscore text. So, we have to convert this dt underscore text to dt underscore n text. Now, click on OK. So, we have done our data conversion. Now, we need to use term extraction. This is our term extraction. So, we can drag and drop. Then connect with data conversion. Now, double click on this term extraction. So, here we can see we have three tabs. First term extraction, second exclusion, third advance. So, here we can select copy of mail description. Then, then we can see output column names. So, this is term and then score. So, term will return the courses. So, we can rename this with courses and this is a score. So, we can rename with total count. So, it will return the count of the courses. Then move to advance. Exclusion tab will be covered later. Now, we have term type. So, in term type, we have three options. First is noun, second noun phrases and third noun and noun phrases. So, we gonna use noun and noun phrases. Second is score type. So, in score type, as we can see we have frequency and tfidf. So, tfidf is a predefined formula. So, I am not going to discuss about this. I am only discuss about the frequency. So, I am going to select this radio button. So, here we can see parameters. So, what would be the parameter? First is frequency threshold. So, in frequency threshold, we can define the value. What would be the minimum occurrence? So, I am going to define two. So, if any noun or noun phrases will occur minimum two times, then it will return in our output. Second is maximum length of term. So, I am going to set with 20. Then, we can see we have an options. So, in this option, use case sensitive term extraction. So, I am not going to select this checkbox. Once we select this checkbox, then it will treat as a case sensitive. Now, click on OK. So, here we can see we are getting a warning. Rows send to error output if rows will be lost. So, I am going to ignore this failure. We will click on this configure error out. So, here we can see we have errors. So, we can ignore failure. Then again ignore failure. You don't need to worry about it. I will cover this part once we will understand about the error handling. So, now we want to load this data in our SQL Server table. So, we have to use OLEDB destination. Then, we need to connect with term extraction. Now, double click on this OLEDB destination. So, here we have already made the connection. So, we need to select the table where we want to load the data. So, we want to load the data in term extraction destination table. Now, go to on mapping. So, as you can see total count has been mapped successfully because input column name and destination column names are same. Courses we have to map manually. Now, click on OK. Before execute this package, let's have a look of table. As we can see we don't have any record right now. So, now we need to execute this package. Click on start. As we can see five rows should be inserted in our destination. So, we can stop this package and then go to on SSMS. Then we need to execute this selected statement. As we can see we are getting the courses and number of occurrence. SSIS four times, SSRS three times, .NET two times, Power BI two times and SSIS again two times. Next is exclusion. So, here we have an option to exclude nouns and noun phrases. Those are not required. So, we have a table and that is term extraction exclude table. So, we can execute. So, here we can specify the noun and noun phrases. So, I have specified .NET. I want to truncate this table. Truncate table and then table name. We don't have any data in this table. Now, go to on shtt. Then double click on this term extraction. So, go to on exclusion tab. So, here we can see use exclusion terms. So, we can select this checkbox. Then we can see it would ask for OLEDB connection manager. So, we cannot specify nouns and noun phrases except OLEDB source. So, we have already made the connection. So, go to on tables. What is the table? So, we have term extraction exclude table. Then we have to specify the column. So, we have only one column that is exclude. So, now click on OK. Now we need to execute our package. So, once we execute our package then .NET will be ignored. As we can see we are getting only 4 rows. Last time when we run this package then that was 5. So, we can stop this package, go to on ssms and execute our selected statement. So, as we can see we are getting 4 rows and .NET is not all level here. So, thank you so much for watching this video. If you like this video, please subscribe our channel to get many more videos. And don't forget to press the bell icon to get the notification of our newly uploaded videos. Thank you so much.