Hey guys, welcome to SSMU Tech. So today we are going to see the cache sink in Azure Data Factory. So what is the cache sink? Azure Data Factory has added the ability to cache your data stream to a sink that writes to the Spark cache instead of a data store. So far we have loaded data into physical destinations, but now our aim is to load the data into the Spark cache, and after that we will use that sink. In a mapping data flow we can reference this cached data within the same flow many times using the cached lookup. So let me go to the browser; we will try to understand the requirement first and then implement it. Here we have this sales file in Blob Storage, and we want to load it from here into a SQL Server table. Let me go into SSMS. Under SSMS we have this Sales table, and it already has data. If we go and check the source file, we can see it doesn't have the sales order ID column; it has only the sales order date, customer ID, quantity, value and country. So the sales order ID will have to be generated automatically. One way would be to modify the existing table and mark this as an identity column, so it gets incremented automatically one by one, but suppose we don't have permission to change anything at the destination level, that is, on the database. Then we have to build something in Azure Data Factory itself to generate this sales order ID. So first we have to pick the maximum sales order ID from this table, which is 3, and keep it somewhere inside the data flow. That somewhere will be the cache sink. After that, when we load the data into this table, we will use this maximum sales order ID from the cache as the starting offset.
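The maximum-ID query used later in the source transformation would look roughly like this; the table and column names (`dbo.Sales`, `SalesOrderID`) are assumptions based on what is shown in SSMS:

```sql
-- Assumed table/column names; returns a single row with one column, SOID.
SELECT MAX(SalesOrderID) AS SOID
FROM dbo.Sales;
```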
So let me go into Azure Data Factory and we will try to implement this practically. Let me add a new data flow: click on New data flow, turn on the debug option, and let me name this data flow CacheSink. Now we are required to add a source. Our source is SQL Server and the Sales table, and we have already created a dataset for it, so let me click on the source, go to the dataset dropdown, and select that dataset. Let me open it so we can verify: this dataset refers to the same Sales table, as we can see. Let me close this. But we don't want to keep this source as it is, because it would return all the columns, sales order ID and the rest; our purpose is to keep only the maximum sales order ID. So how can we modify this? In the source options we have a Query option. In the dataset itself we don't have that option — let me open it again: there we are only required to select a table, we cannot use a query. But under the source options we can choose Query and specify one. So let me copy the query we have written and paste it there, then import the projection. It will have only one column, SOID, and that is the maximum SOID from the Sales table. This is loading, so we need to wait... and here we can see the projection imported successfully. Let me go to the data preview and refresh: we get only one column, the sales order ID, and the value is 3. Now let me connect this to a sink, and the sink will be a cache sink. Once we select the sink, we see the dataset options we have already discussed, including the inline dataset.
Here we can also see the Cache option. Once we select Cache, everything else disappears, because the data will be stored inside the Spark cache. Let me name this sink MaxSOID. So that part is done. Next, let me load the sales data into the table. Let me add another source; this one uses the dataset we created for getting the sales file from Azure Blob Storage. As I have already created the dataset, let me select it and open it to verify: it points to the sales.csv file under the input folder, the same file we are referring to. Now let me go to the data preview and refresh, and we can see all the data from the sales file in Azure Blob Storage. As we can see in the data, the sales order ID is the one new column we want to generate. We have already seen the surrogate key transformation, so let me use it here. This transformation generates a new column with incremented values based on whatever start and step we specify. Let me call the column SOID, starting from 1 with a step of 1, so it will produce values like 1, 2, 3, 4, 5 and so on. Next I want to offset these SOID values by the value we already have in the cache sink. For example, if the surrogate key produces 1, 2, 3 and the cache sink already holds the value 3, then after adding them the actual SOID values will be 4, 5, 6. For that we have to use the derived column transformation. So let me add one; here we want to modify the SOID, so let me select the SOID column, and in the expression builder we are required to write the expression.
First of all, in the expression builder we can see the Cached lookup section, where MaxSOID is listed. We have to use this MaxSOID from the cache sink: we take its outputs, and since it holds only one row, we index it with 1 and then specify the column name, SOID. This is the expression we write to read the data from the cache sink. Let me refresh the preview, and we can see whatever data the cache holds; it shows the value 3. Now what next? We need to add the SOID generated by the surrogate key transformation. If I go here, minimize this, and look under All, we have the SOID column; let me add it and try to refresh. As we can see, it throws an error, because the cached SOID value is coming through as a string. Let me cancel and discard these changes, quickly go back to the cache sink, and refresh it again: the sink is holding a string value, as we can see, and that is why the addition fails. How can we remove this error? Let me go to the source feeding this sink, open the Projection tab, overwrite the schema, and select integer. After making this change, let me go back to the sink and refresh: the data type has changed, and the value now has the integer type. Now let me return to the derived column, go into its settings, and write the expression again: the cache sink outputs, index 1, dot SOID, plus the SOID from the table. Now we can see there is no error; let me refresh, and we can see the output data here.
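The derived-column expression described above would look roughly like this in the data flow expression language, assuming the cache sink is named MaxSOID and the surrogate key column is SOID:

```
/* Read row 1, column SOID, from the MaxSOID cache sink
   and add it to the surrogate key value of the current row */
MaxSOID#outputs()[1].SOID + SOID
```

The `#outputs()` function returns the cached rows as an array with 1-based indexing, which is why the single cached row is addressed as `[1]`.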
So as we can see, the original values were 1, 2, 3, 4 and so on, and now they are updated to 4, 5, 6, 7. Let me click Save and finish. Now let me add the actual sink, which is the table; as we have already created the dataset, let me use that one. Let me go into the mapping and disable auto mapping, since we need to do this manually: SOID should be mapped with SOID, customer ID with customer ID, quantity with quantity, and value with value. Now everything is mapped successfully. Let me go to the data preview and refresh: it will insert the data, and the values start from 4, as we can see here. Let me publish this quickly, then go to a pipeline to execute it so we can verify the data. Open a new pipeline and add the Data Flow activity; as the publish is completed, let me select the same data flow we created and debug it. Once it has executed, the data should be inserted into the destination table; as of now we can see only 3 rows. This is in progress, so we have to wait... and we can see it has executed successfully. Let me go and verify: we have all the data in the table. So in this video we have seen how we can use the cache sink for a single value. If we have a table that we want to look up against and load that data into the destination, how we can do that we will see in the next video. Thank you so much, see you in the next video.
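To double-check the result in SSMS, a query along these lines works; the table name `dbo.Sales` is again an assumption:

```sql
-- Assumed table name; the newly loaded rows should continue
-- from SalesOrderID = 4, after the 3 rows that already existed.
SELECT *
FROM dbo.Sales
ORDER BY SalesOrderID;
```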