Hey guys, welcome back to the channel. This is a continuation of the Azure Databricks tutorial series. Before going forward with this video, if you haven't watched the last two videos of this series, I would strongly recommend watching them first. In the first video, we discussed the introduction to Azure Databricks, and in the second video, we discussed how to create the Azure Databricks service inside the Azure portal. So let's get started with today's video. In this video, we will understand the top-level concepts we are going to use inside Azure Databricks.

The first concept is the workspace. The workspace is an environment for accessing all your Databricks assets. It is the place where we access all of our Databricks assets, and those assets could be your notebooks, libraries, dashboards, and experiments (an experiment is an MLflow concept). The workspace organizes these assets into folders and provides access to data objects and computational resources. So this is the one place where we can create and develop all of these objects.

As I told you, a notebook is a web-based interface to documents that contain runnable commands, visualizations, and narrative text. Whatever coding or development we want to do, we do it inside notebooks. Here we can write our code in different languages: Python, Scala, or SQL are all supported.

Next is folders. Folders are used to maintain grouping and hierarchy. Let's assume you are working on an ERP-based project, and inside the ERP we have different modules like the sales module, the purchase module, and the inventory module. We don't want to keep all our development in a single folder.
We want to create different folders and keep our development organized in them. For example, our base folder could be the project name, maybe P1, and inside P1 we can create different folders to manage these modules: one called sales, then purchase, and after that inventory. Whatever development is needed, if it is related to sales we create the notebooks under the sales folder, and likewise for purchase and inventory. So depending on the module, we can manage the structure.

Next is the library. What is a library? A library is a package of code available to notebooks or jobs running on your cluster. The Databricks Runtime includes many libraries, and you can also create your own. You can simply understand that Databricks comes with predefined libraries: whenever one is required in your code, you can just use it. We will also see the option for creating new libraries, which we can then utilize in future work. A library is something like a function inside SQL Server: you create it a single time and use it in your future code as well.

Next is MLflow. An experiment is a collection of MLflow runs for training a machine learning model; MLflow is used for that purpose.

That is what I wanted to discuss about the workspace. Let me quickly go to the browser. Here inside the workspace, as we can see, we have the options Shared and Users. Let me go into Shared; inside Shared we can create notebooks, libraries, folders, and MLflow experiments. Let me create a new notebook. While creating a new notebook, it asks to attach a cluster, but as of now we have not created a cluster.
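To make the SQL Server function analogy concrete, here is a minimal, illustrative Python sketch (not Databricks-specific; the function name and logic are invented for the example): a small reusable "library" function that is written once and then reused anywhere it is needed.

```python
# A tiny reusable "library" function, written once and reused across
# later code, much like a SQL Server user-defined function.

def standardize_name(raw: str) -> str:
    """Trim extra whitespace and title-case a customer name."""
    return " ".join(raw.split()).title()

# Reuse the same function anywhere it is needed:
print(standardize_name("  john   SMITH "))  # -> John Smith
```

In Databricks, you would package helpers like this into a library, attach it to a cluster, and import it from any notebook running on that cluster.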
So let me leave that, call the notebook test, and click on Create. It will create a new notebook, but that notebook will not be attached to any cluster, so we cannot run anything here yet. At the top we can see that the Python language is selected by default, but you can also select SQL, Scala, or R; you can pick any one of these languages from here, or from this other place as well. So there are two places from which we can select it.

Now let me discuss the next concept, which is Data. This is very important. What is the Databricks File System? The Databricks File System (DBFS) is a distributed file system mounted into the Azure Databricks workspace and available on Azure Databricks clusters. Simply put, it is the file storage, and in our scenario, because we are using this inside Azure, an Azure Blob storage account is created behind the scenes when we create the cluster. Don't worry if that is not very clear; we'll see it in our upcoming videos. As of now, you can just understand that the Databricks File System is a collection of files that can live in your Azure Blob storage.

Next is the database. As I told you, we can also create databases and tables inside Azure Databricks. If you have knowledge of SQL Server, then you can understand it the same way: a database is a collection of objects, and a table is one object inside the database. Let me quickly go to the portal and show you the Data menu that you can see here. Once we click on it, we can see the Data Explorer, and here we can write our SQL queries. We have not created any table yet, but we can create tables here.
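The database/table relationship described above works the same way in any SQL system. As a minimal runnable sketch using Python's built-in sqlite3 module (purely to illustrate the analogy; in Databricks you would run the equivalent SQL in a notebook cell against a cluster):

```python
import sqlite3

# A database is a collection of objects; a table is one object inside it.
conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO sales (amount) VALUES (200.0), (49.5)")

total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # -> 249.5
conn.close()
```

The CREATE TABLE / SELECT syntax carries over almost unchanged to the SQL cells we will write in the Data Explorer in upcoming videos.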
We'll see that in our upcoming videos. For now, you can just understand that this is the data pane, where we can integrate or mount your Azure Blob storage; Databricks will access the blob storage directly from here.

Next is Compute. Compute is very important because the cluster is the backbone of Azure Databricks. What is a cluster? A cluster is a set of computation resources and configurations on which you can run your notebooks and jobs. There are two types of clusters available inside Azure Databricks: one is the all-purpose cluster, and the second is the job cluster. The code we write inside a notebook runs on a cluster, so this is very important. I will record a separate video with a detailed explanation of clusters, so don't worry for now; just understand that inside Compute we have two types of clusters: all-purpose clusters and job clusters.

Next is the Databricks Runtime, which is also very important. You have to understand it as the set of core components that run on the clusters managed by Databricks. So it runs inside the cluster, and Databricks manages it; the Databricks Runtime is nothing but the core components of the cluster. As I told you, Databricks offers several types of runtimes. The first type is the Databricks Runtime with Conda. The second is the Databricks Runtime for ML, which is for machine learning purposes. Next is the Databricks Runtime for Genomics, and then Databricks Light. These are the runtimes available inside Databricks.

Now, authentication. The authentication part is very important because this is where we can control access.
Let's assume we have 10 developers, all working on the same project. On that project, you can grant access as per your need. For example, for junior developers we may not want to grant full access; we can restrict their access to resources by using these authentication concepts. There are three of them. The first is the user: a unique individual who has access to the system. The second is the group: a collection of users. We can grant access to the group, and whatever users are available in that group will inherit the access from it. The last is the access control list (ACL). This is very important: it is a list of permissions attached to workspaces, clusters, jobs, tables, and experiments. So at the workspace level, cluster level, job level, table level, and experiment level, we can grant access to any individual. An ACL specifies which users or system processes are granted access to an object, as well as what operations are allowed on that asset. For example, a user may have access to create a notebook but not to delete it; that type of permission we can add inside the ACL. We'll see that in our upcoming videos. Don't worry for now; just understand that we have three different types of access: one at the user level, which applies to an individual user; second at the group level, where if we have a group with multiple users under it, we can grant access to all of them at once; and third, the access control list, with which we can restrict, at the object level, what type of access we want to provide. I hope, guys, you have understood the top-level concepts inside Azure Databricks. Thank you so much for watching this video.
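The user/group/ACL relationship can be sketched as a toy Python model (this is not the Databricks permissions API; the object names, users, and operations are invented for illustration): a user is allowed an operation if it is granted to them directly or inherited through a group they belong to.

```python
# Toy model of the three access concepts: users, groups, and an ACL
# with group inheritance. Not the real Databricks API.

groups = {"junior_devs": {"alice", "bob"}}

# ACL: object -> {principal: set of allowed operations}
acl = {
    "notebook:test": {
        "junior_devs": {"read", "run"},     # group-level grant
        "carol": {"read", "run", "delete"},  # user-level grant
    }
}

def allowed(user: str, obj: str, op: str) -> bool:
    """Allowed if the operation is granted directly or via a group."""
    perms = acl.get(obj, {})
    if op in perms.get(user, set()):
        return True
    return any(op in perms.get(group, set())
               for group, members in groups.items() if user in members)

print(allowed("alice", "notebook:test", "run"))     # -> True (via group)
print(allowed("alice", "notebook:test", "delete"))  # -> False (restricted)
```

This mirrors the junior-developer scenario above: alice inherits read/run from the junior_devs group but cannot delete the notebook, while carol has a direct user-level grant that includes delete.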
If you liked this video, please subscribe to our channel to get many more videos. See you in the next video. Thank you so much.