In this video, we will construct data analysis workflows in Orange. Assuming you have already downloaded and installed Orange on your computer, we can begin. Orange opens with a welcome screen. We won't need it here, so I am closing it, exposing the main window of Orange. There is a toolbox with icons of Orange's components on the left and a blank canvas on the right. Orange supports the analysis of data through the construction of workflows. Orange workflows contain components that read, process, and visualize data and models. We refer to these components as widgets.
Let us start with the Datasets widget, which can read data that has already been prepared on the Orange server. We place an instance of the Datasets widget on the canvas by clicking on its icon. Here it is. Now I will double-click on the icon to expose the content of the widget. It lists various datasets, and I will use the one on socioeconomic data. The dataset is called HDI, so let me find it by typing its name in the filter box. Here it is. To load the data, I will double-click on the HDI row. Notice the green circle that indicates that the data is loaded. The widget reports 188 rows and 52 columns in this dataset.
But where is this data? The Datasets widget just loads the data. To display it, we need another widget called Data Table. I will place it on the canvas by clicking on its icon. If I open the Data Table widget, I find it empty. This is because the data we loaded with the Datasets widget has yet to be communicated to the Data Table. I will drag a link from Datasets to the Data Table widget to establish this communication. The Data Table widget displays the data as soon as I do this.
Orange widgets communicate. As soon as you change something in one widget, it outputs the result, and the widgets downstream in the workflow update their contents. For example, I can open the Datasets widget and choose some other dataset, say the one on liver disorders. Notice the immediate change in the Data Table?
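
If it helps to see this idea of communicating widgets outside the canvas, here is a minimal Python sketch. It is only a toy model of the behavior described above, not how Orange itself is implemented: the classes `Widget`, `Datasets`, and `DataTable`, the `link_to` and `load` methods, and the placeholder rows in `CATALOG` are all made up for illustration.

```python
class Widget:
    """A toy workflow node: receives data, keeps it, and passes it on."""

    def __init__(self, name):
        self.name = name
        self.data = None
        self.downstream = []  # widgets linked to this widget's output

    def link_to(self, other):
        """Draw a link to another widget and send the current output across it."""
        self.downstream.append(other)
        if self.data is not None:
            other.receive(self.data)
        return other

    def receive(self, data):
        """Store new input and push it to every downstream widget."""
        self.data = data
        for widget in self.downstream:
            widget.receive(self.data)


# Placeholder rows only; these are not the real HDI or liver disorders data.
CATALOG = {
    "HDI": ["hdi row 1", "hdi row 2", "hdi row 3"],
    "Liver disorders": ["liver row 1", "liver row 2"],
}


class Datasets(Widget):
    def load(self, dataset_name):
        """Choosing a dataset immediately updates everything downstream."""
        self.receive(CATALOG[dataset_name])


class DataTable(Widget):
    pass


datasets = Datasets("Datasets")
table = DataTable("Data Table")

datasets.load("HDI")
print(table.data)                 # None: the two widgets are not linked yet

datasets.link_to(table)           # drawing the link delivers the loaded data
print(table.data)

datasets.load("Liver disorders")  # switching datasets updates the table at once
print(table.data)
```

The point of the sketch is the last three steps: nothing reaches the table until the link exists, and after that every change upstream shows up downstream without any extra action.
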
Let's switch back to the HDI dataset. Here it is. It contains data on 188 countries that are described with 52 features. The first two features are the country's name and its human development index, and all the other features are socioeconomic indicators. For instance, the first one provides average life expectancy. Which is the country where people live longest? I can click on the column header to sort. It seems you would need to move to Hong Kong or Japan if you want to live long.
Besides life expectancy, there are many other features in this dataset. The next one is mean years of schooling. Is this feature correlated with life expectancy? One way to check this is by looking at a scatter plot. I could find the Scatter Plot widget in my toolbox, but instead, let me show you how to introduce it to my workflow in a different, faster way. I'll drag an output link from the Datasets widget and release the mouse somewhere on the empty part of the canvas. Orange now asks me what widget I want on the receiving end, and I can start typing scatter plot. Here it is. I click on its line, and an instance of the Scatter Plot widget comes up. Double-clicking on its icon shows me, by default, the first two features in my dataset, life expectancy and mean years of schooling.
Looks like the two are related. The longer you go to school, the longer you live. Of course, it is not this simple. We should not equate correlation with causation. Yet it is interesting to see that people live longer in the countries that take better care of education. I now wonder which are the countries where people spend a long time in school and have a high life expectancy. Let me select the data points that represent those countries in the scatter plot. In the workflow, the Scatter Plot widget now emits the selected data on its output; its dotted line shows that the output is ready to be used. I would like to display the data selected in the Scatter Plot in a spreadsheet.
Therefore, I need another Data Table, and I will attach it to the output of the Scatter Plot by dragging a communication line from the Scatter Plot and then selecting Data Table from the list of widgets. Double-clicking Data Table (1) now shows the data on the countries I have selected in the Scatter Plot. These include Australia, Germany, and Canada. So if you like school, these are the countries you should take a closer look at. Which are the countries with shorter education and also shorter life expectancy? We can select them in the Scatter Plot. See how the display in Data Table (1) changes immediately? There are also some countries where people attend school for a relatively long time, but life expectancy is low. Let me select them in the Scatter Plot. These are Swaziland and Lesotho.
We have just learned how to construct a simple workflow in Orange. In our workflow, we have loaded the data using the Datasets widget, displayed the entire dataset in the Data Table, and explored the relationship between a pair of variables in the Scatter Plot. In another Data Table, we displayed the data on the countries selected in the Scatter Plot. In the following video, I will show you how we can slightly change the workflow to choose the data in the Data Table and highlight the selection in the Scatter Plot. I will also talk about saving your work and loading workflows into Orange.
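
For those who like to see the scatter plot step as code, here is a rough Python sketch of the two things we did with it: checking whether two features move together, and picking out the points in one corner of the plot. The rows are made-up toy values with letter labels, not the HDI data, and the helpers `pearson` and `select` are hypothetical names used only here. Selecting with two thresholds is an assumed stand-in for dragging a rectangle over the points.

```python
from math import sqrt

# Made-up toy rows: (label, life expectancy in years, mean years of schooling).
rows = [
    ("A", 82.0, 13.0),
    ("B", 80.0, 12.0),
    ("C", 70.0, 8.0),
    ("D", 60.0, 5.0),
    ("E", 50.0, 6.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def select(table, min_life, min_schooling):
    """Keep the rows in the upper-right corner of the plot."""
    return [row for row in table
            if row[1] >= min_life and row[2] >= min_schooling]


life = [row[1] for row in rows]
schooling = [row[2] for row in rows]

r = pearson(life, schooling)
print(f"correlation: {r:.2f}")  # close to 1 means the two rise together

selected = select(rows, min_life=75.0, min_schooling=10.0)
print([label for label, _, _ in selected])
```

A coefficient near 1 only says that the two features rise together in this toy table. As noted in the video, that alone does not tell us that one causes the other.
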