In our previous video on principal component analysis we used the wine dataset. These data are the result of a chemical analysis of wines grown in the same region in Italy, but derived from three different cultivars. The data include 13 features reporting the quantities of chemical components. How do I know which chemical components are the most significant for differentiating between these cultivars? With Rank, of course. This widget scores features with several scoring methods based on their relation to the class. Connect the File widget to Rank. Rank displays two scoring methods by default, but we can display more if we want. Say I want to see the scores for gain ratio, Gini, and ReliefF. Now I want to select the features with the highest ReliefF score. By default, the top five features are selected and are already on the output.

Now I want to see how well these features relate to the class. Let's use some visualizations for that. Connect Box Plot to Rank and inspect the first feature. Use group by wine to see the results for each class separately. Box Plot reports the mean, median, standard deviation, and quartiles of each feature. The mean is displayed as a vertical blue line, the median as a yellow one. The blue highlight denotes the standard deviation, while the dotted lines mark the first and the third quartile. The wines seem to have very distinct distributions of flavonoid concentration, so it seems this feature separates the classes very well.

But there must be an even better way of inspecting our features. How about distributions? Let's see. The Distributions widget displays a value density plot for a given feature. We can display value distributions for each class separately. For flavonoids, these three distributions seem to be well separated. Flavonoids are likely one of the most important features in the dataset, as the separation is less pronounced for other features. The Rank widget can score and rank features both for classification and regression.
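Outside Orange's GUI, the same kind of feature ranking can be sketched in Python. The snippet below uses scikit-learn's mutual information scorer on the same wine dataset; mutual information is not one of the scorers shown in the video (gain ratio, Gini, ReliefF), but it ranks features by their relation to the class in the same spirit:

```python
# A minimal sketch of class-based feature scoring, analogous to (but not
# identical with) what Orange's Rank widget computes.
from sklearn.datasets import load_wine
from sklearn.feature_selection import mutual_info_classif

data = load_wine()
# Score each of the 13 features against the cultivar class.
scores = mutual_info_classif(data.data, data.target, random_state=0)

# Rank features from highest to lowest score and show the top five,
# mirroring Rank's default selection of the five best features.
ranked = sorted(zip(data.feature_names, scores), key=lambda p: -p[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

Flavonoids should appear at or near the top of this ranking, matching what the box plot and distribution views suggest visually.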
Say we want to analyze the somewhat dated housing dataset, where we would like to check which features best correlate with house prices in Boston suburbs. It seems it's the economic status of the inhabitants and the average number of rooms. Sort of obvious, but it's still great to see this directly from the data.

Today we've learned how to determine which features are the most interesting in our dataset, and how to use feature scores for plotting interesting visualizations. Almost every data mining problem describes the data with features, which makes feature scoring one of the best-loved techniques in the field.
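As an aside, the regression-style scoring used on the housing dataset can be sketched by ranking features by their absolute Pearson correlation with the target. The data below are synthetic stand-ins (not the actual Boston housing values), constructed so that the number of rooms drives the price most strongly:

```python
# A minimal sketch of feature scoring for regression: rank features by the
# absolute Pearson correlation with a continuous target. The arrays here are
# synthetic illustrations, not the real housing data.
import numpy as np

rng = np.random.default_rng(0)
n = 200
rooms = rng.normal(6, 1, n)    # synthetic "average number of rooms"
lstat = rng.normal(12, 4, n)   # synthetic "economic status" proxy
price = 5 * rooms - 0.8 * lstat + rng.normal(0, 2, n)

features = {
    "rooms": rooms,
    "lstat": lstat,
    "irrelevant": rng.normal(0, 1, n),  # a feature unrelated to price
}
ranking = sorted(
    ((name, abs(np.corrcoef(x, price)[0, 1])) for name, x in features.items()),
    key=lambda p: -p[1],
)
for name, r in ranking:
    print(f"{name}: |r| = {r:.2f}")
```

With this construction the rooms feature comes out on top and the unrelated feature last, which is exactly the kind of ordering Rank presents for regression problems.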