Preparing your data so that programs can read it is probably the most important task in data mining. Today I'll show you how to load your data in Orange. Orange can read several data formats, such as Excel, tab-separated and comma-separated files. The data is normally a table where data instances are in rows and data attributes are in columns.

But why just talk and not walk? Let's make our own data. I'll use Google Sheets to create a simple data set. Say we have a group of people and we would like to know whether we can predict their gender from their physical characteristics. Okay, our people will have a name, so we know who's who. Then we also know their gender, height and weight, and we know what they look like, so let's also put down their eye and hair color.

I have this data set for my friends, Jill, Jack, Mark, Ann and so on. Their names are all strings, that is, text. Gender won't be a string but a categorical value, because each person belongs to one of two groups, male or female. My friends naturally have different heights and weights, which are numerical values. Some of my friends are tall, others short, some slim and others a bit chubby. Eye color and hair color are again categorical values, since the eyes can be blue, brown or green, and the hair black, brown, blonde or red.

Now we have our data. Still, other than providing the data, I have not explicitly specified attribute types, so let's hope Orange will guess them correctly. Now we load our data in Orange. Let's copy a shareable link and paste it into the File widget. Let us first view the data in a Data Table. Orange correctly assumed that the first column, with the names, contains a meta attribute, but it incorrectly made hair color our class variable. Hmm, maybe I should have put gender as the last column in the table. But let us fix this in Orange. We can rearrange the data with the Select Columns widget.
We'll put the hair color attribute into the Features box and gender into the Target Variable box. One quick check in a Data Table and we're good. You can also save the data to your computer with the Save widget. It's best to save the data in Orange's native tab format, since it automatically appends header annotations for attributes.

You might want to define your data locally. I will use the same data as before, only this time I'll copy it to Excel. I will add two extra rows under the attribute names, and set the variable type in the first one and the variable kind in the second. For the attribute type, I will use C for continuous, that is, numerical attributes, D for discrete, or categorical, attributes, and S for string values. For the attribute kind, I will write class under gender and meta under the attributes that provide some extra information. Now our data is all set for the analysis.

Today we've learned how to prepare our data, manually annotate it and subsequently adjust which feature is considered a class attribute.
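
If you prefer to see the annotated header written out, here is a minimal sketch of the three header rows described above, built as tab-separated text with Python's standard csv module instead of Excel. The column order follows the walkthrough, but the heights, weights and colors in the two data rows are made-up placeholder values, not the real data set, and the function name build_annotated_table is just an illustrative choice. The type codes are written in lowercase here, which I believe Orange accepts.

```python
import csv
import io

# First header row: attribute names, in the order used in the walkthrough.
HEADER = ["name", "gender", "height", "weight", "eye color", "hair color"]
# Second header row: variable type (s = string, d = discrete, c = continuous).
TYPES = ["s", "d", "c", "c", "d", "d"]
# Third header row: variable kind (meta, class, or empty for ordinary features).
KINDS = ["meta", "class", "", "", "", ""]

# Placeholder rows: the names come from the walkthrough, the values are invented.
ROWS = [
    ["Jill", "female", "165", "58", "blue", "blonde"],
    ["Jack", "male", "182", "80", "brown", "black"],
]


def build_annotated_table(header, types, kinds, rows):
    """Return tab-separated text: three header rows followed by the data rows."""
    for row in [types, kinds, *rows]:
        if len(row) != len(header):
            raise ValueError("every row must have one cell per column")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerow(types)
    writer.writerow(kinds)
    writer.writerows(rows)
    return buffer.getvalue()


annotated_text = build_annotated_table(HEADER, TYPES, KINDS, ROWS)
print(annotated_text)
```

Writing annotated_text to a file with a .tab extension should give you something the File widget can open, with gender already marked as the class and name as a meta attribute, so no rearranging in Select Columns is needed.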