In the previous video, we explored loading data from a 10x Genomics matrix file. In this video, we'll show you how to load data from a spreadsheet file. We'll demonstrate this with a recent paper on gene expression in the cells of a social amoeba called Dictyostelium. In the supplemental information of the paper, there is a link to an Excel file with read counts and analysis. The file has multiple worksheets, with the first one containing the count data and the second one containing normalized data. We'll work with the normalized data. To avoid confusion, let's first isolate it by copying it into a new file with a single worksheet. We'll save this new file to the desktop.

Now it's time to load the data into Orange. We'll first need the Load Data widget from the Single Cell widget group. Since the widget remembered the file list from our previous video, we'll first remove this list and then drag our data file from the desktop to the widget. The data includes 81 cells and over 11,000 genes. We should make sure that Orange knows there is one header row and that the first column contains labels. Now we're ready to load the data.

Let's first check the data in the Data Table. It's nothing special. Lots of zeros, like any single-cell data set. Next, we will filter out the genes that are expressed in only a few cells. We'll use the Filter widget, filter on genes by detection count, and set the lower bound to 30 cells. There are about 8,000 genes that are expressed in at least 30 cells.

Just for fun, let's estimate the distances between cells. We'll use the Distances widget, set the distance metric to cosine because the number of genes, or features, is large, and construct a hierarchical clustering. Here it is. The clustering looks very similar to the dendrogram the authors reported in the paper.
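For those who like to see the steps written out, here is a minimal sketch in plain Python of what we just did: read a table with one header row and labels in the first column, keep the genes detected in enough cells, and compute cosine distances between cells. This is not how Orange implements it. The function names are made up for illustration, the tiny table is invented rather than taken from the paper, and the sketch assumes that rows are genes and columns are cells.

```python
import csv
import io
import math


def parse_table(text):
    """Parse tab-delimited text with one header row and labels in column one.

    Assumption: rows are genes and columns are cells.
    Returns (cell_ids, {gene_name: [value per cell]}).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    cell_ids = rows[0][1:]
    genes = {row[0]: [float(v) for v in row[1:]] for row in rows[1:] if row}
    return cell_ids, genes


def filter_genes(genes, min_cells):
    """Keep genes detected (value > 0) in at least `min_cells` cells."""
    return {
        name: values
        for name, values in genes.items()
        if sum(1 for v in values if v > 0) >= min_cells
    }


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(angle), between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # convention chosen here for all-zero profiles
    return 1.0 - dot / (norm_u * norm_v)


def cell_distances(cell_ids, genes):
    """Pairwise cosine distances between cells over the retained genes."""
    if not genes:
        raise ValueError("no genes left after filtering")
    profiles = list(zip(*genes.values()))  # one expression profile per cell
    return {
        (cell_ids[i], cell_ids[j]): cosine_distance(profiles[i], profiles[j])
        for i in range(len(cell_ids))
        for j in range(i + 1, len(cell_ids))
    }


# Toy table, invented for illustration only (3 genes, 3 cells).
TOY = "gene\tc1\tc2\tc3\ng1\t1\t2\t0\ng2\t0\t0\t5\ng3\t2\t4\t1\n"

cell_ids, genes = parse_table(TOY)
kept = filter_genes(genes, min_cells=2)  # the video uses a lower bound of 30
distances = cell_distances(cell_ids, kept)
```

The resulting pairwise distances are what a hierarchical clustering step would take as input. Clustering itself is left out of the sketch.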
In fact, if we select the lower branch of the clustering tree and check the IDs of the cells, they indeed appear in the same cluster, which is marked as cluster A in the paper. Our cluster contains cells d5, d12, d18, all the way through d90. The only difference is cell d3, which is likely present due to differences in gene filtering and clustering parameters.

We can also load other types of tab-delimited data. We'll demonstrate this with the Broad Institute's Single Cell Portal. Let's find the data on dividing neural cells and download the expression data on neurons from the spinal cord. This time, the data comes in a zipped tab-delimited format. We'll use a new instance of the Load Data widget, clear any previously used files, and drag the zipped file to the widget. We have 185 cells and over 25,000 genes. Just to check, here is a Data Table and a t-SNE visualization.

Orange can also load other data formats, including Loom, but that's enough for now. Let's explore what to do with all this data. Check out the next videos in our series for single-cell data processing, marker genes, clustering, and more.