A common caveat with any biological data is comparing multiple experiments. The so-called batch effects are often the primary source of unreliability, and single-cell gene expression is no exception. In this video, we will illustrate these effects and present suitable countermeasures.

We will use a data set on immune system cells in human bone marrow. The 3,000 cells come from two experiments: control, or resting, cells and cells treated with interferon to induce an immune response. Let's try to make sense of the data first. After applying scaling and normalization, a t-SNE plot reveals several cell clusters. If we color the dots representing the cells by the experiment source, we can see batch effects at work. The clustering is driven primarily by the source of the experiment.

Now, because the two experiments should contain similar cell types, we can attempt to match them with data set alignment. Data set alignment relies on canonical correlation analysis and finds shared directions in the space of the two data sets. Let's select 23 canonical components using the handy shared correlation plot. After discarding experiment-specific variation, the clusters remain, but they're now composed of mixtures from the two experiments. Neat!

Let us now see what happens with the actual cell types. We will score the cells according to the expression of marker genes, a task we have explored in our previous video. This time, instead of entering our own list of markers, we will use markers from an existing database. Marker genes will help us identify different cell types in t-SNE plots, before and after batch correction. To combine cell type scores with the batch-corrected data, we will merge cell scores and display everything in a scatter plot. Now we are able to interactively change the selection of marker genes and see the effects. Let's take dendritic cells and select all their markers.
In the original t-SNE, we can clearly see that dendritic cells form two clusters, one for each data set. In the batch-corrected t-SNE, the dendritic cells correctly form a single group. We can examine the remaining cell types, such as naive T cells or naive B cells. Notice that in the batch-corrected data, the cells of the same type appear close together, often in the same cluster. Nice! The Data Set Alignment widget can unify data coming from different sources and make it ready for downstream processing and analysis.
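
For viewers who want to see the idea behind "shared directions" outside the widget, here is a minimal sketch in Python with NumPy. It is not the widget's actual implementation. It assumes one common formulation of canonical correlation analysis for two experiments measured on the same genes: standardize each cell, then take the singular value decomposition of the cross-product between the two data sets. The function names, the synthetic data, and the sizes (200 genes, 60 and 80 cells, 2 components) are all made up for illustration.

```python
import numpy as np


def standardize(m):
    """Center and scale each column (one cell) to zero mean, unit variance."""
    m = m - m.mean(axis=0)
    return m / m.std(axis=0)


def shared_directions(x, y, n_components):
    """Hypothetical helper: x is genes x cells_a, y is genes x cells_b,
    with genes in the same order. Returns one embedding per data set and
    the strength of each shared component."""
    xs, ys = standardize(x), standardize(y)
    # Cross-product compares every cell in A with every cell in B.
    u, s, vt = np.linalg.svd(xs.T @ ys, full_matrices=False)
    return u[:, :n_components], vt[:n_components].T, s[:n_components]


# Synthetic example (assumed data): two cell types shared by both experiments.
rng = np.random.default_rng(0)
n_genes = 200
programs = rng.normal(size=(n_genes, 2))        # one gene program per cell type
labels_a = rng.integers(0, 2, size=60)          # cell types in experiment A
labels_b = rng.integers(0, 2, size=80)          # cell types in experiment B
batch_shift = 2 * rng.normal(size=(n_genes, 1)) # offset present only in B

x = 3 * programs[:, labels_a] + rng.normal(size=(n_genes, 60))
y = 3 * programs[:, labels_b] + batch_shift + rng.normal(size=(n_genes, 80))

emb_a, emb_b, strength = shared_directions(x, y, n_components=2)
print(emb_a.shape, emb_b.shape)
print(strength)
```

The `strength` values play a role loosely similar to the shared correlation plot: components with large values capture structure present in both experiments, and the number of components to keep is a judgment call made by looking at where the values level off.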