8 Dimensionality reduction

Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#perform-linear-dimensional-reduction

Why do we need to do this?

Imagine each gene represents a dimension, or an axis on a plot. We could plot the expression of two genes with a simple scatterplot. But a genome has thousands of genes. How do you collate the information from all of those genes in a way that allows you to visualise it in a two-dimensional image? This is where dimensionality reduction comes in: we calculate meta-features (here, principal components), each of which combines the variation of many different genes. From thousands of genes, we end up with tens of meta-features.
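To make the idea concrete, here is a minimal sketch in base R using simulated data (not real expression values). The matrix sizes, the two hidden "programs" and all variable names are assumptions made for illustration; Seurat's own PCA step, covered in the tutorial linked above, works on the real object instead.

```r
# Simulated data only: 200 cells x 1000 genes, not real expression values.
set.seed(42)
n_cells <- 200
n_genes <- 1000

# Two hidden "programs" drive correlated variation across many genes.
program  <- matrix(rnorm(n_cells * 2), nrow = n_cells)   # cells x 2
loadings <- matrix(rnorm(2 * n_genes), nrow = 2)         # 2 x genes
noise    <- matrix(rnorm(n_cells * n_genes), nrow = n_cells)
expr     <- program %*% loadings + noise                 # cells x genes

# PCA: each principal component is a weighted combination of all genes.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Keep the first 10 meta-features instead of 1000 gene axes.
meta_features <- pca$x[, 1:10]
dim(meta_features)   # 200 rows (cells), 10 columns (meta-features)

# Share of the total variance captured by each of the first 10 components.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained[1:10], 3)
```

Because only two programs were simulated, most of the structured variation should land in the first two components, with the rest capturing noise.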

8.1 Determine the ‘dimensionality’ of the dataset

Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#determine-the-dimensionality-of-the-dataset
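
One common heuristic for this step is an elbow plot, which ranks the principal components by how much variation each explains. The sketch below assumes `pbmc` is the Seurat object from the earlier steps with PCA already run; `ndims = 20` is an arbitrary value chosen for illustration.

```r
library(Seurat)

# Assumes `pbmc` already has a "pca" reduction (RunPCA() from the earlier step).
# Plot the standard deviation of each of the first 20 principal components.
ElbowPlot(pbmc, ndims = 20)
```

Look for the point where the curve flattens out: components beyond that "elbow" add little extra variation, so they are candidates to leave out of the downstream steps.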

8.2 Run non-linear dimensional reduction (UMAP/tSNE)

Note that we are making our UMAP before clustering, whereas the linked tutorial clusters the cells first.

Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#run-non-linear-dimensional-reduction-umaptsne
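
A minimal sketch of this step, assuming `pbmc` is the Seurat object from the earlier steps with PCA already run. The choice of `dims = 1:10` is an assumption taken from the linked tutorial; use the number of components you chose in section 8.1.

```r
library(Seurat)

# Assumes `pbmc` already has a "pca" reduction from the earlier step.
# dims = 1:10 is an assumed choice; substitute the number of principal
# components you settled on in section 8.1.
pbmc <- RunUMAP(pbmc, dims = 1:10)

# No clusters exist yet, so cells are coloured by their current identity
# (typically the sample or project name) rather than by cluster.
DimPlot(pbmc, reduction = "umap")
```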

Challenge: PC genes

You can plot gene expression on the UMAP with the FeaturePlot() function.

Try out some genes that were highly weighted in the principal component analysis. How do they look?
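
If you are unsure where to start, one possible approach is sketched below. It assumes `pbmc` already has both a PCA and a UMAP reduction; picking four genes from the first principal component is an arbitrary choice for illustration, and `pc1_loadings` and `top_genes` are names made up for this example.

```r
library(Seurat)

# Assumes `pbmc` has "pca" and "umap" reductions from the steps above.
# Loadings() returns a genes x components matrix of PCA weights.
pc1_loadings <- Loadings(pbmc[["pca"]])[, 1]

# Take the four genes with the largest absolute weight on PC 1.
top_genes <- names(sort(abs(pc1_loadings), decreasing = TRUE))[1:4]

# One UMAP panel per gene, coloured by expression level.
FeaturePlot(pbmc, features = top_genes)
```

Try changing the column index to look at genes from other principal components.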

8.3 Save

You can save the object at this point so that it can easily be loaded back in without having to rerun the computationally intensive steps performed above, or easily shared with collaborators.

saveRDS(pbmc, file = "pbmc_tutorial_saved.rds")
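
To load a saved object back in, use `readRDS()`. The sketch below shows the round trip on a small stand-in object written to a temporary file, so it can be run on its own; `toy`, `path` and `restored` are names made up for this example.

```r
# Stand-in object for illustration; in the workshop this would be `pbmc`.
toy <- list(counts = matrix(1:6, nrow = 2), note = "example")

# Write the object to a temporary .rds file, then read it back.
path <- tempfile(fileext = ".rds")
saveRDS(toy, file = path)
restored <- readRDS(file = path)

# The restored object is an exact copy of the original.
identical(toy, restored)

# In the workshop, the equivalent call would be:
# pbmc <- readRDS(file = "pbmc_tutorial_saved.rds")
```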
