8 Dimensionality reduction
Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#perform-linear-dimensional-reduction
Why do we need to do this?
Imagine each gene represents a dimension, or an axis on a plot. We could plot the expression of two genes with a simple scatterplot. But a genome has thousands of genes: how do you collate the information from all of those genes in a way that allows you to visualise it in a two-dimensional image? This is where dimensionality reduction comes in. We calculate meta-features, each of which combines the variation of many different genes. From thousands of genes, we end up with tens of meta-features.
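To make the idea of meta-features concrete, here is a minimal sketch in base R using simulated data rather than the PBMC dataset. The matrix `expr`, the two cell groups and all of the numbers are assumptions made up for illustration. The sketch uses `prcomp()` directly, not the Seurat workflow from the tutorial.

```r
set.seed(1)

# Simulated data (an assumption for illustration): 200 cells x 1000 genes
n_cells <- 200
n_genes <- 1000
group <- rep(c("A", "B"), each = n_cells / 2)
expr <- matrix(rnorm(n_cells * n_genes), nrow = n_cells, ncol = n_genes)

# Cells in group B have higher expression of the first 50 genes
expr[group == "B", 1:50] <- expr[group == "B", 1:50] + 2

# PCA: keep the first 10 principal components (the meta-features)
pca <- prcomp(expr, center = TRUE, scale. = TRUE, rank. = 10)

# Each cell is now described by 10 values instead of 1000
dim(pca$x)

# The first two meta-features are enough to draw a 2D picture of the cells
plot(pca$x[, 1], pca$x[, 2],
     col = ifelse(group == "A", "steelblue", "tomato"),
     xlab = "PC1", ylab = "PC2")
```

Each principal component is a weighted combination of all 1000 genes, so a single axis can summarise a pattern that is spread across many genes.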
8.1 Determine the ‘dimensionality’ of the dataset
Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#determine-the-dimensionality-of-the-dataset
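One common way to judge how many principal components to keep is an elbow plot. The sketch below assumes that `pbmc` is the Seurat object from the earlier steps and that PCA has already been run on it as in the linked tutorial. Where the elbow sits is a judgement call, so treat the result as a guide rather than a rule.

```r
library(Seurat)

# Assumption: `pbmc` already contains a "pca" reduction

# Standard deviation of each principal component, ranked
ElbowPlot(pbmc)

# The numbers behind the plot, in case you want to inspect them directly
pc_sd <- Stdev(pbmc, reduction = "pca")
round(pc_sd, 2)
```

Look for the point where the curve flattens out: components beyond it add little extra variation.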
8.2 Run non-linear dimensional reduction (UMAP/tSNE)
Note we are making our UMAP before clustering.
Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#run-non-linear-dimensional-reduction-umaptsne
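A minimal sketch of this step, assuming `pbmc` already has PCA results. The choice of `dims = 1:10` follows the linked tutorial and is an assumption here; use the number of components you settled on in section 8.1.

```r
library(Seurat)

# Assumption: the first 10 principal components are used as input
pbmc <- RunUMAP(pbmc, dims = 1:10)

# Plot the cells in UMAP space
DimPlot(pbmc, reduction = "umap")
```

Because clustering has not been run yet, the cells are coloured by their current identity class rather than by cluster.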
Challenge: PC genes
You can plot gene expression on the UMAP with the FeaturePlot() function.
Try out some genes that were highly weighted in the principal component analysis. How do they look?
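If you are unsure where to start, the sketch below shows one possible approach. It assumes `pbmc` has both a "pca" and a "umap" reduction. The choice of the first principal component and of four genes is arbitrary, and `top_genes` is a name introduced here for illustration.

```r
library(Seurat)

# Gene loadings: one row per gene, one column per principal component
loadings <- Loadings(pbmc, reduction = "pca")

# The four genes with the largest absolute weight on the first component
top_genes <- names(sort(abs(loadings[, 1]), decreasing = TRUE))[1:4]
top_genes

# Plot the expression of those genes on the UMAP
FeaturePlot(pbmc, features = top_genes)
```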
8.3 Save
You can save the object at this point so that it can easily be loaded back in without having to rerun the computationally intensive steps performed above, or easily shared with collaborators.
saveRDS(pbmc, file = "pbmc_tutorial_saved.rds")
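To load the object back in later, or on a collaborator's machine, a sketch like the following should work. It assumes the file sits in the current working directory and that Seurat is installed.

```r
library(Seurat)

# Read the saved object back into the session
pbmc <- readRDS("pbmc_tutorial_saved.rds")

# Print a summary to check that it loaded as expected
pbmc
```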