4 Clustering

Why do we need to do this?

Clustering the cells lets you visualise the variability in your data and can help to separate the cells into cell types.

4.2 Choosing a cluster resolution

It's a good idea to cluster at several resolutions to see how the variability in your data is partitioned. Higher resolutions give more, smaller clusters.

resolution <- 2
# To scan a finer grid of resolutions instead:
# pbmc <- FindClusters(object = pbmc, resolution = seq(0.1, resolution, 0.1))
# FindClusters() works on the graph built by FindNeighbors(), so it does not use `reduction` or `dims`
pbmc <- FindClusters(object = pbmc, resolution = c(0.1, 2))
#> Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
#> 
#> Number of nodes: 2638
#> Number of edges: 95927
#> 
#> Running Louvain algorithm...
#> Maximum modularity in 10 random starts: 0.9623
#> Number of communities: 4
#> Elapsed time: 0 seconds
#> Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
#> 
#> Number of nodes: 2638
#> Number of edges: 95927
#> 
#> Running Louvain algorithm...
#> Maximum modularity in 10 random starts: 0.7002
#> Number of communities: 16
#> Elapsed time: 0 seconds
# The clusterings created so far: one metadata column per resolution
names(pbmc@meta.data)
#> [1] "orig.ident"      "nCount_RNA"      "nFeature_RNA"   
#> [4] "percent.mt"      "RNA_snn_res.0.5" "seurat_clusters"
#> [7] "RNA_snn_res.0.1" "RNA_snn_res.2"

# How many clusters (and how many cells in those clusters) do we get at different resolutions?
table(pbmc$RNA_snn_res.0.1)
#> 
#>    0    1    2    3 
#> 1190  688  416  344
table(pbmc$RNA_snn_res.0.5)
#> 
#>   0   1   2   3   4   5   6   7   8 
#> 684 481 476 344 291 162 155  32  13
table(pbmc$RNA_snn_res.2)
#> 
#>   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14 
#> 372 344 266 245 215 174 164 162 155 140 127  91  70  67  32 
#>  15 
#>  14
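
The per-resolution tables above can also be summarised in one go. The following is a minimal base R sketch rather than part of the original workflow: it uses a small made-up data frame, `meta`, as a stand-in for `pbmc@meta.data` (the labels are hypothetical, not the real PBMC results), counts the clusters at each resolution, and cross-tabulates two resolutions to show which low-resolution clusters split at the higher one.

```r
# Toy stand-in for pbmc@meta.data (hypothetical labels, NOT the real PBMC results)
meta <- data.frame(
  RNA_snn_res.0.1 = factor(c(0, 0, 0, 0, 1, 1, 1, 2)),
  RNA_snn_res.0.5 = factor(c(0, 0, 1, 1, 2, 2, 2, 3))
)

# Pick out every clustering column by its shared prefix
res_cols <- grep("^RNA_snn_res\\.", names(meta), value = TRUE)

# Number of clusters found at each resolution
n_clusters <- sapply(meta[res_cols], nlevels)
print(n_clusters)

# Cross-tabulate two resolutions: rows are the low-resolution clusters,
# columns are the higher-resolution clusters their cells end up in
split_table <- table(low = meta$RNA_snn_res.0.1, high = meta$RNA_snn_res.0.5)
print(split_table)
```

With the real object, replacing `meta` with `pbmc@meta.data` should give the same kind of summary, assuming the clustering columns are stored as factors as they are in the output above. A row with more than one non-zero entry marks a cluster that splits at the higher resolution.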

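Plot a clustering tree with clustree to see how cells move between clusters as the resolution increases, then use it to decide how many clusters you have and which resolution captures them.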

library(clustree)
#> Loading required package: ggraph
clustree(pbmc, prefix = "RNA_snn_res.") + theme(legend.key.size = unit(0.05, "cm"))

Label the cells with their cluster at the resolution you pick. In this case we are happy with 0.5.

# The metadata column is named 'RNA_snn_res.' followed by the resolution
Idents(pbmc) <- pbmc$RNA_snn_res.0.5
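
If you expect to change your mind about the resolution, it can help to build the column name from a variable instead of typing it out. This is a minimal base R sketch under stated assumptions: `meta` is a made-up stand-in for `pbmc@meta.data`, and `chosen_res` and `col_name` are hypothetical names introduced here for illustration.

```r
# Toy stand-in for pbmc@meta.data (hypothetical labels, NOT the real PBMC results)
meta <- data.frame(
  RNA_snn_res.0.1 = factor(c(0, 0, 0, 1, 1, 2)),
  RNA_snn_res.0.5 = factor(c(0, 0, 1, 2, 2, 3))
)

# The resolution we settled on (hypothetical variable name)
chosen_res <- 0.5

# Build the column name from the shared prefix and the resolution
col_name <- paste0("RNA_snn_res.", chosen_res)

# Fail early if no clustering was run at this resolution
stopifnot(col_name %in% names(meta))

# Cluster labels at the chosen resolution, and the cluster sizes
labels <- meta[[col_name]]
print(table(labels))
```

With the real object, the same lookup would be `pbmc@meta.data[[col_name]]`, which should return the values that `pbmc$RNA_snn_res.0.5` gives above.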

Plot the UMAP coloured by cluster with DimPlot.

DimPlot(pbmc, label = TRUE, repel = TRUE, label.box = TRUE) + NoLegend()