5 QC Filtering
Tutorial: https://satijalab.org/seurat/articles/pbmc3k_tutorial#qc-and-selecting-cells-for-further-analysis
5.1 QC and selecting cells for further analysis
Why do we need to do this?
Low-quality cells add noise to your results and can lead you to the wrong biological conclusions. Filtering reduces this noise by removing dying or stressed cells (high mitochondrial expression) and barcodes with very few detected features, which can reflect empty droplets.
Challenge: The meta.data slot in the Seurat object
Where are QC metrics stored in Seurat?
- The number of unique genes and total molecules are automatically calculated during CreateSeuratObject()
- You can find them stored in the object meta data
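As a minimal sketch of where these metrics live, the example below builds a small toy object from random counts and prints its metadata. The toy matrix, the gene and cell names, and the object name toy are illustrative assumptions, not part of the workshop data. With the default RNA assay, the nCount_RNA column holds the total molecules per cell and nFeature_RNA the number of genes detected per cell.

```r
library(Seurat)

# Hypothetical toy data: 20 genes x 10 cells of random counts
set.seed(42)
counts <- matrix(
  rpois(200, lambda = 3),
  nrow = 20,
  dimnames = list(paste0("GENE", 1:20), paste0("cell", 1:10))
)

# The per-cell QC metrics are computed when the object is created
toy <- CreateSeuratObject(counts = counts, project = "toy")

# Metadata is a data frame with one row per cell
head(toy@meta.data)

# Equivalent accessor that avoids reaching into the slot directly
head(toy[[]])
```

On a real dataset the same two calls to head() apply unchanged; only the object name differs.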
What do you notice has changed within the meta.data table now that we have calculated mitochondrial gene proportion?
Imagine that this is the first of several samples in our experiment. Add a samplename column to the meta.data table.
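One possible way to approach both parts of this challenge is sketched below on a toy object. The toy counts, the three mitochondrial gene names, and the label "sample1" are assumptions made for illustration. The mitochondrial percentage is computed as in the linked Seurat tutorial, with PercentageFeatureSet() matching gene names that start with MT-.

```r
library(Seurat)

# Hypothetical toy data: 3 mitochondrial genes plus 17 other genes, 10 cells
set.seed(42)
genes <- c(paste0("MT-", c("ND1", "CO1", "ATP6")), paste0("GENE", 1:17))
counts <- matrix(
  rpois(200, lambda = 3),
  nrow = 20,
  dimnames = list(genes, paste0("cell", 1:10))
)
toy <- CreateSeuratObject(counts = counts, project = "toy")

# Percentage of each cell's counts that come from genes matching "^MT-";
# this adds a new percent.mt column to the metadata
toy[["percent.mt"]] <- PercentageFeatureSet(toy, pattern = "^MT-")

# Label every cell with the sample it came from ("sample1" is a made-up name)
toy$samplename <- "sample1"

# Both new columns now sit next to nCount_RNA and nFeature_RNA
head(toy@meta.data)
```

Storing the sample name per cell keeps the label attached to each cell if several samples are later merged into one object.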
Challenge: Filter the cells
Apply the filtering thresholds defined above.
- How many cells survived filtering?
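The sketch below shows one way to filter and count the surviving cells. It assumes the thresholds used in the linked Seurat PBMC3k tutorial (more than 200 and fewer than 2,500 detected genes, less than 5% mitochondrial counts); substitute your own values if the workshop thresholds differ. The toy data are invented so that half of the cells have too few detected genes, and they stand in for the real object.

```r
library(Seurat)

# Hypothetical toy data: 600 genes x 12 cells.
# The first 6 cells are simulated with a very low count rate (few genes detected),
# the last 6 with a higher rate.
set.seed(42)
genes <- c(paste0("MT-G", 1:13), paste0("GENE", 1:587))
rates <- rep(c(0.1, 1), each = 6)
counts <- sapply(rates, function(rate) rpois(600, lambda = rate))
dimnames(counts) <- list(genes, paste0("cell", 1:12))

toy <- CreateSeuratObject(counts = counts, project = "toy")
toy[["percent.mt"]] <- PercentageFeatureSet(toy, pattern = "^MT-")

n_before <- ncol(toy)

# Keep cells that pass all three conditions (thresholds assumed from the Seurat tutorial)
toy_filtered <- subset(
  toy,
  subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5
)

# Columns of a Seurat object are cells, so ncol() counts them
n_after <- ncol(toy_filtered)
cat(n_after, "of", n_before, "cells survived filtering\n")
```

Comparing ncol() before and after the subset() call answers the question directly, and the same pattern works on the full PBMC object.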
The PBMC3k dataset we’re working with in this tutorial is quite old. There are a number of other example datasets available from the 10X website, including this one, published in 2022, which sequenced 10k PBMCs with a newer chemistry and counting method.
- What thresholds would you choose to apply to this modern dataset?