9 Introduction to Part 2

9.1 Learning objectives

By the end of this lesson, you will be able to:

  • Recall the experimental design and analysis goal
  • Identify where this information is in the data

9.2 Overview

Analysis goal: Identify molecular differences between stimulated vs. stimulated peripheral blood mononuclear cells (PBMCs).

PBMCs are stimulated in vitro with IFN-beta. This is expected to trigger the up or downregulation of immune pathways. In our data, we can measure this directly by comparing the gene expression levels of immune-related genes.

We have the following information in our Seurat object metadata:

  • ind identifies a cell as coming from one of 8 individuals.

  • stim identifies a cell as control or stimulated with IFN-beta.

  • cell contains the cell types identified by the creators of this data set.

  • multiplets classifies cells as singlet or doublet.

  • Restart session - removes old saved data in the environment

Read in libraries

library(qs2)
#> qs2 0.2.1
library(Seurat)
library(harmony)
#> Loading required package: Rcpp
library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
#> ✔ forcats   1.0.1     ✔ stringr   1.6.0
#> ✔ lubridate 1.9.4     ✔ tibble    3.3.0
#> ✔ purrr     1.2.0     ✔ tidyr     1.3.1
#> ✔ readr     2.1.6
#> ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(SingleCellExperiment)
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#> 
#> Attaching package: 'matrixStats'
#> 
#> The following object is masked from 'package:dplyr':
#> 
#>     count
#> 
#> 
#> Attaching package: 'MatrixGenerics'
#> 
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet,
#>     colCollapse, colCounts, colCummaxs, colCummins,
#>     colCumprods, colCumsums, colDiffs, colIQRDiffs,
#>     colIQRs, colLogSumExps, colMadDiffs, colMads,
#>     colMaxs, colMeans2, colMedians, colMins,
#>     colOrderStats, colProds, colQuantiles, colRanges,
#>     colRanks, colSdDiffs, colSds, colSums2,
#>     colTabulates, colVarDiffs, colVars,
#>     colWeightedMads, colWeightedMeans,
#>     colWeightedMedians, colWeightedSds,
#>     colWeightedVars, rowAlls, rowAnyNAs, rowAnys,
#>     rowAvgsPerColSet, rowCollapse, rowCounts,
#>     rowCummaxs, rowCummins, rowCumprods, rowCumsums,
#>     rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#>     rowMadDiffs, rowMads, rowMaxs, rowMeans2,
#>     rowMedians, rowMins, rowOrderStats, rowProds,
#>     rowQuantiles, rowRanges, rowRanks, rowSdDiffs,
#>     rowSds, rowSums2, rowTabulates, rowVarDiffs,
#>     rowVars, rowWeightedMads, rowWeightedMeans,
#>     rowWeightedMedians, rowWeightedSds,
#>     rowWeightedVars
#> 
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: generics
#> 
#> Attaching package: 'generics'
#> 
#> The following object is masked from 'package:lubridate':
#> 
#>     as.difftime
#> 
#> The following object is masked from 'package:dplyr':
#> 
#>     explain
#> 
#> The following objects are masked from 'package:base':
#> 
#>     as.difftime, as.factor, as.ordered, intersect,
#>     is.element, setdiff, setequal, union
#> 
#> 
#> Attaching package: 'BiocGenerics'
#> 
#> The following object is masked from 'package:dplyr':
#> 
#>     combine
#> 
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> 
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, aperm, append, as.data.frame,
#>     basename, cbind, colnames, dirname, do.call,
#>     duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, is.unsorted, lapply, Map, mapply, match,
#>     mget, order, paste, pmax, pmax.int, pmin,
#>     pmin.int, Position, rank, rbind, Reduce,
#>     rownames, sapply, saveRDS, table, tapply, unique,
#>     unsplit, which.max, which.min
#> 
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> 
#> The following objects are masked from 'package:lubridate':
#> 
#>     second, second<-
#> 
#> The following object is masked from 'package:tidyr':
#> 
#>     expand
#> 
#> The following objects are masked from 'package:dplyr':
#> 
#>     first, rename
#> 
#> The following object is masked from 'package:utils':
#> 
#>     findMatches
#> 
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> 
#> Loading required package: IRanges
#> 
#> Attaching package: 'IRanges'
#> 
#> The following object is masked from 'package:lubridate':
#> 
#>     %within%
#> 
#> The following object is masked from 'package:purrr':
#> 
#>     reduce
#> 
#> The following object is masked from 'package:sp':
#> 
#>     %over%
#> 
#> The following objects are masked from 'package:dplyr':
#> 
#>     collapse, desc, slice
#> 
#> Loading required package: Seqinfo
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view
#>     with 'browseVignettes()'. To cite Bioconductor,
#>     see 'citation("Biobase")', and for packages
#>     'citation("pkgname")'.
#> 
#> 
#> Attaching package: 'Biobase'
#> 
#> The following object is masked from 'package:MatrixGenerics':
#> 
#>     rowMedians
#> 
#> The following objects are masked from 'package:matrixStats':
#> 
#>     anyMissing, rowMedians
#> Warning: replacing previous import
#> 'S4Arrays::makeNindexFromArrayViewport' by
#> 'DelayedArray::makeNindexFromArrayViewport' when loading
#> 'SummarizedExperiment'
#> 
#> Attaching package: 'SummarizedExperiment'
#> 
#> The following object is masked from 'package:Seurat':
#> 
#>     Assays
#> 
#> The following object is masked from 'package:SeuratObject':
#> 
#>     Assays
library(SingleR)
library(celldex)
#> Warning: replacing previous import
#> 'S4Arrays::makeNindexFromArrayViewport' by
#> 'DelayedArray::makeNindexFromArrayViewport' when loading
#> 'HDF5Array'
#> 
#> Attaching package: 'celldex'
#> 
#> The following objects are masked from 'package:SingleR':
#> 
#>     BlueprintEncodeData,
#>     DatabaseImmuneCellExpressionData,
#>     HumanPrimaryCellAtlasData, ImmGenData,
#>     MonacoImmuneData, MouseRNAseqData,
#>     NovershternHematopoieticData
library(patchwork)

Read in the preprocessed Seurat object from part 1. We use qs2 to efficiently load the data in.

seurat_object <- qs2::qs_read("data/seurat_object_preprocessed.qs2")

Exercise

Use names(seurat_object@meta.data) to identify where this information is stored in the Seurat object.

# Add your code here

Exercise

Use the table(seurat_object$<name>) function to see the number of cells that belong to each metadata category. Replace <name> with the relevant metadata names (e.g. ind, stim)

# Add your code here

It is useful to display the number of control and stimulated cells per individual

table(seurat_object$ind, seurat_object$stim)
#>       
#>        ctrl stim
#>   101   175  226
#>   107   115  106
#>   1015  503  489
#>   1016  335  348
#>   1039   87  100
#>   1244  373  311
#>   1256  384  385
#>   1488  406  534

Takeaways:

  • Each cell in our Seurat object is annotated either as the control or stimualted cell
  • Each cell notes the individual the cells were derived from
  • This is important so we know which cells to compare in the data