The study

This workshop will use CosMx Spatial Molecular Imager (SMI) data from the paper “Macrophage and neutrophil heterogeneity at single-cell spatial resolution in human inflammatory bowel disease” (Garrido-Trigo et al. 2023) - link: https://www.nature.com/articles/s41467-023-40156-6

There were 9 colon tissue samples, one per slide. The study used a 1k RNA panel (5k Xenium, 6k CosMx and whole-transcriptome kits are also available).

For this workshop, we will work with a subsetted dataset.

The data object

This is a SpatialFeatureExperiment object. (Visit https://pachterlab.github.io/SpatialFeatureExperiment/articles/SFE.html for more information on this class.)

This subsetted dataset has 999 genes and 65601 cells.

sfe
## class: SpatialFeatureExperiment 
## dim: 999 65601 
## metadata(0):
## assays(3): counts molecules logcounts
## rownames(999): AATK ABL1 ... NegPrb22 NegPrb23
## rowData names(3): target CodeClass hvg
## colnames(65601): HC_a_1000_1 HC_a_1000_2 ... CD_c_99_4 CD_c_9_2
## colData names(40): fov cell_ID ... clust_M0_lam0.6_k50_res0.3 niche
## reducedDimNames(2): PCA UMAP
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : CenterX_global_px CenterY_global_px
## imgData names(4): sample_id image_id data scaleFactor
## 
## unit: full_res_image_pixel
## Geometries:
## colGeometries: centroids (POINT), cellSeg (POLYGON) 
## 
## Graphs:
## GSM7473682_HC_a: 
## GSM7473683_HC_b: 
## GSM7473684_HC_c: 
## GSM7473688_CD_a: 
## GSM7473689_CD_b: 
## GSM7473690_CD_c:

It has the following metadata for each cell.

DT::datatable(as.data.frame(head(colData(sfe), n=300)))

This was a ‘1k’ panel; we have 999 targets, which include negative control probes such as NegPrb22.

DT::datatable(as.data.frame(rowData(sfe)))

How many cells per sample?

table(sfe$tissue_sample)
## 
##  CD_a  CD_b  CD_c  HC_a  HC_b  HC_c 
##  6723 13150 12929  8225 15642  8932

This data is subsetted to only 4 ‘Fields of View’ (FOVs) per sample. On the CosMx platform, FOVs are the rectangular regions that together make up a run. Essentially, we’re looking at a corner of each sample.

# NB: colData() returns a DataFrame, not a data.frame, so an explicit conversion is often needed before using dplyr verbs
colData(sfe) %>% as.data.frame() %>% select(group,tissue_sample, fov, fov_name) %>%
  group_by(group,tissue_sample, fov, fov_name) %>% 
  summarise(n_cells = n())
## `summarise()` has grouped output by 'group', 'tissue_sample', 'fov'. You can
## override using the `.groups` argument.
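
The same kind of per-FOV tally can be sketched in base R on a toy table, which may help if the dplyr pipeline above is unfamiliar. This is a minimal sketch only: `cell_meta` and its values are made up for illustration, and only the column names are borrowed from the real metadata.

```r
# Toy per-cell metadata; the column names mirror those used above,
# but the values are made up for illustration.
cell_meta <- data.frame(
  tissue_sample = c("HC_a", "HC_a", "HC_a", "CD_a", "CD_a"),
  fov           = c(1, 1, 2, 1, 1)
)

# Cross-tabulate sample by FOV, then flatten to one row per combination.
fov_counts <- as.data.frame(table(cell_meta$tissue_sample, cell_meta$fov),
                            stringsAsFactors = FALSE)
names(fov_counts) <- c("tissue_sample", "fov", "n_cells")

# Drop combinations that never occur (table() fills them with zero).
fov_counts <- fov_counts[fov_counts$n_cells > 0, ]
print(fov_counts)
```

Unlike `group_by()` followed by `summarise()`, `table()` reports every combination of levels, so the zero-count rows need to be removed by hand.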

The existing annotation

We’ll be using 5 broad cell types. These are from Garrido-Trigo et al.’s original paper.

plotUMAP(sfe, colour_by='celltype_subset', scattermore=TRUE)

Let’s check them out on the actual tissue, using one of the healthy control samples.

A note on the tissue morphology: here the top is the lumen of the colon, and the stromal layer at the bottom is the lamina propria. The oval-shaped epithelial structures are crypts. See: https://www.pathologyoutlines.com/topic/colonhistology.html

plotSpatialFeature(sfe.sample.HC, 'celltype_subset', colGeometryName = "cellSeg") + 
  theme(legend.title=element_blank()) +
  ggtitle(sample)
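
The objects `sample` and `sfe.sample.HC` are not defined in this section. SingleCellExperiment-style objects store cells as columns, so a single-sample object is typically made by subsetting columns with a logical vector. The base R sketch below mimics that pattern on a toy matrix; the data, and the choice of `"HC_a"`, are assumptions for illustration rather than the workshop's actual code.

```r
# Toy stand-in: a genes x cells count matrix plus per-cell sample labels.
# All values are made up for illustration.
counts <- matrix(1:12, nrow = 2,
                 dimnames = list(c("PIGR", "AATK"), paste0("cell", 1:6)))
tissue_sample <- c("HC_a", "HC_a", "HC_b", "CD_a", "HC_a", "CD_a")

# Hypothetical choice of sample; the workshop's actual choice is not shown here.
sample <- "HC_a"

# Keep only the columns (cells) belonging to that sample.
keep <- tissue_sample == sample
counts_sample <- counts[, keep, drop = FALSE]
print(dim(counts_sample))
```

On the real object, the analogous call would presumably be `sfe[, sfe$tissue_sample == sample]`.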

There are multiple levels of cell type annotation in this dataset.

There is the very (very) detailed celltype_SingleR2, used for various analyses in the original paper:

# Collapse cell types with fewer than 30 cells into "Other", purely for plotting.
cell_counts <- table(sfe.sample.HC$celltype_SingleR2)
celltype <- as.character(sfe.sample.HC$celltype_SingleR2)
sfe.sample.HC$filtered_celltype_singleR2 <- as.character(
  ifelse(cell_counts[celltype] >= 30, celltype, "Other")
)

plotSpatialFeature(sfe.sample.HC, 'filtered_celltype_singleR2', colGeometryName = "cellSeg") + 
  theme(legend.title=element_blank()) +
  ggtitle(sample)
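
The rare-cell-type filter used for this plot relies on indexing a `table()` by name, which can be hard to read at first. Here is a minimal sketch of the same idea on a toy label vector; the labels are made up, and the threshold is lowered from 30 to 2 so the effect is visible.

```r
# Toy labels; the threshold is lowered from 30 to 2 so the effect is visible.
labels <- c("Epithelium", "Epithelium", "Epithelium", "Plasma", "Plasma", "Mast")
min_cells <- 2

# table() counts each label; indexing the table by the label vector
# looks up, for every cell, how many cells share its label.
label_counts <- table(labels)
per_cell_n   <- as.vector(label_counts[labels])

# Keep labels that are common enough; collapse the rest into "Other".
filtered <- ifelse(per_cell_n >= min_cells, labels, "Other")
print(table(filtered))
```

Only the single "Mast" cell falls below the threshold here, so it is the only label replaced by "Other".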

There are also some unlabelled clusters generated purely on transcriptional similarity. These might represent a nice level of classification if we were doing the analysis from scratch.

plotSpatialFeature(sfe.sample.HC, 'cluster_code', colGeometryName = "cellSeg") + 
  theme(legend.title=element_blank()) +
  ggtitle(sample)

Of course, we can plot any gene’s expression (so long as it’s present on the 1k panel!)

plotSpatialFeature(sfe.sample.HC, 'PIGR', colGeometryName = "cellSeg") + 
  theme(legend.title=element_blank()) +
  ggtitle(sample)
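
Before plotting a gene, it can be worth checking that it is actually on the panel. The sketch below uses a short, hand-typed vector of feature names as a stand-in for the object's row names; `NOTAGENE` is a made-up name included to show a miss.

```r
# A few feature names from the object summary, plus PIGR (plotted above).
panel_genes <- c("AATK", "ABL1", "PIGR", "NegPrb22", "NegPrb23")

# Genes we would like to plot; "NOTAGENE" is a made-up name for illustration.
wanted <- c("PIGR", "NOTAGENE")

# %in% returns TRUE for each wanted gene that is on the panel.
on_panel <- wanted %in% panel_genes
names(on_panel) <- wanted
print(on_panel)
```

On the real object, the equivalent check would be `"PIGR" %in% rownames(sfe)`.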