Single-cell RNA sequencing (scRNA-seq) has enabled deeper investigation of the functions and characteristics of individual cells by profiling the mRNA that is transiently expressed at the time of data acquisition (1). Compared with bulk RNA sequencing, this is a major step forward that has allowed scientists to resolve functional differences between individual cell types (2). However, scRNA-seq loses the spatial information of individual cells, the interactions among adjacent cells, and local signaling networks. As a result, spatially organized contexts such as the tumor microenvironment have remained elusive (3, 4). To overcome this limitation, transcriptomic techniques that capture spatial information in tissues of interest have been actively developed (5, 6).
Spatially resolved transcriptomics has achieved significant progress in the biomedical research field with advances in imaging and next-generation sequencing (NGS) technology. Spatially resolved transcriptomics has two main categories: 1) image-based
In this review, we describe how different spatially resolved transcriptomic techniques have evolved to meet the needs of distinct tissue environments (Table 1) and how those methods are combined with high-throughput scRNA-seq data to compensate for each method’s technical and functional constraints (Table 2).
Image-based spatially resolved transcriptomics: There are two major approaches that visualize targeted genes of interest by fluorescent labels. First,
In 2016, Shah
To understand the spatial context of cells, direct imaging of individual RNA molecules within intact cells and tissues is vital. The multiplexed fluorescence
A higher FISH signal intensity improves gene detection performance. The signal amplification by exchange reaction (SABER) technique was designed to amplify the intensity of quantitative FISH signals (12). In brief, DNA and RNA FISH probes are first chemically synthesized with a primer sequence at their 3’ end, which is then extended by the primer-exchange reaction (PER). In SABER, a multitude of PER-concatemerized probe sets can be hybridized to their targets simultaneously and read out in sequential rounds of imaging. With this approach, 18,000 probes in total, targeting a 3.9-Mb region in human metaphase spreads and interphase cells, were mapped to three colors, which all colocalized as expected. However, a key challenge of multiplexed FISH is that individual target probes are difficult to detect in complex tissue environments because of the noisy tissue background. To address this issue, Goh
Another imaging-based approach for spatially resolved transcriptomics is
Wang
Another attempt has been made with ExSeq where cDNA amplicons are eluted from the sample and re-sequenced
Capture-based spatially resolved transcriptomics: Tissues preserved in FFPE blocks can be stored for 3-10 years. Using laser capture microdissection (LCM), Romanens
Compared with the LCM-seq method, Photo-isolation chemistry has the advantage in spatial resolution because its photochemical isolation technique uses photo-caged oligo-deoxynucleotides for
Another strategy to obtain spatially resolved transcriptomic data is to use a pre-arranged set of barcoded reverse transcription (RT)-primers on glass slides. Spatial Transcriptomics (ST) analyzes the transcriptome in individual tissue sections while maintaining the two-dimensional positional information of the tissue. Specifically, tissue sections are placed on a glass slide that bears RT-primers and sets of DNA barcodes
The barcoded oligonucleotide capture arrays described above, however, are limited to a spatial resolution of 55-100 μm by the physical size of the capture spots. To resolve the low spatial resolution of capture-based sequencing methodology, bead-based capture sequencing was developed. In 2019, Rodriques
While scRNA-seq and spatially resolved transcriptomic data have limitations, a computational integration of two or more data modalities can better characterize spatial cell type compositions and local cell states in the tissues. Here, we describe the improvements benefiting from the novel
Combination with scRNA-seq: probabilistic cell typing by
To improve the low spatial resolution of capture-based spatial methods, other studies have integrated the two data modalities, scRNA-seq and spatially resolved transcriptomics. Cell-type signatures defined by scRNA-seq were integrated with Slide-seq data, facilitating the discovery of spatially defined gene expression patterns in the mouse hippocampus. To reconstruct the expression of each Slide-seq bead, non-negative matrix factorization regression (NMFreg) was used to express it as a combination of the cell-type signatures defined in the reference scRNA-seq data (30). In addition, several computational techniques for combining single-cell data with MIBI or multiple spatial transcriptomics were reported to achieve a higher cellular resolution (28, 29). Moncada
Deep learning-based spatial information: Machine learning algorithms have been extensively used as an imputation method for predicting cell types based on the context of relevant datasets. For example, DEEPsc, an artificial neural network model for categorizing cell types, was trained to predict a specific cell type from sets of genes in the scRNA-seq reference atlas dataset (37) and achieved an accuracy comparable to several existing methods (2-norm, infinity norm, mean percent difference, and large margin nearest neighbor) in applications utilizing 3,000 highly variable genes. Additionally, BayesSpace applies a Bayesian statistical method that takes the spatial gene expression profile of neighboring spots as a prior and achieves a super-resolution image (38). It is noteworthy that BayesSpace does not require a separate scRNA-seq gene expression signature or preselected marker genes. SPICEMIX, another approach that uses non-negative matrix factorization (NMF) based on probabilistic latent variable modeling, calculates the spatial affinity between metagenes and their proportions in neighboring cells (39). Data acquired from seqFISH+ and STARmap were used to demonstrate its ability to refine the identification of cell types in the mouse primary visual cortex. SpaOTsc was developed to better understand cell-cell communication in spatially resolved transcriptomic datasets; it infers the spatial distance between every pair of cells by computing the optimal transport distance from spatial measurements of a relatively small number of genes without the scRNA-seq dataset (4). The results from SpaOTsc indicate that signal sender cells exhibit more spatially localized patterns, while the signal receivers are more scattered throughout the tissues of zebrafish and the mouse visual cortex.
SpaOTsc can be used both to integrate non-spatial single-cell measurements with the spatial data and to reconstruct spatial cellular dynamics in tissues.
Deconvolution of spatially resolved transcriptomics: Owing to the lower cellular resolution of spatial barcoding capture-based methods, the proportions of discrete cell types in a given spot have been inferred by various deconvolution algorithms. Robust cell type decomposition (RCTD) uses a statistical model with maximum-likelihood estimation to approximate the proportions of spatially localized cellular subtypes in spatially resolved transcriptomic data such as Slide-seq or 10X Genomics Visium datasets (40). RCTD accurately recapitulated the known spatial distribution of cell types in both Slide-seq and 10X Genomics Visium data from mouse brain tissues. Alternatively, SPOTlight uses NMF alongside non-negative least squares for accurate and sensitive cell-type detection, and was applied to mouse brain and to 22 immune subpopulations from pancreatic adenocarcinoma samples with a curated annotation (41). Recently, SpatialDWLS, which uses the dampened weighted least squares (DWLS) algorithm in which the weights are selected to minimize the overall relative error rate when inferring cell-type composition, was shown to have a higher degree of sensitivity and accuracy than RCTD and SPOTlight (42). Applying SpatialDWLS to spatial transcriptomics datasets of mouse brain and human embryonic heart revealed an increased abundance of different cell types during development. Finally, deconvolution of spatially resolved transcriptomic profiles using Variational Inference (DestVI), a Bayesian model-based multi-resolution cell-type deconvolution algorithm, was developed to alleviate the difficulty of deconvolution when there is a continuum of cell states that cannot be clearly distinguished (43). DestVI learns cell type-specific profiles and sub-cell type variations using a conditional deep generative model.
By combining scRNA-seq and spatially resolved transcriptomic data from the same tissue, DestVI was used to study the immune interplay within lymph nodes and explore the spatial tissue architecture of the mouse tumor microenvironment. Notably, DestVI outperforms existing discrete deconvolution approaches such as RCTD, SPOTlight, and Seurat.
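The core of reference-based deconvolution can be illustrated with a plain non-negative least squares fit: a spot's expression vector is modeled as a non-negative mixture of cell-type signatures taken from scRNA-seq. The sketch below is a simplified, dependency-free illustration under that assumption only. It does not reproduce the Poisson likelihood of RCTD, the dampened weights of SpatialDWLS, or the generative model of DestVI, and the function name `deconvolve_spot` and the toy signature matrix are hypothetical.

```python
def deconvolve_spot(signatures, spot, n_iter=500, eps=1e-12):
    """Estimate cell-type proportions for one spot by non-negative least squares.

    signatures[g][k]: mean expression of gene g in cell type k (from scRNA-seq).
    spot[g]: observed expression of gene g in the capture spot.
    Multiplicative updates keep every weight non-negative throughout.
    """
    n_genes, n_types = len(signatures), len(signatures[0])
    # Precompute S^T y and S^T S
    sty = [sum(signatures[g][k] * spot[g] for g in range(n_genes))
           for k in range(n_types)]
    sts = [[sum(signatures[g][a] * signatures[g][b] for g in range(n_genes))
            for b in range(n_types)] for a in range(n_types)]
    w = [1.0] * n_types
    for _ in range(n_iter):
        # w_k <- w_k * (S^T y)_k / (S^T S w)_k
        w = [w[k] * sty[k] / (sum(sts[k][b] * w[b] for b in range(n_types)) + eps)
             for k in range(n_types)]
    total = sum(w)
    # Normalize the mixture weights so they sum to one
    return [x / total for x in w]

# Toy example (assumed values): 4 genes, 3 cell types
signatures = [[10.0, 1.0, 0.0],
              [0.0, 8.0, 1.0],
              [1.0, 0.0, 9.0],
              [2.0, 2.0, 2.0]]
true_props = [0.6, 0.3, 0.1]
# Simulate a noise-free spot as a mixture of the three signatures
spot = [sum(row[k] * true_props[k] for k in range(3)) for row in signatures]
estimated = deconvolve_spot(signatures, spot)
```

On this noise-free toy spot the estimated proportions should closely match the simulated mixture; real spots are noisy and sparse, which is why the published methods add likelihood models, gene weighting, or learned latent variables on top of this basic idea.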
Cell type inference via image-based machine learning: H&E-stained histology images are easy and cheap to obtain and are routinely generated in clinics. Several studies have integrated spatially resolved transcriptomics and histopathology images to extract feature information and make predictions with various machine learning algorithms. ST-Net implements a convolutional neural network model and was trained on 68 breast tissue sections from 23 patients with breast cancer to predict gene expression from histopathology images (44). In addition, HisToGene adopts a Vision Transformer for image recognition and predicts super-resolution gene expression. Using the same dataset as ST-Net for training, HisToGene outperformed ST-Net in both gene expression prediction and the accuracy of tissue-region clustering (45). stLearn integrates three types of data, namely spatial dimensionality, tissue morphology, and the genome-wide transcriptional profile, using a deep learning network model (46) to perform cell-type clustering, infer intercellular interactions, and reconstruct spatial transition gradients, all of which were successfully demonstrated with brain and breast cancer datasets. SpaCell and CoSTA also utilize convolutional neural network models, to predict malignant cells in prostate cancer and to quantify the spatial expression relationships between each pair of genes in mouse brain samples, respectively (47, 48). Alternatively, STUtility takes raw 10X Genomics Visium RNA-seq and image data as input, processes the images, aligns consecutive stacked tissue images, and finally visualizes them all together in 3D. Other useful functionality of STUtility includes NMF to decompose 10X Genomics Visium data into a lower dimension and a method to identify neighboring capture spots in a spatial network (49).
This approach was applied in human breast cancer tissues to define the leading edge of the tumor region, eventually leading to the delineation of tumor heterogeneity between the tumor core and the tumor front. Bergenstråhle
Cellular protein information: Single-cell proteomic profiling has become progressively more comprehensive, and unbiased profiling of protein expression has had a broad impact on biomedical research (51). Recent advances in techniques such as epitope-based imaging, mass cytometry, and mass spectrometry enable protein expression to be mapped across tissue with high resolution. Stoeckius
Spatial ATAC-seq: To capture spatial epigenetic information in tissue at the single-cell level and genome scale, single-cell combinatorial indexing on microbiopsies assigned to positions for the assay for transposase-accessible chromatin (sciMAP-ATAC) was introduced (55). sciMAP-ATAC produced data of similar quality to non-spatial sci-ATAC for cells within a 214-micron cubic region and was applied to the adult mouse primary somatosensory cortex and human primary visual cortex, and to characterize the spatially progressive nature of cerebral ischemic infarction. Integration of sciMAP-ATAC with single-nucleus RNA-seq and single-cell chromatin accessibility datasets demonstrated high concordance for most cell types. Deng
While spatially resolved transcriptomic technology offers tremendous opportunities to discover spatial heterogeneity in disease states, characterize spatial expression blueprints during development, and elucidate spatial architecture at the molecular level, its full potential lies beyond these applications because it is still in the early days of development. None of the currently available spatially resolved transcriptomic technologies is perfect, and the choice of method depends on the study design, the biological question, and often a balance between cell and/or transcript throughput and spatial resolution. Caveats that are often overlooked include the requirement for a large number of pseudocolors and barcodes in image-based techniques and the low spatial resolution of capture-based methods, both of which pose new technological challenges. Integrative computational algorithms that combine spatially resolved transcriptomic data with other data modalities have contributed significantly not only to overcoming the key challenges faced by current spatially resolved transcriptomic technologies but also to gaining fundamental biological insights. The development of novel computational tools will continue to play a significant role in exploring large-scale spatially resolved transcriptomic datasets, interpreting newly acquired spatial patterns, and elucidating the principles of the underlying biology. Such integrative approaches will ultimately shed light on the mechanisms that explain the essential differences in spatial architecture between healthy and diseased tissues.
The authors are grateful to Junho Song for critical reading of the manuscript. This research was supported by the National Research Foundation of Korea (NRF) grants funded by the South Korean government (2020R1F1A1076705).
The authors have no conflicting interests.
Table 1. Spatially resolved transcriptomic techniques
Techniques | Features | Target genes | Application | Programming language | Reference |
---|---|---|---|---|---|
Image-based spatially resolved transcriptomics: ISH | |||||
smFISH, smHCR | Short DNA probes complementary to mRNA targets trigger chain reactions | 39 probes (smHCR) | Zebrafish, mouse brain | MATLAB | Shah |
osmFISH | Binding of 20 nucleotide-long fluorescently labeled DNA probes | 33 marker genes | Mouse brain | Python | Codeluppi |
MERFISH | Using chemical cleavage instead of photobleaching to remove fluorescent signals | 130 genes in up to 100,000 cells | Cultured U-2 OS cells | MATLAB | Moffitt |
MERFISH | MERFISH-based analysis platform | Targeting a set of 155 genes | Mouse hypothalamic preoptic region | MATLAB | Moffitt |
seqFISH+ | Enables visualization of the subcellular localization | 10,000 genes in single cells | NIH/3T3, mouse brain | MATLAB | Eng |
SABER | Additional signal amplification or applying serial imaging with DNA-Exchange | 18,000 probes targeting a 3.9-Mb region | Mouse retinal tissue | MATLAB & Python | Kishi |
Split-FISH | Alternative approach to reduce off-target background fluorescence by integrating split-probe strategy with multiplexed FISH | 317 genes in single cell | Mouse brain, liver, kidney, ovary | Python | Goh |
Image-based spatially resolved transcriptomics: ISS | |||||
STARmap | Integrated with hydrogel-tissue chemistry and targeted signal amplification | 160 to 1,020 genes simultaneously | Mouse brain | Python | Wang |
INSTA-seq | Sequences two bases simultaneously from both ends of the cDNA fragments | Up to 443,304 UMIs in total | | R | Fürth |
HybISS | New barcoding system via sequence-by-hybridization chemistry | 119 genes for PLP design | Mouse visual cortex, human brain | MATLAB | Gyllborg |
pciSeq | Bayesian algorithm derived from scRNA-seq clusters data | Designed 755 probes for 99 genes | Mouse CA1 interneuron | Python | Qian |
ExSeq | cDNA amplicons are eluted from the sample and re-sequenced | Up to 3,039 genes with untargeted approach | Mouse brain, mouse visual cortex, human breast cancer | MATLAB | Alon |
Capture-based spatially resolved transcriptomics: LCM | |||||
Exome-capture RNA-sequencing | Optimized standard protocol for hematoxylin and eosin (H&E) staining | Whole exome | 7 tumor samples of TNBC | MATLAB | Romanens |
immuno-LCM-RNAseq | RNA quality was significantly improved using modified protocol | Up to 60 cells were demonstrated to be sufficient quality | Mouse small intestine | Python | Zhang |
PIC | cDNA amplification is suppressed in cells that are not photo-irradiated | 8,000 genes were detected with 7 × 10⁴ unique read counts | Mouse embryo | R | Honda |
Capture-based spatially resolved transcriptomics: Oligonucleotide-based spatial barcode on slide | |||||
ST | Arrayed reverse transcription primers with unique positional barcodes | Up to 200 million oligonucleotides in each of 1,007 features | Mouse brain, human breast cancer | R | Ståhl; Salmén |
Multimodal analysis | Combined single-cell RNA sequencing with ST | Median depth of 1,629 UMIs/spot and 967 genes/spot | Human cSCC | MATLAB & R | Ji |
ST | Combined ST and ISS | Mean 31,283 UMIs and 6,578 unique genes per TD | AD mouse model, mouse and human brain | Python | Chen |
Capture-based spatially resolved transcriptomics: Oligonucleotide-based spatial barcode on bead array | |||||
Slide-seq | DNA-barcoded beads with known positions ( | 1.5 million beads, of which 770,000 can be analyzed | Mouse cerebellum and hippocampus | MATLAB & R | Rodriques |
HDST | Barcoded poly(d)T oligonucleotides deposited into 2-μm wells of a randomly ordered bead array | 2,893,865 individual barcoded beads | Mouse brain, primary breast cancer | Python | Vickovic |
Slide-seq V2 | Improvements in library generation, bead synthesis and array indexing | Mean 45,772 UMIs in 110 μm diameter area | Mouse hippocampus | MATLAB & R & Python | Stickels |
Seq-Scope | Based on a solid-phase amplification using an Illumina sequencing platform | Up to 5.88-19.7 genes were identified per HDMI pixel | Human liver and colon | Python | Cho |
Stereo-seq | Combined DNA nanoball pattern arrays and tissue RNA capture | Up to 133,776 UMIs per 100 μm diameter | Mouse brain | Not identified yet | Chen |
Table 2. Integration of spatially resolved transcriptomic data with other methods
Techniques | Features | Input data | Application | Programming language | Reference |
---|---|---|---|---|---|
Combination with scRNA-seq | |||||
pciSeq | Bayesian algorithm derived from scRNA-seq clusters data with ISS | Designed 755 probes for 99 genes | Mouse CA1 interneuron | Python | Qian |
seqFISH | Computing the ratio of the performance and prediction scores with scRNA-seq data | Each cell contained avg 196 mRNA from 93.2 genes | Embryo development in brain and gut | R | Lohoff |
Multiple spatial transcriptomics | Unbiased approach with additional in situ hybridization using RNAscope and multi-molecule ISS | Mean 31,283 UMIs and 6,578 unique genes per tissue domain | AD mouse model, mouse and human brain | Python | Chen |
Slide-seq | NMFreg that reconstructs expression of each cell type signatures defined by scRNA-seq | 1.5 million beads, of which 770,000 could be analyzed | Mouse cerebellum and hippocampus | MATLAB & R | Rodriques |
Integrating microarray-based ST and MIA | Enrichment analysis in which two-tailed Student’s t-tests were used to compare expression of the marker genes | 2,500-3,300 UMIs and 1,400-1,700 unique expressed genes per single cell | Pancreatic ductal adenocarcinoma | R | Moncada |
Deep learning-based spatial information | |||||
DEEPsc | Deep-learning network was trained with spatial position feature vectors as simulated scRNA-seq data | Started with top 3,000 highly variable genes | | MATLAB | Maseda |
BayesSpace | Bayesian statistical method that uses the information from spatial neighborhoods to achieve super-resolution images | 10X Genomics Visium data, does not require independent single-cell data or marker gene preselection | Brain, melanoma, invasive ductal carcinoma, ovarian adenocarcinoma | R | Zhao |
SPICEMIX | Enhances the NMF of gene expression with a graphical representation of the spatial relationship of cells | 2,470 genes in 523 cells (seqFISH+), 930 cells and 1,020 genes (STARmap) | Mouse primary visual cortex | Python | Chidester |
SpaOTsc | Infer the spatial distance between every pair of cells by computing the optimal transport distance | 851-15,413 cells and 10,495-45,789 genes (scRNA-seq), 64-1,549 spatial positions and 47-1,020 genes | Zebrafish embryo, | Python | Cang |
Deconvolution of spatially resolved transcriptomics | |||||
RCTD | Statistical model assumed to be Poisson distributed and maximum-likelihood estimation (MLE) used to infer the cell types | Slide-seq and 10X Genomics Visium data | Mouse brain | R | Cable |
SPOTlight | NMF along with non-negative least squares (NNLS) model with both the basis and coefficient matrices with cell type marker genes | 41,986 cells were merged to identify a total of 10,623 immune cells | Mouse brain, pancreatic adenocarcinoma | R | Elosua-Bayes |
SpatialDWLS | Dampened weighted least squares (DWLS) model with cell-type specific gene signatures from a public scRNA-seq dataset as a reference | 10,000 genes in 523 cells (seqFISH+) | Mouse brain, human heart | R | Dong |
DestVI | Bayesian model for multi-resolution deconvolution of cell types using Variational Inference | Pair of ST and scRNA-seq from same tissue | Murine lymph node, mouse tumor model | Python | Lopez |
Cell type inference via image-based machine learning | |||||
ST-Net | Deep learning algorithm that combines ST and histology images to predict the target gene expression of each spot | 30,612 spots in 68 breast tissue sections | Breast cancer | Python | He |
HisToGene | Employs a modified Vision Transformer model for gene expression prediction from histology images | 9,612 spots and 785 genes in breast cancer tissue | Breast cancer | Python (PyTorch) | Pang |
stLearn | Deep neural network model to predict hotspots where cell-cell interactions are more likely to occur | Feature vectors from H&E images of the tissue section | Mouse brain, human brain, breast cancer | Python | Pham |
SpaCell | Normalized count data and H&E staining images were trained with convolutional neural network | Tissue morphology and spatial gene expression data | Prostate cancer, amyotrophic lateral sclerosis | Python | Tan |
CoSTA | Clustering by Gaussian mixture model (GMM) and weight updating as commonly performed in training neural networks | Image-type matrix of MERFISH and Slide-seq data | Mouse brain, brain injury | Python | Xu |
STUtility | NMF to decompose ST data and identification and extraction of neighbouring capture-spots | 10x Genomics Visium data | Mouse brain, breast cancer tissue, lymph node, rheumatoid arthritis | R | Bergenstråhle |
ISST | Image-based | 12 sections from the mouse olfactory bulb | Mouse olfactory bulb, human breast cancer | Python | Bergenstråhle |
Cellular protein information | |||||
CITE-seq | Oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements | Common immune subpopulation markers (CD8a, CD3e, CD19, CD56, CD16, CD11c and CD14) | Human HeLa, mouse 4T1 cell, immune subpopulation | R | Stoeckius |
IMC | Epitope-based imaging methods that employ a mass spectrometer for readout to infer RNA-to-protein correlations | Detected three mRNA simultaneously (HER2, CK19 and CXCL10) | Breast cancer | MATLAB & R & Python | Schulz |
nanoPOTS | Unique proteins were identified via combination of LCM and ultrasensitive nanoLC-MS/MS | > 2,000 proteins with 100 μm spatial resolution | Mouse luminal epithelial cell, stromal cell, glandular epithelial cell | R | Piehowski |
Spatial ATAC-seq | |||||
sciMAP-ATAC | Spatially resolved, single-cell profiling of chromatin states from a single tissue punch | Mean 12,052-30,212 passing reads per cell | Mouse and human brain, cerebral ischemia model system | R | Thornton |
Spatial-ATAC-seq | DNA barcode solutions were introduced to the tissue surface using an array of microchannels | 36,303-100,786 unique fragments per pixel | Mouse embryos, human tonsil tissue | R | Deng |