
In prokaryotes, the CRISPR/Cas system serves as an adaptive immune mechanism, allowing these organisms to detect and silence foreign DNA through RNA-guided nuclease activity (1). In such systems, Cas proteins function in three main phases, namely adaptation, expression, and interference. Initially, they play an essential role in recognizing and integrating exogenous DNA into CRISPR loci, resulting in the synthesis of short CRISPR RNAs (crRNAs) that direct the Cas nuclease as a guide RNA (gRNA) to specifically target and cleave complementary invasive DNA (2). Next, the process relies on RNA−DNA pairing, allowing the Cas nuclease to bind effectively to its target for precise degradation (3, 4).
CRISPR/Cas systems can be broadly classified into two classes, with further subdivisions, based on the composition of effector proteins: Class 1 (Types I, III, and IV) involves multiple Cas proteins for interference, while Class 2 (Types II, V, and VI) relies on a single multi-domain Cas protein (5, 6). During the interference phase, these systems proceed through target search, recognition, and degradation. Type I CRISPR/Cas systems typically operate in a two-step process for target degradation. First, the Cascade complex identifies the complementary DNA; then the helicase–nuclease fusion protein Cas3 is recruited to perform the subsequent cleavage step (7). In contrast, Type II and V CRISPR/Cas systems initiate cleavage once a cleavage-ready complex has been assembled, without requiring additional recruitment.
To elucidate these mechanisms, numerous studies employing ensemble averaging approaches have been performed. However, these approaches often obscure individual molecular characteristics, making it difficult to accurately comprehend inter-molecular interactions. To facilitate such understanding, single-molecule biophysics aims to directly observe the localization, dynamics, interactions, and reactions of individual molecules, providing insights into their distinct characteristics, heterogeneity within a population of molecules, and detailed mechanisms (8).
Single-molecule techniques include fluorescence-based methods like single-molecule Förster resonance energy transfer (smFRET) (9), which is useful for studying dynamics at sub-ten-nanometer scales; single-particle tracking (SPT) (10), which enables the observation of movements spanning tens of nanometers to several micrometers; and super-resolution microscopy, such as photoactivated localization microscopy (PALM) (11) and stochastic optical reconstruction microscopy (STORM) (12), which improve resolution to spatial separations of tens of nanometers. Additionally, force-based methods, such as optical tweezers (13), magnetic tweezers (14), and atomic force microscopy (AFM) (15), are effective in measuring and manipulating forces in the range of a few to hundreds of piconewtons.
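The distance sensitivity that makes smFRET suited to sub-ten-nanometer dynamics follows from the Förster relation, E = 1 / (1 + (r/R0)^6). The short Python sketch below evaluates this relation. The Förster radius R0 = 5.4 nm is an assumed, illustrative value for a typical dye pair and is not taken from the studies reviewed here; the function name is likewise hypothetical.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.4) -> float:
    """Ideal FRET efficiency for a donor-acceptor distance r_nm.

    Uses the Förster relation E = 1 / (1 + (r / R0)**6).
    r0_nm is the Förster radius; 5.4 nm is an assumed example value.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Efficiency drops steeply around R0, which is why distance changes
    # of a few nanometers produce clearly resolvable signal changes.
    for r in (2.0, 4.0, 5.4, 7.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
```

By construction, the efficiency is exactly 0.5 at r = R0, and it approaches 1 or 0 within a few nanometers on either side.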
In this review, we discuss single-molecule approaches that have provided a deeper and more detailed understanding of the CRISPR/Cas system, focusing on the most thoroughly studied DNA-targeting systems: Type I, represented by CRISPR/Cascade (CRISPR-associated complex); Type II, exemplified by CRISPR/Cas9; and Type V, including CRISPR/Cas12. We cover each step of the process in detail (Fig. 1).
Efficient target search is a critical process for the function of DNA/RNA-related proteins, including polymerases, restriction enzymes, and CRISPR/Cas nucleases (16). Regardless of their specific roles, successful target search and recognition must be both fast and highly specific. The most effective approach involves a combination of 3D and 1D diffusion, termed facilitated diffusion (17), while minimizing interactions with non-target sites.
Single-molecule approaches have provided valuable insights into how these systems find and interact with their targets, revealing complex diffusion mechanisms that contribute to search efficiency. In this section, we examine the single-molecule studies on the target searching mechanisms of commonly used Class 2 Cas nucleases, such as Cas9 and Cas12, as well as Type I Cascade systems.
Most DNA-targeting Cas nucleases require a short sequence motif, known as a protospacer adjacent motif (PAM) site, adjacent to the target sequence, to accurately distinguish their target within the genome. Hence, locating a PAM site is a prerequisite for the effective functioning of CRISPR/Cas systems, as well as a key step in the target search process.
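As a rough illustration of why PAM location acts as the first filter in the search, the Python sketch below lists candidate PAM positions on both strands of a DNA sequence. The 5'-NGG-3' motif is an assumption (it is the canonical PAM of Streptococcus pyogenes Cas9, which this section does not specify), and the function name and example sequence are hypothetical.

```python
def find_ngg_pams(seq: str) -> list[tuple[str, int]]:
    """Return (strand, index) pairs for every NGG PAM in a DNA sequence.

    The index is the 0-based position, on the given (top) strand, of the
    leftmost base of the 3-nt window. An NGG on the bottom strand appears
    as CCN when read along the top strand. Overlapping sites are reported.
    Assumes seq contains only A, C, G, T.
    """
    seq = seq.upper()
    sites = []
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        if window[1:] == "GG":   # NGG on the top strand
            sites.append(("+", i))
        if window[:2] == "CC":   # CCN on top = NGG on the bottom strand
            sites.append(("-", i))
    return sites


if __name__ == "__main__":
    example = "ATGGCCA"  # hypothetical sequence
    print(find_ngg_pams(example))
```

Dividing the number of sites by the sequence length gives a simple PAM density, the kind of quantity that single-molecule studies correlate with initial binding positions.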
Sternberg et al. employed the “DNA curtain” technique, using arrays of long DNA molecules to directly observe Cas9 searching for target sequences (18). Using a non-invasive, genetically encoded FLAG tag and anti-FLAG conjugated quantum dots, they observed Cas9 binding to bacteriophage λ DNA templates. This method showed that Cas9 locates its target primarily through random 3D collisions; this was further supported by a study with AFM (19). However, these results stand in contrast to other studies showing that Cas9 may exhibit short-range sliding or hopping, particularly in the presence of gRNA (20, 21).
A single-molecule FRET study revealed two distinct binding modes for Cas9: a specific mode, driven by PAM and gRNA−DNA base pairing; and a sampling mode, where Cas9 scans for PAM sites, without fully engaging with the DNA (22). This sampling behavior, combined with lateral diffusion between adjacent PAMs, enhances the ability of Cas9 to find its target. To further improve the target search efficiency, several studies have engineered Cas9 variants to promote 1D diffusion. One SPT “DNA garden” study developed two approaches that involved introducing mutations to reduce interactions between Cas9 and DNA, and attaching a tail from the “sliding-promoting” protein Nhp6A, to result in an eightfold improvement in the sliding capability of Cas9 (23, 24). Moreover, high-throughput tracking of Cas9 mutants along DNA showed that these engineered variants exhibited more efficient sliding dynamics, compared to the wild-type Cas9.
Yang et al. recently demonstrated that Cas9 uses both 3D diffusion and 1D diffusion to locate its target sequences, combining biochemical and single-molecule fluorescence assays (24). In particular, the study highlighted that Cas9 searches for targets more efficiently on DNA regions flanking PAM sites, with asymmetric search regions observed downstream of the PAM under physiological salt conditions. A study combining an optical tweezers-based assay and smFRET showed that specific lysine residues (Lys1151-1156) on Cas9 influence interactions with DNA (25). Mutating these residues disrupted the 1D diffusion mechanism, significantly reducing search efficiency, suggesting that non-specific DNA contacts play a critical role in guiding Cas9 to its target. Several single-molecule investigations of catalytically inactive Cas9 (dCas9) in living cells have been conducted (26, 27). The study by Knight et al. (28) tracked Cas9 fused with HaloTag in live mouse fibroblast cells in real time, revealing that Cas9 mainly relies on 3D diffusion during its DNA search. This study also underscores the pivotal role of the PAM sequence in facilitating Cas9 binding to DNA. Mismatches between the gRNA and target DNA near the PAM disrupted binding between the Cas9–gRNA complex and DNA, while mismatches farther from the PAM were better tolerated, reinforcing the importance of PAM-proximal sequences in target recognition. Additionally, the study examined the interaction of Cas9 with chromatin by tagging heterochromatin regions with eGFP-labeled heterochromatin protein 1. Within these densely packed regions, the Cas9–gRNA complex exhibited slower diffusion and reduced search trajectories, likely due to the compacted chromatin structure. However, despite these challenges, Cas9 was still able to locate target sites within heterochromatin, suggesting that specific mechanisms assist Cas9 in navigating such a crowded environment.
When compared with the transcription activator-like effector nuclease (TALEN) system, Cas9 was shown to be significantly slower, by up to a factor of five, in heterochromatin, likely because of its tendency toward nonspecific local searches in these regions (29). While Cas9 and TALEN both employ a mix of 3D diffusion and localized searches to find their targets, the precision of these localized searches plays a key role in determining the success of genome editing. These findings highlight the importance of chromatin context in guiding CRISPR/Cas9 systems toward their target sequences and ensuring efficient gene editing.
Similarly, Cas12, another Class 2 CRISPR/Cas protein, also utilizes both 3D and 1D diffusion during the target search process (30). Under physiological conditions, Acidaminococcus sp. Cas12a (AsCas12a) exhibits a 1D hopping behavior to search for targets on long DNA, with a diffusion coefficient of (1.50 ± 0.12) × 10⁷ bp²/s. The 1D hopping behavior was also observed on stretched DNA in a study using optical tweezers (31). Interestingly, the initial binding position of AsCas12a correlates with PAM density, suggesting that PAM recognition enables AsCas12a to adopt a conformation that is compatible with 1D diffusion. Another single-molecule fluorescence assay demonstrated that Cas12a is further guided by nonspecific DNA interactions downstream of the PAM site, mediated by a conserved positively charged alpha helix within the REC2 domain during target search (32). Additionally, a smFRET study on CasX (Cas12e) revealed lower target search efficiency, compared to Cas9 and Cas12a (33). During nonspecific binding to DNA, CasX displays slower diffusion rates, and requires multiple encounters with target sites before stable binding occurs. Olivi et al. (34) provided further insights by tracking, in E. coli cells, the movements of a catalytically inactive Cas12a variant (dCas12a) fused with the photoactivatable fluorescent protein PAmCherry2.1. They found dCas12a to be both faster and more efficient in target searching than dCas9, particularly when associated with a targeting gRNA.
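To give a sense of scale for the 1D diffusion coefficient quoted for AsCas12a, the Python sketch below applies the standard one-dimensional relation ⟨x²⟩ = 2Dt to D = 1.50 × 10⁷ bp²/s. Treating the hopping motion as ideal, unbiased 1D diffusion is a simplifying assumption, the time points are illustrative rather than measured residence times, and the function names are hypothetical.

```python
import math

# Diffusion coefficient quoted in the text for AsCas12a (30), in bp^2/s.
D_ASCAS12A = 1.50e7


def rms_displacement_bp(d_bp2_per_s: float, t_s: float) -> float:
    """Root-mean-square 1D displacement sqrt(2 * D * t), in base pairs."""
    if d_bp2_per_s < 0 or t_s < 0:
        raise ValueError("D and t must be non-negative")
    return math.sqrt(2.0 * d_bp2_per_s * t_s)


def diffusion_coefficient_from_msd(msd_bp2: float, lag_s: float) -> float:
    """Invert <x^2> = 2 * D * t to estimate D from a mean squared displacement."""
    if lag_s <= 0:
        raise ValueError("lag time must be positive")
    return msd_bp2 / (2.0 * lag_s)


if __name__ == "__main__":
    for t in (1e-3, 1e-2, 1e-1, 1.0):  # illustrative times in seconds
        print(f"t = {t:g} s -> ~{rms_displacement_bp(D_ASCAS12A, t):.0f} bp")
```

Under these assumptions, a protein would explore on the order of a couple of hundred base pairs within a millisecond, which illustrates why even brief 1D excursions can sample several nearby PAM sites.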
Magnetic tweezers experiments have shown that Type I Cascade complexes also exhibit facilitated diffusion when searching for target sites (35, 36). Although earlier studies suggested that Cascade relies solely on 3D diffusion, more recent evidence indicates that it can engage in 1D sliding along DNA, particularly when interacting with PAM-containing regions. Single-molecule fluorescence studies have demonstrated short dwell times (<1 s) for Cascade on PAM sites, with longer binding times observed when more PAMs are present, supporting a role for 1D diffusion in the search process (37). Interestingly, Cascade’s target search efficiency can be modulated by DNA supercoiling, which influences the formation and locking of R-loops during target recognition. Under conditions of negative supercoiling, Cascade revisits target sites multiple times, enhancing overall search efficiency. In one single-molecule in vivo experiment with Type I Cascade complexes, PAM interaction times as short as ∼30 ms were reported, although 1D diffusion was not directly observed (38). This suggests that DNA topology plays an important role in guiding Cascade to its target, further emphasizing the complexity of CRISPR/Cas search mechanisms.
In the CRISPR/Cas systems, the PAM sequence in the target DNA plays an essential role in differentiating between self and non-self DNA, enabling the Cas proteins to bind specifically to their target sites (39). However, the function of the PAM may vary across different types of CRISPR/Cas proteins. Upon recognition of the PAM sequence, the formation of an R-loop occurs, leading to the establishment of a cleavage-competent complex at the target site. In this section, we focus on single-molecule studies investigating PAM recognition and R-loop formation by CRISPR/Cas systems, highlighting the intricate mechanisms that govern target recognition.
PAM site recognition by CRISPR/Cas systems represents a critical initial step following the target search process, and this protein–DNA interaction is inherently complex. Upon binding to the canonical PAM, the Cas protein undergoes a conformational change that induces local melting of DNA at the nucleation site proximal to the PAM. This initiates the directional formation of the RNA–DNA hybrid and displacement of the non-target DNA strand, a process known as R-loop formation (34).
Experiments using DNA curtains have further shown that PAM recognition is mediated by intrinsically weak interactions (18). While Cas9 maintains stable binding to bona fide target sites, its interactions with off-target sites are transient. The distribution of off-target binding correlates with the PAM distribution on λ phage DNA, consistent with other studies showing that in vivo, Cas9 spends less than a second at PAM sites. Interestingly, some Cas9 molecules exhibit stable binding for over five seconds, even in the absence of a target sequence, suggesting that Cas9 may be searching for adjacent PAM sites near a cognate target sequence, potentially via lateral diffusion. Biochemical assays have added further complexity, demonstrating that Cas9 can bind to DNA substrates lacking a target sequence but containing multiple PAM sites, as seen in electrophoretic mobility shift assays (18, 40). This phenomenon may be explained by local diffusion along the DNA strand, generating a synergistic effect between neighboring PAM sites. Throughout this process, Cas9 undergoes substantial conformational changes during target search. The primary conformational shift occurs upon gRNA binding, enabling Cas9 to search for PAMs in a sequence-specific manner (34). Moreover, a magnetic tweezers assay involving Streptococcus thermophilus Cas9 (StCas9) demonstrated that due to kinetic instability, single nucleotide mutations in the PAM region significantly impair or abolish target binding and R-loop formation (41).
In contrast, experiments with Cas12a ribonucleoprotein (RNP) show that its initial binding positions during diffusion are positively correlated with PAM density, underscoring the importance of PAM recognition for 1D diffusion by AsCas12a (30). However, once the target sequence is unwound, Cas12a proceeds to DNA cleavage independently of PAM recognition, indicating that the absence of PAM does not hinder cleavage. This stands in stark contrast to Cas9, which for strong target binding requires PAM recognition, even in the presence of bubble DNA. Thus, unlike Type I Cascade and Type II Cas9, the role of PAM in Cas12a becomes negligible after DNA unwinding during R-loop initiation.
For the Type I Cascade complex, PAM recognition is more promiscuous, with at least five distinct interfering PAM sequences identified for E. coli Cascade (42). PAM is recognized by the Cas8 (Cse1) subunit in its double-stranded form, specifically via the minor groove of DNA. This mode of minor-groove recognition suggests that if the target sequence is optimal, mutated PAM sequences can still be tolerated. A DNA curtains study further supports this, demonstrating that even in the absence of a PAM, Cascade can bind to a fully matching protospacer, though at a significantly reduced binding rate (43). Nevertheless, an additional PAM-checking step occurs after R-loop formation during DNA cleavage by Type I Cascade, indicating that PAM continues to influence target site selection.
When Cas9 binds to the correct PAM, it induces DNA bending, allowing the duplex to unzip for interrogation (44). R-loop formation by Cas9 is a critical step in its function: while stable binding to the 9–10 bp seed region is sufficient for initial engagement, successful cleavage requires more precise Watson–Crick base pairing, which ensures stable binding and subsequent dynamic conformational changes of Cas9 (45). This dynamic process of reversible R-loop formation allows Cas9 to avoid becoming trapped at partially matching sites, facilitating its ability to efficiently scan through potential targets during the search process (35, 46). A magnetic tweezers assay that monitored R-loop formation by the Cas9 RNP showed that it is affected by gRNA modification (47). smFRET assays have elucidated the sub-conformations of the RNA–DNA heteroduplex during R-loop expansion (48) and tracked the dynamics of DNA hybridization for various Cas9 variants, revealing how on-target and off-target sites differ in the formation of R-loops (49). Single-molecule supercoiling experiments with near base-pair resolution have also detected short-lived R-loop intermediates at off-target sites with single mismatches (41).
Similar to Cas9, Cas12 begins to form R-loops after PAM binding. As the R-loop expands, conformational changes must occur so that the catalytic pocket is ready to bind ssDNA. However, Cas12a requires a longer 17 bp seed for stable binding (30, 50, 51), compared to the shorter seed regions of 7–8 bp and 9–10 bp required by Type I Cascade (52) and Type II Cas9, respectively. A magnetic tweezers assay reported that during downstream DNA breathing, a conserved aromatic residue, W355, found in the REC2 subdomain, interacts with the terminus of the R-loop, thereby preventing its extension (53). An optical tweezers-based single-molecule assay detected that the Nuc domain in Cas12a can stabilize the R-loop complex via direct interaction with non-target DNA (51). Moreover, R-loop formation for Lachnospiraceae bacterium Cas12a was measured, indicating that Cas12a is more sensitive to torque than Cas9; with R-loop formation and dissociation happening easily at low torque, it may be less tolerant of mismatches (54).
Cascade also forms an R-loop directionally after binding to a PAM site, similar to Cas9 (35). In magnetic tweezers experiments, R-loop formation is sensitive to seed region mutations, which require much higher negative supercoiling than mutations further from the PAM. Cascade often stalls at seed mutations, needing four times more supercoiling than wild-type sequences. DNA curtain assays indicate that Cascade makes multiple attempts before stable engagement and complete R-loop formation (43). While a magnetic tweezers-based assay showed that R-loop formation primarily occurs in a directional fashion (41), single-molecule FRET studies revealed that Cascade binds DNA both directionally and nondirectionally. On fully complementary targets, Cascade shows long-lasting interactions or short-lived partial unwinding. In contrast, seed region mutations result in brief binding events, significantly impeding R-loop formation. Unlike Cas9, Cascade lacks a robust conformational proofreading mechanism, which may explain its ability to bind targets with mismatches, or even without a PAM. However, once Cascade binds to a fully complementary target, it locks into place, stabilizing the R-loop, which is a critical step in recruiting Cas3 nuclease for target degradation.
Successful PAM site identification and stable R-loop formation are essential prerequisites to activate the nuclease activity of CRISPR/Cas systems. While Cas9, Cas12, and Cascade proteins all target DNA for degradation, they employ distinct mechanisms to achieve this goal. Here, we examine current advances in single-molecule techniques for dissecting these processes. Understanding their nuances is crucial, as each system utilizes different strategies to recognize and cleave target sequences.
Most single-molecule investigations involving the Cas9 system have primarily focused on the dCas9; however, Josephs et al. (55) identified subtle differences in the conformational changes associated with target binding between dCas9 and active Cas9, which may result from additional conformational alterations during cleavage. Subsequent structural and single-molecule studies have shown that as the complementarity between the gRNA and target DNA increases, Cas9 undergoes an additional conformational change (56). Initially, the HNH domain responsible for cleaving the target strand is located at the PAM-distal end, positioned far from the cleavage site. However, once a complete target sequence is identified, the HNH domain translocates to the cleavage site, achieving a catalytically active conformation (44). Employing bulk FRET experiments, Sternberg et al. (57) investigated a conformational shift in the HNH nuclease domain of Cas9 serving as a vital layer of protection against off-target cleavage. Both single-molecule and bulk FRET experiments have demonstrated that for Cas9 to successfully cleave the target, at least 18 out of 20 nucleotides between the target and guide must be complementary (56, 58, 59). Another nuclease domain, RuvC, facilitates the cleavage of the non-target strand by initially positioning close to the cleavage site, requiring the translocation of the HNH domain to complete the cleavage (44, 60).
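The complementarity requirement can be made concrete with a toy check. The Python sketch below counts the positions at which a 20-nt guide RNA can Watson–Crick pair with the target DNA strand and compares the count with the 18-of-20 threshold quoted above. The function names and the example guide are hypothetical, both sequences are assumed to be written so that position i of the guide faces position i of the target strand, and the hard threshold is a simplification that ignores where the mismatches fall.

```python
# RNA base -> DNA base it pairs with (Watson-Crick pairing in an RNA-DNA hybrid).
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_complementary(guide_rna: str, target_strand: str) -> int:
    """Count positions where the guide base pairs with the aligned target base."""
    if len(guide_rna) != len(target_strand):
        raise ValueError("guide and target must have the same length")
    pairs = zip(guide_rna.upper(), target_strand.upper())
    return sum(RNA_TO_DNA_PAIR.get(g) == t for g, t in pairs)


def meets_cleavage_threshold(guide_rna: str, target_strand: str,
                             min_matches: int = 18) -> bool:
    """True if at least min_matches positions are complementary.

    The default of 18 mirrors the 18-of-20 figure in the text; real
    cleavage also depends on mismatch position, which is ignored here.
    """
    return count_complementary(guide_rna, target_strand) >= min_matches


if __name__ == "__main__":
    guide = "GACGUUACCGGAUUCAGCUA"  # hypothetical 20-nt guide
    perfect = "".join(RNA_TO_DNA_PAIR[b] for b in guide)
    print(count_complementary(guide, perfect), meets_cleavage_threshold(guide, perfect))
```

A position-weighted score, for example one that penalizes PAM-proximal mismatches more heavily, would be a natural refinement, but it is beyond what the text specifies.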
Crucially, single-molecule studies have revealed the role of different divalent ions, particularly magnesium (Mg²⁺), in target cleavage by Cas9 (61). According to those studies, Mg²⁺ is necessary to stabilize the interaction between the HNH and RuvC domains, and specifically, to achieve target cleavage efficiently. Mg²⁺ is the most physiologically relevant of the divalent ions, and its use allows for the real-time characterization of the cleavage dynamics and the identification of rate-limiting steps.
Using DNA curtains and smFRET assays, Cas9 has been observed to remain bound to target DNA after cleavage for several minutes or more (18, 22, 62). The post-cleavage binding stability of various Cas9 variants was also investigated under supercoiling conditions using single-molecule magnetic tweezers (63). Specifically, after the initial cleavage of the target strand, the release of supercoils occurs only after the collapse of the R-loop. Conversely, when the non-target strand is cleaved, supercoils are released gradually through the swiveling motion of the non-target strand around the Cas9 protein that remains bound to the target strand. Consequently, both Cas9 and its non-target strand nicking mutant maintain stable attachment to the DNA for a prolonged duration. Engineered Cas9s with improved specificity were also shown by smFRET to inhibit stable binding to partially matched DNA (64).
Distinct from Cas9, where the HNH domain plays a significant role in target cleavage, Cas12 employs a single RuvC nuclease domain to cleave both strands of DNA. Upon recognition of the PAM and subsequent binding to the target, the RuvC domain is positioned near the cleavage site, facilitating the cleavage of both the target and non-target strands. Several studies using smFRET determined that Cas12 induces cleavage in the two DNA strands in a well-defined order, beginning with the non-target strand (30, 50, 54). Through smFRET assays and ion exchange experiments, recent studies have also revealed that divalent metal ions are required to stabilize cleavage-competent conformations in Cas12a nuclease activity (65, 66). After the cleavage event, the binding to the PAM-proximal cleavage product is maintained, similar to Cas9 (50). However, after the cut and unlike Cas9, the PAM-distal cleavage product is released rapidly by Cas12a (18, 22, 50). Such a release may cause ssDNA molecules to interact with the still-active RuvC domain, possibly explaining the non-specific cleavage activity of Cas12a with ssDNA following the cleavage of a specific dsDNA target (67, 68).
Type I Cascade does not directly degrade the target DNA; rather, it predominantly recruits the Cas3 nuclease (69, 70). Several magnetic tweezers experiments showed that regardless of protospacer mutations, R-loop locking is crucial for the recruitment of Cas3 (35, 71). However, even when the R-loop was fully formed, mutations in the PAM significantly impacted Cas3 cleavage, indicating a dual signaling mechanism during target recognition. A chip-hybridized association-mapping platform (CHAMP) assay additionally suggested that the specific DNA sequence influenced the recruitment of Cas3 (72). A recent smFRET experiment has shown that Cas3 can remain associated with Cascade to cleave ssDNA through a reeling mechanism, where Cas3 and Cascade stay in tight contact while Cas3 unwinds the DNA, resulting in loops in the target strand (73). Furthermore, a study using high-speed AFM visualized the dynamics of the Cascade/Cas3 complex, showing reeling and looping, and eventual cleavage of the target DNA at the single-molecule level (74). A single-molecule study on the E. coli Type I-E complex demonstrated that Cascade also remains tightly bound (43), while a second study on the Type I-E complex from Thermobifida fusca showed that Cascade translocates along ssDNA in association with Cas3 (75). Hence, further investigation is required to understand the post-cleavage mechanism of the Type I Cascade system.
In this review, we explored the intricate mechanisms of target search, recognition, and cleavage employed by CRISPR/Cas systems, with a particular focus on Type I (Cascade), II (Cas9), and V (Cas12) nucleases, as summarized in Table 1. Utilizing single-molecule biophysical techniques, several studies have uncovered the nuanced details of these processes that were previously obscured in ensemble measurements. Single-molecule techniques provide the unique ability to observe real-time dynamic processes and transient intermediates that are often averaged out in bulk studies. Furthermore, these methods enable the detection of molecular heterogeneity and stochastic behaviors that cannot be captured through static structural analyses. Such findings have significant implications for understanding CRISPR/Cas systems and their applications in genome editing.
Despite the advantages of these approaches over bulk assays, particularly in revealing specific molecular mechanisms, significant challenges remain. One major limitation is the discrepancy between the in vitro and in vivo findings. The mechanisms observed under controlled in vitro conditions may not fully reflect the complexities of cellular environments, where factors such as DNA replication, transcription, and the presence of other proteins can influence the CRISPR/Cas system function. Therefore, it is essential to conduct more in vivo studies to better understand how CRISPR/Cas systems operate in complex cellular settings, especially in human cells, where their application in therapeutic genome editing is critical.
In addition, while structural studies of CRISPR/Cas proteins have provided valuable insights, key questions remain unanswered. For example, we still do not fully understand the factors influencing the target search mechanism of Cas9, such as its use of facilitated diffusion along DNA. Moreover, there is a lack of direct studies that visualize the movement of Cas9 as it searches for its target over a wide range of DNA. Further single-molecule investigations, especially those focused on the role of gRNAs (76), including a trans-activating CRISPR RNA (tracrRNA) (77, 78), are necessary to enhance understanding of these mechanisms. Additionally, studies employing functionally active Cas proteins, rather than nuclease-deficient forms, are essential. Likewise, more studies are needed to explore single-molecule dynamics in other CRISPR/Cas systems including RNA-targeting Types III (79) and VI (80) or noncanonical Cascade systems (81-83).
In conclusion, single-molecule approaches have significantly advanced our comprehension of CRISPR/Cas systems, revealing the intricate mechanisms of target search, recognition, and degradation at the molecular level. To capture the full scope of CRISPR/Cas activities in more complex and physiologically relevant settings, future research must integrate both in vitro and in vivo single-molecule approaches. Such studies are critical to advance CRISPR technologies and improve therapeutic and biotechnological applications in genome editing.
This research was supported by the National Research Foundation of Korea funded by the Ministry of Science and ICT (2021R1A2C1095046) and the Korea Health Technology R&D Project (RS-2023-00266133). Additionally, this research was supported by a grant from the KIST Institutional Program.
The figure was generated using BioRender.com.
The authors have no conflicting interests.
Table 1. Single-molecule approaches for studying key stages in the CRISPR/Cas mechanism

Type | Protein | Stage | Single-molecule approaches |
---|---|---|---|
I | Cascade | Search | Magnetic tweezers (34, 35), smFRET (36), SPT (37) |
I | Cascade | Recognition | Magnetic tweezers (40), SPT (42) |
I | Cascade | Degradation | Magnetic tweezers (34, 70), smFRET (72), SPT (42, 74), AFM (73) |
II | Cas9 | Search | smFRET (21, 23, 24, 33, 83), SPT (18, 22, 25-28), Optical tweezers (24), AFM (19) |
II | Cas9 | Recognition | Magnetic tweezers (40, 46), smFRET (18, 47, 48) |
II | Cas9 | Degradation | smFRET (21, 55-58, 61, 63), SPT (18) |
V | Cas12 | Search | smFRET (32), SPT (29, 33), Optical tweezers (30) |
V | Cas12 | Recognition | smFRET (29, 32, 49), Magnetic tweezers (52, 53), Optical tweezers (50) |
V | Cas12 | Degradation | smFRET (29, 49, 64, 65) |