The recent development of target-specific genome editing technology has enabled various applications in diverse biological systems. Among the genome editing tools available, the CRISPR system has heavily contributed to improving the efficiency and accuracy of genome editing (1-3). The CRISPR modules, which were originally identified as the immune systems of bacteria and archaea, have been widely applied as genome editing tools in mammalian systems as well as plants and microorganisms (4). The natural CRISPR molecules composed of CRISPR effector proteins and guide RNAs induce double-strand breaks (DSBs) in DNA at specific sites where the target DNA sequences are recognized by base complementarity to the guide RNA (5). Based on a mechanistic perspective, double helix cleavages by the CRISPR modules induce efficient genome editing in concert with intracellular operating DNA repair systems including non-homologous end joining (NHEJ), homology-directed repair (HDR) and microhomology-mediated end joining (MMEJ) (6). The mechanisms of the intracellular repair systems are distinct, and the systems function competitively upon the incidences of DNA breaks. Therefore, the results of DNA repair could differ depending on which repair systems were utilized: NHEJ is a non-templated DNA joining process and could introduce insertions and deletions, HDR is template-dependent and uses endogenous or exogenous DNA with homology, MMEJ uses the short homologous sequences present near the DSB points to join the DNA. The CRISPR genome editing tools that function by precisely cutting the target DNA are developed in concert with the intracellular DNA repair systems to demonstrate high genome editing efficiencies in several biological systems.
Nonetheless, the accuracy and fidelity of the DSB-mediated CRISPR genome editing methods may be suboptimal for specific applications such as developing gene therapies for human diseases. A major challenge is the off-target events that introduce unintended mutations at loci where the DNA sequences are partially complementary to the guide RNAs (7). Another concern is the variability of alterations in DNA sequences at the target sites following CRISPR genome editing (8-11). The non-uniform DNA sequences at the edited loci are caused at least in part by simultaneous action of several distinct intracellular repair systems that induce heterogenous insertions and deletions (12). Therefore, DSB-based CRISPR genome editing methods were particularly inappropriate for applications that require precise substitution of select single bases, such as correcting pathogenic single nucleotide variations (SNVs) (13). To address the problem, several methods were developed to achieve the desired genome correction via a scarless HDR pathway (6, 14-17). However, these methods still relied on DSB and therefore the approaches were unable to effectively eliminate the intrinsic challenges associated with the repair of DNA DSB (18).
In an effort to overcome the indel-by-DSB issues, recent advances in CRISPR demonstrated that precise genome editing could be conducted without DSB (19-21). Among the approaches, base editing methods use base-modifying enzymes in combination with CRISPR systems to substitute single bases at the target sites (19, 20). These methods enabled targeted DNA base substitution within a defined window, generally less than 10 bases, and controlled DNA base changes such as cytosine to thymine or adenine to guanine. The windows of base substitutions could be deliberately widened or constricted, depending on the purpose of the base editing. Another approach, called prime editing, combined CRISPR with RNA-dependent DNA polymerase to overcome the limitations of single-base substitutions by base editing (21). In this brief review, we introduce some of the recent developments and applications of DSB-free CRISPR genome editing methods that enable genome editing at single base level with enhanced accuracy.
The original form of CRISPR-based genome editing method induces double strand DNA breaks (DSB) as an initiation step. The DSB activates the intracellular DNA repair systems and the intended genome editing occurs during the process (Fig. 1A). The DSB repair inherently involves multiple pathways, such as NHEJ and HDR, and results in stochastic variations of repaired DNA sequences. As an alternative approach, a ‘base editing’ method attempted DSB-free genome editing via targeted DNA changes using a cytidine deaminase to modify cytidine to uridine (19). In the base editing method (BE1), a fusion construct that consists of
Theoretically, even in the absence of BER, the maximum efficiencies of base editing via G:U intermediate were limited to 50% as both strands can be used as templates for DNA replication. Nonetheless, a higher conversion rate could be achieved by deliberately introducing single-strand DNA breaks in the non-edited DNA strand containing the guanine base of the G:U wobble pair. The single-strand break activated mismatch repair (MMR) that actively removed the unedited guanines as they were recognized as damaged DNA. Based on the approach, the third-generation base editor (BE3) was prepared via fusion of Cas9 nickase (D10A), APOBEC1 cytidine deaminase and UGI. Accordingly, in base editing via BE3, the wobble G:U pairs are preferentially resolved to A:U (A:T) products to yield 2-6-fold and 6-18-fold higher efficiencies compared to BE2 and BE1 respectively. The highly efficient base editing by BE3 was, however, accompanied by dual DNA nicking that could induce rare but detectable undesired indel mutations. Nevertheless, the observed indel rates of BE3 were significantly lower than that of conventional CRISPR genome editing.
Analyses of base editing data showed that the undesired by-products of UDG were more prominent in target DNA sequences carrying single cytidines within the windows of base editing (22). In an effort to increase the base editing efficiencies, other cytidine deaminases were utilized: cytidine deaminase 1 (CDA1) to generate CDA-BE3, activation-induced cytidine deaminase (AID) to generate AID-BE3, and apolipoprotein B mRNA editing enzyme catalytic subunit 3G (APOBEC3G) to generate APOBEC3G-BE3 (22). Among the variants, CDA-BE3 and AID-BE3 showed higher editing efficiencies compared to BE3 at specific targets containing “GC” sequences. Based on the analyses of BE3 variants, an enhanced version of BE3, called BE4, was prepared by modulating the length of the linkers between APOBEC1, Cas9 nickase and UGI, and incorporating an additional UGI. Following the optimization, BE4 showed ∼1.5-fold increase in base editing efficiencies compared to BE3 (up to 27-fold compared to BE1). BE4 also showed ∼2-fold decrease in formation of undesired non-T products. The strategy could also be applied for enhanced base editing using
The efficiency and applicability of the cytosine base editing methods were further enhanced by changing the enzyme modules, optimizing the codon usage, and modifying the nuclear localization signal sequences (23, 24). Ancestral reconstruction of the deaminase component of BE4max, an engineered base editor, resulted in AncBE4max with highly efficient base editing even with the delivery of significantly reduced levels of base editor plasmids (24). Interestingly, an altered base editing method that converts cytosine to guanine, instead of thymine, was also developed (25). The targeted C-to-G substitution was accomplished with a fusion construct composed of Cas9 nickase, a uracil DNA N-glycosylase derived from
The cytosine base editors provide precise editing to convert C to T and G to A, but the method is not suitable for base conversion in the reverse direction. In order to address the issue, an adenine base editing (ABE) method was developed to enable conversion of A to G and T to C (Fig. 1B) (20). The first-generation ABE (ABE1.2) was generated by fusing CRISPR-Cas9 nickase proteins with an engineered variant of
The windows of base conversion by ABEs were generally 4-6 nucleotides wide, similar to cytosine base editing methods. For example, ABE7.10 showed high activities at sgRNA positions 4 to 7 (20). Notably, adenine base editing at positions upstream of the typical 4-6 nucleotide windows could be facilitated by utilizing longer sgRNAs (28). The off-target effects of ABE7.10 were significantly lower at DNA levels compared to both CRISPR nucleases and cytosine base editors (20, 29). However, analyses of RNA modification showed that ABEmax induced low but detectable adenine-to-inosine conversions in mRNA (30). Both native TadA and TadA* components in ABEmax contributed to the transcriptome-wide A-to-I RNA conversions. Analyzing the protein variants of TadA and TadA* resulted in ABEmaxAW, which carries two point mutations (TadA E59A, TadA* V106W) and substantially suppressed RNA editing to a level almost comparable to the background detected when Cas9 nickase alone was applied.
The target sequences of the base editor methods were constrained by the PAM sequence recognized by the CRISPR system. For instance, BE3 is a base editor system using SpCas9, and requires “NGG” PAM sequences adjacent to the 3’ end of the target site. In order to overcome the PAM sequence limitations, various natural and engineered Cas9 variants were used to expand the repertoire of target sequences (31, 32).
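The PAM constraint described above can be illustrated with a minimal Python sketch that scans one DNA strand for 20-nt protospacers followed by an “NGG” PAM and reports cytosines that fall inside an editing window. The function name `find_cbe_targets`, the 20-nt spacer length, and the default window of protospacer positions 4 to 8 (counted from the PAM-distal end) are assumptions made for illustration and are not taken from the text; only the given strand is scanned, so the reverse complement would need to be searched separately.

```python
import re


def find_cbe_targets(seq, window=(4, 8), spacer_len=20):
    """List protospacers followed by an NGG PAM and their in-window cytosines.

    Positions are 1-based and counted from the PAM-distal (5') end of the
    protospacer. The default window (4-8) and spacer length (20) are
    illustrative assumptions, not values prescribed by the review.
    """
    seq = seq.upper()
    lo, hi = window
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for match in pattern.finditer(seq):
        spacer, pam = match.group(1), match.group(2)
        c_positions = [i for i in range(lo, hi + 1) if spacer[i - 1] == "C"]
        hits.append(
            {
                "start": match.start(),  # 0-based index of the protospacer
                "spacer": spacer,
                "pam": pam,
                "c_positions": c_positions,
            }
        )
    return hits


if __name__ == "__main__":
    example = "AAACCC" + "A" * 14 + "TGG"
    for hit in find_cbe_targets(example):
        print(hit["spacer"], hit["pam"], hit["c_positions"])
```

Sites with more than one cytosine in the window are candidates for bystander edits, which is one reason narrower windows are of interest.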
While base editors enable genome editing without random indels, erroneous C-to-T conversions at off-target sites remain a potential safety concern in biological and medical applications. Studies have adopted different approaches to address this issue. Applying a high-fidelity CRISPR-Cas9 (HF-Cas9) (36) to BE3 reduced off-target effects (37). HF-Cas9 is an engineered variant of SpCas9 containing four point mutations (N497A, R661A, Q695A, and Q926A) that result in decreased non-specific interactions with the phosphate backbone of the DNA target strand. Base editing using HF-Cas9 (HF-BE3) demonstrated a markedly decreased off-target base editing activity in human cells. Off-target effects were further reduced by ribonucleoprotein (RNP) delivery (37). RNP delivery of BE3 and HF-BE3 resulted in editing efficiencies comparable to conventional plasmid delivery. Notably, higher on-target editing efficiencies in base editing of human cells via plasmid delivery were generally accompanied by increased off-target editing. However, base editing via RNP delivery led to efficient on-target editing without detectable off-target effects, similar to a previous study (38). Such decoupling of the linear relationship between on- and off-target editing rates makes RNP delivery attractive for base editing with enhanced specificity.
Precision of base editor could also be enhanced by modifying the cytidine deaminase (31, 39). In BE3, a five-base window exists, which increases the likelihood of substitution in the included cytidines. Occasionally, undesired substitutions may occur in nearby cytidines. The editing window could be modulated by inducing mutations in APOBEC1 that are involved in substrate binding (31). Combining three amino acid mutations in APOBEC1 resulted in a base editor that induced C-to-T conversion in a window of 1-2 nucleotides. Applying a human cytidine deaminase enzyme (APOBEC3A) also generates a cytosine base editor with reduced bystander and off-target activities (39). In the study, an engineered human APOBEC3A that characteristically recognizes a “TC” motif enabled a 40-fold increase in the specificity of cytidine substitution.
Diverse applications of base editing were demonstrated in various biological systems (Table 1). Corrections of pathogenic single-base substitutions in mammalian cells could be conducted via cytosine base editing (BE3) (19). In the study, two point mutations in APOE4 gene (C158R), associated with late-onset Alzheimer’s disease, were corrected in mouse astrocytes with efficiencies up to 74.9%. The BE3 method was also used to correct an oncogenic point mutation in TP53 gene (Y163C) in human breast cancer cells with a rate of 7.6%. The frequency of indels using BE3 was significantly lower than that of conventional CRISPR genome editing mediated by DSB. Using mouse models, base editing of post-mitotic cells was achieved via
In addition to cytosine base editors, adenine base editors were also effectively used in several organisms. As a demonstration of a potential therapeutic approach, ABE7.10 was applied to human cells to install T > C base corrections into the promoters of HBG1 and HBG2 genes that encode fetal hemoglobin (20). The T-to-C point mutations were clinically reported to induce a benign condition called hereditary persistence of fetal hemoglobin (HPFH) that confers resistance to specific beta-globin related diseases. ABE7.10 was also utilized to correct a pathogenic point mutation associated with hereditary haemochromatosis (HHC), a genetic disorder related to iron storage (20). In HHC, a G-to-A mutation causes C282Y mutation in human HFE gene, which in turn results in a serious condition via excess iron absorption. Application of ABE7.10 to an immortalized lymphoblastoid cell line resulted in correction of pathogenic tyrosine at position 282 to cysteine with a rate of 28%. Delivery of ABE via AAV was used to generate albino mice and in therapy for Duchenne muscular dystrophy (DMD) (28). In this study, a 2-vector split AAV delivery method of ABE efficiently corrected a pathogenic premature stop codon in a DMD mouse model. ABE was also used in plant genome editing (43). A rice genome editing system (ABE-P1) utilized a previously reported 32-amino-acid linker (20), and a VirD2 nuclear localization signal. The protein component of the ABE-P1 system was expressed in rice via maize ubiquitin promoter, and the sgRNA was produced using rice U6 promoter. By introducing the ABE-P1 into rice via Agrobacterium-mediated transformation, transgenic lines were generated with efficiencies up to 26%.
Screening applications were developed using alternative base editing methods to generate diverse libraries by deliberately installing near-random base substitutions within the target windows (44, 45). These methods utilized activation-induced cytidine deaminase (AID) enzymes to induce base substitutions at the target sites with only little bias towards C-to-T and G-to-A. A targeted AID-mediated mutagenesis (TAM) method used a fusion construct of dCas9 and human activation-induced cytidine deaminase (AID) involved in somatic hypermutation (44). Another method called CRISPR-X utilized an engineered and truncated variant of AID protein (AID*Δ) fused to MS2 proteins (45). In this method, the AID*Δ proteins were localized to the target loci by the fused MS2 protein recognizing the MS2 RNA hairpins, which were inserted into the sgRNA sequences. The CRISPR-X method demonstrated induction of near-random DNA substitutions within a wide window of −50 to +50 bp positions relative to PAM.
While advances in base editing provide a wide repertoire of single-base editing techniques, each base editor is only capable of changing bases in a predetermined direction. Therefore, base editors are unsuitable for mutagenesis of multiple bases within defined sequences that requires concomitant combinations of C-to-T and A-to-G conversions. To address the issue, an alternative DSB-free editing method, known as prime editing, was developed (21, 46-48). Although prime editing is similar to base editing in that no DSBs are involved, it operates through a distinct molecular mechanism (Fig. 1C). Prime editing methods utilize fusion constructs that are composed of reverse transcriptases (RT) and Cas9 nickase proteins. In prime editing, elongated guide RNAs called prime editing guide RNAs (pegRNA) play a dual role as both sgRNAs for target sequence recognition and RNA template for reverse transcription by RT. In terms of achieving desired DNA sequences beyond single bases, prime editing has an advantage over base editing in that transversion changes (A to C or T, and G to C or T) in DNA sequences can be induced by designing the pegRNA sequences (21).
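The distinction drawn above between the fixed directions of base editors and the flexibility of prime editing can be summarized in a short Python sketch. It classifies a single-base substitution as a transition or a transversion and lists which editor classes described in this review could, in principle, install it. The function names and the returned labels are hypothetical, and the C-to-G base editor mentioned earlier (25) is deliberately left out to keep the mapping simple.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Substitutions attributed in the text to each base editor class.
CBE_CHANGES = {("C", "T"), ("G", "A")}
ABE_CHANGES = {("A", "G"), ("T", "C")}


def substitution_class(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in PURINES | PYRIMIDINES or alt not in PURINES | PYRIMIDINES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical")
    same_ring_type = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_ring_type else "transversion"


def candidate_editors(ref, alt):
    """List editor classes that could install ref>alt (illustrative only)."""
    substitution_class(ref, alt)  # validates the input
    change = (ref.upper(), alt.upper())
    editors = []
    if change in CBE_CHANGES:
        editors.append("cytosine base editor")
    if change in ABE_CHANGES:
        editors.append("adenine base editor")
    # Prime editing is described as able to install any substitution.
    editors.append("prime editor")
    return editors


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("A", "T")]:
        print(ref, alt, substitution_class(ref, alt), candidate_editors(ref, alt))
```

Whether a listed editor works at a given site still depends on PAM availability and the position of the base within the editing window, which this sketch does not model.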
The process of prime editing occurs in three steps: DNA nicking, DNA polymerization, and repair (21). First, the Cas9 nickase (H840A) within the prime editing fusion protein recognizes the target DNA and introduces a single-strand break in the non-target DNA strand at the designated locus. Next, the 3’ end of the prime editing guide RNA (pegRNA) containing ∼13 nucleotides, with sequence complementarity to the nicked DNA strand, invades the target DNA and forms an RNA-DNA heteroduplex. The RNA-DNA hybrid then serves as a template for DNA polymerization via reverse transcriptase derived from Moloney murine leukemia virus (M-MLV). The extended DNA fragment contains the desired mutant sequences that were designed in the RT template region of pegRNA. Finally, the DNA repair process incorporates the 3’-end DNA flaps with the desired mutant sequences into the genomic DNA.
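The relationship between the nicked strand and the 3’ extension of the pegRNA can be sketched in Python as follows. The sketch assumes that the nick lies 3 nucleotides upstream of the PAM on the PAM-containing strand, which is the usual SpCas9 cleavage position but is not stated in the text, and it uses the ∼13-nucleotide primer-binding length mentioned above as a default. The function and parameter names are hypothetical, and practical pegRNA design involves further considerations not covered here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def design_peg_extension(pam_strand, pam_start, edited_downstream, pbs_len=13):
    """Build the pegRNA 3' extension (RT template followed by primer-binding site).

    pam_strand        -- 5'->3' sequence of the PAM-containing (nicked) strand
    pam_start         -- 0-based index of the first PAM base in pam_strand
    edited_downstream -- desired 5'->3' sequence on that strand, starting at
                         the first base 3' of the nick
    pbs_len           -- primer-binding site length (about 13 nt in the text)

    Assumption: the nick sits 3 nt upstream of the PAM.
    """
    pam_strand = pam_strand.upper()
    edited_downstream = edited_downstream.upper()
    nick = pam_start - 3  # index of the first base 3' of the nick
    if nick - pbs_len < 0:
        raise ValueError("not enough sequence upstream of the nick for the PBS")
    # The PBS pairs with the 3' end of the nicked strand.
    pbs = reverse_complement(pam_strand[nick - pbs_len:nick])
    # The RT template encodes the edited sequence that RT will synthesize.
    rt_template = reverse_complement(edited_downstream)
    to_rna = lambda s: s.replace("T", "U")
    return {
        "pbs": to_rna(pbs),
        "rt_template": to_rna(rt_template),
        "extension": to_rna(rt_template + pbs),  # written 5'->3'
    }


if __name__ == "__main__":
    strand = "ACGTACGTACGTACGTACGT" + "TGG" + "CATCAT"
    # Install a C-to-A change at the first base 3' of the nick.
    print(design_peg_extension(strand, pam_start=20, edited_downstream="AGTTGGCAT"))
```

In this layout the reverse transcriptase starts at the 3’ end of the nicked strand annealed to the primer-binding site and copies the RT template, so the new DNA flap carries the edited sequence.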
The original prime editing scheme, called PE1, enabled precise genome editing with moderate efficiency. In PE2, modifying a number of amino acid residues within the reverse transcriptase increased the genome editing efficiency. The efficiencies of prime editing are further increased in PE3 or PE3b versions by installing additional nicks at the non-edited DNA strand near the prime editing target sites. The nicks at the non-edited strand facilitate the intracellular DNA repair system (base excision repair) to preferentially incorporate the newly synthesized mutant DNA flaps into the genomic DNA. As a result, prime editing enables multiple-base mutations as continuous stretches of DNA sequences as the polymerase-based method incorporates consecutive DNA bases, and the range is not restricted by the editing windows.
Application of prime editing was demonstrated in several biological systems. In human cells, prime editing facilitated conversion of multiple consecutive bases in genomic DNA (21). Prime editing of plant systems (PPE) was demonstrated in rice and wheat (46). The delivery of prime editing molecules in mRNA forms was also shown in human iPS cells (47) and in mouse embryos (48).
Recent DSB-free genome editing methods have improved the accuracy of genome editing compared to conventional techniques. Base editing methods utilize novel approaches and open new possibilities via precise base-by-base corrections. Precise single-base genome editing is particularly useful in addressing point mutations that are associated with phenotypic outcomes. Along with advances in base editing technologies, improved tools for bioinformatic analyses are also being developed (49). Prime editing provides precise and versatile genome editing tools with virtually no constraints in inducing the desired sequence changes: single or multiple base substitutions and defined indels with low rates of NHEJ-mediated random mutations. Analyses showed that prime editing resulted in efficient genome editing outcomes with low rates of unintended indels at the on-target and off-target loci. Notably, prime editing has been shown to result in successful genome editing of relatively short stretches of DNA, measuring less than 100 bp. In some cases, such as transgene insertions, DNA fragments of several thousand bp are required. Currently, large DNA insertions are often conducted by delivering donor DNA and inducing double-strand DNA breaks to incorporate the donor DNA into genomic DNA via HDR. However, DNA insertion by HDR is somewhat less efficient and is prone to unanticipated mutations at the DNA cleavage sites. Hence, it would be of interest to compare the efficiency and precision of prime editing for large DNA insertions with those of conventional HDR-mediated genome editing. In summary, the advances in base-level CRISPR technologies have facilitated unprecedented accuracy and freedom of genome editing that are anticipated to widen the scope of applications in biology and medicine.
This study was supported by grants from the National Research Foundation funded by the Korean Ministry of Education, Science and Technology (NRF-2019R1C1C1006603, NRF-2017R1E1A1A01074529, NRF-2018M3A9H3021707, NRF-2019M3A9H1103783, and 2020R1I1A2075393), the Technology Innovation Program funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea) (20009707), and KRIBB Research Initiative Program (KGM5382113, KGM1052021, KGM4252122).
The authors have no conflicting interests.
A list of applications of base editing and prime editing
|Gene||Target sequence||Base changes||Editor Type||Delivery method||Target organism||Efficiency (%)||Ref|
| || ||C to T||BE3||Plasmid delivery||Mouse||74.9||19|
| || || ||PBE (APOBEC1-XTEN-n/dCas9-UGI)||Agrobacterium-mediated transformation||Plant (rice)||1.61-8.35||42|
| || ||A to G||ABE7.10||Plasmid delivery||Human||29.4-30.1||20|
| || ||T to A||PE3||Plasmid delivery||Human||26-52||21|
| || ||4-bp deletion||PE3b||Plasmid delivery||Human||33|| |
| || ||G to A||PE3||Plasmid delivery||Human||53|| |
| || ||GA to CC||PPE3||PEG-mediated transfection||Plant (rice)||< 0.5||46|
| || ||G to T||PPE3|| ||Plant (wheat)||1.8|| |
| || ||G to A||PPE3b|| || ||2-3|| |
| || ||T to A||PPE3|| || ||2-3|| |
| || ||AC to GG||PE3||RNA delivery||Human (AAVS1-eGFP hiPS cells)||7.5||47|
| || ||G to C/T||PE3||mRNA microinjection||Mouse embryo||1.1-18.5||48|
The table shows the target genes, the target sequences with the desired edits, and the genome editing technique used. Also shown is additional information such as the delivery method, model system, and editing efficiency. The list includes applications of base editor (BE), plant base editor (PBE), prime editor (PE) and plant prime editor (PPE).