Expression of a gene requires perfect correlation between the message contained in the genomic DNA sequence and the resulting RNA or protein. Of course, a given genomic sequence can lead to different RNA or protein isoforms, notably through alternative splicing events. This type of event is extremely well regulated and contributes to the genetic program. On the other hand, some changes can lead to synthesis of aberrant RNA or protein products, which are generally detected by quality control mechanisms. Among these, nonsense-mediated mRNA decay (NMD) is likely the most studied (1-4). NMD is an mRNA surveillance mechanism found in all eukaryotes studied to date: animals, plants, and fungi (5-7). It acts as a premature termination codon (PTC) detector. Once an mRNA carrying a PTC is detected, it is processed by a dedicated machinery which degrades it very quickly in order to prevent its translation to a truncated protein.
NMD thus enables cells to eliminate mRNAs liable to cause synthesis of a truncated protein. Such proteins are very often nonfunctional, or they may have acquired a function that is deleterious for the cell. The efficiency of NMD is such that the level of a PTC-carrying mRNA can be reduced by 75-100%. Interestingly, even when the PTC-carrying mRNA is present at a non-zero level, it is not translated into protein: during its translation, the proteasome degrades the peptide being synthesized, notably thanks to one of the central NMD factors, the UPF1 protein (8). This protein degradation, however, is compatible with presentation at the cell surface, by the major histocompatibility complex class I, of peptides 8 to 12 amino acids long derived from peptide synthesis during the pioneer round of translation (9, 10). This is precisely the mechanism exploited in the development of anti-cancer therapeutic approaches using NMD inhibition to induce synthesis of neo-epitopes (11, 12).
NMD appears as a formidable mechanism for degrading PTC-carrying mRNAs, but that is not all: certain genes have diverted this surveillance mechanism into a very powerful gene regulator (13, 14). This implies that NMD must also be tightly regulated by different mechanisms (1), notably in a tissue-specific manner (15). For example, COL10A1 mRNA carrying a PTC is strongly degraded by NMD in cartilage cells but not in non-cartilage cells (16). Cell status also influences NMD efficiency. For example, the pro-apoptotic gene GADD45b has its expression repressed by NMD in viable cells, but not in cells undergoing apoptosis (17-19). It is thus easy to understand that NMD must be activated or inhibited at specific times or locations, and this involves extremely complex machinery. Since its discovery in 1979 (5), NMD has been dissected, and although there are still gray areas, it is certainly the best-studied quality control mechanism. This review aims to take stock of the knowledge accumulated on mRNA degradation through NMD in human cells. By presenting the most convergent and established data, we hope to make this mechanism accessible, although in certain aspects it still remains quite confusing.
For an mRNA to be degraded by NMD, the first step is its recognition as a PTC-carrying mRNA. PTCs arising from a nonsense mutation, from a frameshift mutation, or from a modification of the splicing profile are recognized during translation by the ribosome. As a reminder, a nonsense mutation is a point mutation in DNA that changes a codon specifying an amino acid into a translational stop codon. A frameshift mutation is caused by the insertion or deletion, in the open reading frame of a gene, of a number of nucleotides that is not a multiple of three. The ribosome, by pausing, will trigger initiation of NMD. Fig. 1 illustrates what happens at a stop codon, according to whether it is a physiological one or a PTC. When the ribosome arrives at a stop codon, no tRNA enters the A site. Instead, the site accepts the translation termination complex, within which the eukaryotic Release Factor 1 (eRF1) mimics the tRNA. At the physiological stop codon at the end of the open reading frame, the polyA binding protein C1 (PABPC1), bound to the polyA tail, recruits the translation termination complex to the A site of the ribosome (Fig. 1). At a PTC, PABPC1 is distant from the ribosome located on the stop codon. Recruitment of the translation termination complex occurs through an NMD factor, the UPF3X protein (also called UPF3B), and indirectly through another NMD factor, the UPF1 protein, shown to accumulate in the 3’UTR of mRNAs (20, 21). The role of UPF3X would be to slow down recognition of the PTC by eRF1 and to promote release of the nascent peptide and the ribosome (22). Delayed PTC recognition is certainly very important, and could lead to a kinetic difference between translation termination at a physiological stop codon via PABPC1 and translation termination at a PTC via UPF proteins. After departure of the ribosome, UPF2 may join UPF3X to stimulate the 5’-to-3’ helicase activity of UPF1 (23, 24).
This promotes removal of protective proteins from the downstream region of the PTC, the ribosome having already removed the proteins upstream of the PTC when reading the mRNA. The exposed mRNA then becomes a substrate for RNases as described in the next paragraph.
An mRNA can be degraded via different pathways, including degradation from either the 5’ or the 3’ end or through endonucleolytic cleavage generating free 5’ and 3’ ends for exoribonucleases (25-29). NMD involves all these degradation pathways, as if to ensure that as few PTC-carrying mRNAs as possible escape degradation (Fig. 2). Activation of the 5’-to-3’ pathway is initiated by interaction of the phosphorylated NMD factor UPF1 with the SMG5/SMG7 or SMG5/PNRC2 heterodimer (30). The involvement of PNRC2 in NMD is controversial, however: this protein might simply have a general role in the decapping step, since knocking it down does not increase the level of NMD substrates (31). Both SMG5/SMG7 and SMG5/PNRC2 interact with the decapping complex, particularly with DCP1 and DCP2, so that the cap is removed and a free 5’ end, accessible to exoribonucleases such as XRN1, is generated.
The SMG5/SMG7 heterodimer has also been shown to interact with the exosome (25) and with the CCR4-NOT deadenylase complex (32). In the latter case, direct interaction has been shown between the C-terminal proline-rich region of SMG7 and the POP2 catalytic subunit of the CCR4-NOT complex. The involvement of the exosome, for its part, has been shown by demonstrating protein interactions between UPF factors and exosome subunits such as RRP4 and RRP41, and more recently by demonstrating the involvement of DIS3L2 (RRP44), a subunit of the exosome, in NMD (33, 34).
The endonucleolytic degradation pathway is induced by the SMG6 protein which, in interaction with the phosphorylated UPF1 protein, induces a cleavage in the vicinity of the PTC (28, 29). The consequence is the appearance of an unprotected 5’ end and an unprotected 3’ end, which will be targeted, respectively, by exoribonucleases such as XRN1 and by the exosome. It is impossible at the moment to exclude either the hypothesis that all three degradation pathways might be activated on the same PTC-carrying mRNA or the hypothesis that each pathway is activated exclusively. A recent study may help answer this question: it suggests that all these degradation pathways are closely linked, since the presence of the SMG5/SMG7 heterodimer is necessary for the endonucleolytic activity of SMG6 (35). Finally, to complete the description of this degradation pathway, it seems that the departure of the ribosome from the mRNA occurs later than initially thought, since the ribosome is still detected in the vicinity of the PTC when endonucleolytic cleavage by the SMG6 protein takes place (36).
This is certainly the most debated part of the NMD mechanism (37-40). There exist at least two competing models of NMD activation for cells of higher eukaryotes, based on published experimental data and described in the following paragraphs. These models attempt notably to explain how a stop codon is recognized as a PTC and how proteins such as UPF1 are recruited to the translation termination complex to activate NMD.
The first model relates the position of the PTC to splicing events having occurred downstream of this position. By moving the position of a stop codon on a construct coding for triosephosphate isomerase, it was possible to transform the physiological stop codon located in the last exon into a PTC, when an intron was introduced more than 50-55 nucleotides downstream of the stop codon position (41). The link between NMD and splicing has been confirmed by multiple studies and represents an autoregulatory pathway for certain genes, such as the splicing factor SRSF2 gene (SC35), which activates splicing in the 3’UTR region of its own mRNA so as to transform the physiological stop codon into a codon recognized as a PTC (42). The splicing reaction results in deposition of a protein complex called the EJC, for Exon Junction Complex. This complex is deposited 20-24 nucleotides upstream of exon-exon junctions as a consequence of a splicing event (43, 44). It is the splicing factors Complexed With Cef1 (CWC) 22 and 27, present in the spliceosome, that recruit the EJC core proteins (eIF4A3, MAGOH, Y14 and MLN51) and position them upstream of the splicing event (45, 46). EJC composition evolves from the time of its deposition on the mRNA, at the end of splicing, to the moment of translation. This enables it to play several roles, notably a role in NMD. The view that this complex is involved in NMD is based on diverse experimental data. First, downregulation of EJC components, notably with siRNA, has been shown to inhibit NMD. Conversely, EJC components can induce accelerated degradation when they are tethered to an mRNA (47, 48). Second, interactions between EJC components and NMD factors have been demonstrated, notably by immunoprecipitation (49-53). This recruitment of NMD factors, particularly UPF3X, by the EJC appears to occur thanks to the protein interactor of little elongation complex ELL subunit 1 (ICE1) (54).
Following recruitment of UPF3X by the EJC, the UPF2 protein joins the complex before the possible arrival of UPF1, if a PTC is detected. The molecular details of UPF1 recruitment are not yet fully understood. According to one study, UPF1 is recruited to the EJC as a complex, the SURF complex, with the proteins SMG1, SMG8, SMG9, eRF1, and eRF3 (55). According to another, UPF1 interacts directly with cap-located CBP80, and CBP80 then facilitates interaction between UPF1 and EJC-located UPF2 (56). To reconcile these two studies, we could imagine that CBP80 places UPF1 not only on the EJC but also on the SURF complex (Fig. 3A). This possibility is supported by the fact that the ARS2 protein, which interacts with the CBP80/20 heterodimer, also interacts with UPF1, SMG1, and eRF1, thus facilitating interaction of the SURF complex with the proteins carried by the cap (57). The fact that these studies show an interaction between the SURF complex and the CBP80/20 proteins on the cap indicates that this event takes place during the pioneer round of translation (10). Once positioned on the EJC, UPF1 is then phosphorylated by the SMG1 kinase (58, 59). Until recently, SMG1 was the only kinase known to phosphorylate UPF1, but two studies have now shown that the protein kinase AKT1 can also phosphorylate UPF1 (60, 61). It has not yet been clarified, however, whether AKT1 can replace SMG1 by acting identically or whether its intervention in NMD activation results from another cascade of interactions.
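The positional rule at the heart of the EJC-dependent model described above can be summarized in a short sketch. It is only an illustration under stated assumptions, not a validated NMD predictor: the function names, the 0-based indexing, the way the distance is measured (from the end of the stop codon to the exon-exon junction), the default threshold of 50 nucleotides (the text gives 50-55), and the two-exon transcript are all hypothetical choices made for this example.

```python
# Illustrative sketch only: names, indexing conventions and the default
# threshold are assumptions made for this example, not a published tool.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons written in the DNA alphabet


def first_in_frame_stop(mrna, start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def ejc_window(junction):
    """Approximate EJC position: 20-24 nt upstream of an exon-exon junction.

    `junction` is the index of the first base of the downstream exon.
    """
    return (junction - 24, junction - 20)


def is_predicted_ptc(stop_index, junctions, threshold=50):
    """Apply the positional rule: a stop codon is predicted to behave as a PTC
    if at least one exon-exon junction lies more than `threshold` nucleotides
    downstream of it (distance measured from the end of the stop codon, which
    is an assumption of this sketch)."""
    stop_end = stop_index + 3
    return any(junction - stop_end > threshold for junction in junctions)


# Hypothetical two-exon transcript: ATG, glutamine codons (CAG), one junction.
exon1 = "ATG" + "CAG" * 20
exon2 = "CAG" * 30 + "TAA" + "GCT" * 5
wild_type = exon1 + exon2
junctions = [len(exon1)]

# Nonsense mutations: a single C-to-T change turns CAG (Gln) into TAG (stop).
early_nonsense = wild_type[:3] + "T" + wild_type[4:]
late_nonsense = wild_type[:45] + "T" + wild_type[46:]

for name, seq in (("wild type", wild_type),
                  ("early nonsense", early_nonsense),
                  ("late nonsense", late_nonsense)):
    stop = first_in_frame_stop(seq)
    print(name, stop, is_predicted_ptc(stop, junctions))
```

In this toy transcript, the physiological stop codon lies in the last exon and is not flagged. The early nonsense mutation lies 57 nucleotides upstream of the junction and is flagged as a PTC, whereas the late one, only 15 nucleotides upstream of the junction, is not. Real transcripts would require the actual exon coordinates, and exceptions to this rule exist.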
In the EJC-independent model, the EJC is not required to induce the NMD response, nor is any splicing event required downstream of a PTC. What comes into play is competition between the UPF proteins located downstream of a stop codon and PABPC1 attached to the polyA tail of the mRNA for recruitment of the eRF1 and eRF3 proteins, leading to translation termination. Thus, the longer the 3’UTR, the greater the number of UPF proteins bound downstream of the stop codon and the greater the probability that the translation termination complex will be recruited by the UPF proteins and not by PABPC1. Moreover, PABPC1 is located further from the ribosome paused on the PTC and therefore has less chance of recruiting the translation termination complex (21). This model is similar to the mode of NMD activation in organisms other than humans, such as yeast, Drosophila, and the worm C. elegans (62, 63).
This model, however, suffers from several experimental contradictions, as it has clearly been demonstrated that a splicing event in the 3’UTR of an mRNA transforms a physiological stop codon into a PTC. Yet a splicing event tends to bring PABPC1 closer to the physiological stop codon, which should oppose induction of NMD. Another paradox stems from the fact that, according to this model, the closer the PTC to the translation initiation codon, the greater the efficiency of NMD, since PABPC1 is very far away and the probability that termination of translation will be induced by UPF proteins is higher (Fig. 3B). Yet no gradient in NMD efficiency has generally been observed in higher eukaryotic cells (41). On the other hand, the EJC-independent NMD activation model fits very well with certain experimental data that cannot be explained by the EJC-dependent model. For example, PTCs located only about fifteen nucleotides upstream of the last splicing event elicit NMD of T-cell receptor (TCR) β and immunoglobulin mRNAs (64). It must be remembered, however, that these mRNAs very often carry PTCs because of rearrangement of the VDJ domains at the gene locus and that the corresponding mRNAs must absolutely be degraded so as not to induce an erroneous immune response. This is also almost certainly the reason why NMD of these mRNAs appears much more efficient than that of mRNAs from genes not involved in the immune response (65). Everything thus seems to indicate that the mRNAs coding for the T-cell receptor or for immunoglobulins belong to a particular category of mRNAs requiring such high-efficiency NMD that the process can also be activated via an EJC-independent mechanism. Although sequences bordering introns in TCR RNA or sequences in the V segment in immunoglobulin RNA have been identified (65, 66), the molecular mechanism inducing this greater NMD efficiency is still relatively poorly understood.
Overall, the EJC-independent mechanism of NMD activation has been less studied than the EJC-dependent model. A distinguishing feature of EJC-independent NMD activation is that it can occur during any translation round, not just the first. It has indeed been shown that mRNAs whose cap is bound by the eIF4E protein can be subject to NMD (67, 68).
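The competition between UPF proteins and PABPC1 that underlies the EJC-independent model can be illustrated with a deliberately simple toy calculation. This is a qualitative sketch, not a measured or published relationship: the functional form, the parameter names (`upf_per_100_nt`, `pabpc1_weight`) and their default values are arbitrary assumptions chosen only to reproduce the trend stated in the text, namely that a longer stretch of mRNA downstream of the stop codon favors termination driven by UPF proteins.

```python
# Toy model only: the formula and all parameter values are hypothetical
# assumptions for illustration, not experimentally derived quantities.


def upf_termination_probability(downstream_length_nt,
                                upf_per_100_nt=1.0,
                                pabpc1_weight=5.0):
    """Return a toy probability that translation termination is driven by
    UPF proteins rather than by PABPC1.

    downstream_length_nt: distance between the stop codon and the polyA tail.
    upf_per_100_nt: assumed UPF occupancy per 100 nt of downstream sequence.
    pabpc1_weight: assumed constant competing weight of PABPC1.
    """
    if downstream_length_nt < 0:
        raise ValueError("downstream_length_nt must be non-negative")
    if upf_per_100_nt < 0 or pabpc1_weight <= 0:
        raise ValueError("weights must be positive")
    # UPF weight grows linearly with the length available for UPF binding.
    upf_weight = upf_per_100_nt * downstream_length_nt / 100.0
    return upf_weight / (upf_weight + pabpc1_weight)


# Compare a short, an intermediate and a long downstream region.
for length in (100, 500, 2000):
    print(length, round(upf_termination_probability(length), 3))
```

By construction, the value rises steadily as the stop codon moves away from the polyA tail. The sketch therefore also reproduces the gradient in NMD efficiency that, as noted above, has generally not been observed in higher eukaryotic cells, which is one of the contradictions raised against this model.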
The steps following UPF1 recruitment are a priori similar in the two models. An important question is why the cell has evolved two pathways of NMD activation. At present, everything suggests that these two pathways are complementary, and on the basis of the available experimental data we cannot exclude that EJC-independent NMD activation might be limited to certain specific mRNAs or be initiated after EJC-dependent NMD.
NMD is certainly the most studied of all the quality control mechanisms taking place during gene expression. Inevitably it is also the most debated, given the mass of knowledge accumulated on NMD and the great diversity of models used to study it. Generally speaking, everyone agrees that NMD is not only an mRNA surveillance mechanism that rapidly detects and degrades mRNAs carrying a PTC, but also a gene regulation pathway. The estimated proportion of genes, particularly human, using NMD to regulate their expression ranges from 5 to 10% (13). One should remember, however, that many of these genes might be only indirectly regulated by NMD. In addition, this estimate was based on the use of siRNA, which can possibly lead to off-target effects. A small-scale study has led to the conclusion that this percentage could be an overestimate (69).
One of the greatest complexities of NMD lies in the number of different proteins involved in the mechanism, whether for PTC recognition or for mRNA degradation. Factors thought to be central may appear non-essential, at least for some NMD reactions. For example, it has been clearly shown that whether certain NMD factors, such as UPF3X and UPF2, are necessary or not depends on the composition of the EJC (70). Certain EJC proteins are likewise thought to be required in some but not all NMD reactions. For example, the protein MLN51/CASC3, shown to belong to the core of the EJC (71, 72), might not constitutively be a component of this complex (73, 74). The role of the EJC itself has been questioned, particularly in the EJC-independent model of NMD activation: it might simply act as an activator of NMD but not be absolutely necessary for PTC detection. All this information suggests that NMD is a very flexible and certainly evolving process, mediated by protein complexes whose composition varies over time, according to kinetics that have not yet been completely established (73). This variability of the protein composition of NMD complexes might also reflect modes of regulation that need to be further investigated. For example, SMG1 has long been presented as the only kinase to phosphorylate the UPF1 factor, but very recently a second has been discovered: the protein kinase AKT1. This opens new possibilities for regulation, particularly in cancer cells, where AKT1 is very often overexpressed (60, 61). Clearly the NMD mRNA surveillance mechanism, although theoretically simple in its roles and activation, appears much more complex in its mode of operation. How it interacts with the various cellular metabolic pathways remains to be studied in much more detail. Such data will certainly make it possible to clarify the parameters necessary for NMD activation and the functional specificities linked to the composition of the protein complexes involved.
NMD appears to be a central player in numerous biological processes and in the development of pathologies, owing to its involvement in the elimination of mutant mRNAs carrying PTCs and in the regulation of numerous genes. Concerning the biological processes in which NMD plays an essential role, we can first mention embryonic development, since the absence of expression of the UPF1 gene leads to the death of the embryo at 3.5 days p.c. (75). The central nervous system seems particularly dependent on NMD, since neurological disorders are observed in patients in whom the NMD factors UPF3X, UPF3, UPF2, SMG6, RNPS1 or eIF4A3 are mutated (76). The differentiation of myoblasts into myotubes has been shown to require inhibition of NMD in order to allow the expression of myogenin (77). The involvement of NMD in many other mechanisms has been described in recent reviews (1, 3). In a pathological context, NMD can also play a determining role either by preventing the pathology or, on the contrary, by inducing it. In fact, around 10% of cases of genetic diseases are linked to the presence of a nonsense mutation (78). The consequence of this mutation is the absence of gene expression due to NMD. However, depending on the position of the PTC in the reading frame, some truncated proteins, if synthesized in the absence of NMD, could partially or completely retain the function of the wild-type protein. For example, in Duchenne muscular dystrophy, all nonsense mutations located from exon 71 onward could lead to a functional dystrophin (79). Inhibition of NMD also represents an interesting anti-cancer therapeutic approach, particularly for inducing the expression of neo-antigens on the surface of tumor cells (11, 12, 80). Finally, inhibiting NMD could make PTC readthrough more effective by increasing the quantity of substrate mRNA for readthrough (81, 82).
Therefore, although so far no NMD inhibitor has reached the clinical trial phase, NMD inhibition could represent a future therapeutic development.
JC is supported by Inca. FL is supported by funding from ANR, Inca, La Ligue contre le cancer, Vaincre la mucoviscidose and the Association Française contre les Myopathies. The Canther Laboratory is part of the ONCOLille institute. This work is supported by a grant from Contrat de Plan Etat-Région CPER Cancer 2015-2020.
The authors have no conflicting interests.