Colorectal cancer (CRC) is a significant global health concern with incidence and mortality rates of 10.2% and 9.4%, respectively, in 2020 (1, 2). In Korea, CRC is the third most common cancer and poses a significant health challenge (3). CRC develops through a multistep process that involves the accumulation of various genetic and epigenetic alterations, among which aberrant DNA methylation is a key mechanism (4).
DNA methylation primarily occurs at cytosine-phosphate-guanine (CpG) sites and regulates multiple processes, such as gene expression and genome stability. Aberrations in DNA methylation patterns, including hyper- and hypomethylation, are associated with various diseases, including cancer (5), and have been linked to CRC onset and progression. Hypermethylation of CpG promoters in cancer cells can silence tumor suppressor genes and thereby promote carcinogenesis (6, 7), whereas hypomethylation activates oncogenes and causes genome instability (8). Notably, a subset of CRCs exhibits a distinct feature known as the high-CpG island methylator phenotype (CIMP-H). CIMP-H cancers are characterized by widespread hypermethylation of CpG islands, particularly in promoter regions, leading to gene inactivation (9, 10). This phenotype is linked with distinct clinical and molecular characteristics, including a higher incidence of tumors in the proximal colon, a higher frequency of BRAF mutations, and the presence of microsatellite instability (11-13).
Despite efforts to improve the understanding of the relationship between DNA methylation and CRC, comprehensive clinical and methylome data specific to Korean patients with CRC remain scarce. Such data are essential for identifying and validating CRC-related methylation markers that could enhance disease diagnosis and prognosis. Therefore, this study aimed to construct a comprehensive CRC methylome profile from a large cohort of Korean patients with CRC (n = 172). We determined the methylome distribution and its association with various clinical characteristics. Additionally, we compared our results with those of previous CRC studies to ensure data validity and reliability. Furthermore, we aimed to deepen our understanding of the role of DNA methylation in CRC etiology, prognosis, and treatment in the Korean population.
The study included 172 tumor and 128 adjacent normal samples obtained from 172 individuals diagnosed with colorectal adenocarcinoma at the Asan Medical Center (Table 1). In this cohort, 74 patients were female and 98 were male, with a median age of 59 years (interquartile range: 53-68 years). The cancer was located in the right colon (cecum to the splenic flexure of the transverse colon) in 49 patients (28.5%), the left colon (splenic flexure of the transverse colon to the distal sigmoid colon) in 68 patients (39.5%), and the rectum in 55 patients (32.0%). The majority of tumors were T3 (128 patients, 74.4%), N0 (89 patients, 51.6%), M0 (150 patients, 87.2%), fungating in shape (Borrmann type I or II, 137 patients, 79.7%), and favorably differentiated (well or moderately differentiated, 158 patients, 91.9%). Lymphovascular and perineural invasion were found in 74 patients (43.0%) and 56 patients (32.6%), respectively. One hundred twenty-three patients (71.5%) received chemotherapy after surgery, consisting of single-agent treatment with 5-fluorouracil (5-FU) or capecitabine (n = 39), a combination of 5-FU (or capecitabine) and oxaliplatin (n = 69), or targeted agents with 5-FU and oxaliplatin or irinotecan (n = 15). Twenty-two patients (12.8%) received pre- or post-operative radiotherapy (50.4 Gy in 28 fractions).
To achieve a robust methylome profile, preprocessing was performed using the well-established minfi pipeline (14), followed by batch correction with ComBat (Fig. 1 and Supplementary Figs. 1-3; details in Supplementary Results and Supplementary Methods) (15). Across the 610,674 processed methylation probes, we initially observed a distinct separation between tumor and normal samples in the dimensionality reduction plot (Fig. 2A), illustrated by the principal component distribution (explained variance of PC1: 30.42%). Next, we investigated the grand mean methylation level of each individual sample in the tumor and normal groups. The overall methylation levels were slightly higher in the normal sample group (tumor: 0.5674, normal: 0.5855 [P < 0.0001]; Fig. 2B and Supplementary Fig. 4A). We subsequently identified 19,156 differentially methylated positions (DMPs) between tumor and normal samples (Supplementary Tables 1 and 2). Hypomethylated probes (14,011 probes) outnumbered hypermethylated probes (5,145 probes) (Fig. 2C, D). Among the hypermethylated probes, 4,250 (82.6%) were located in the genic region (promoter: 2,956, gene body: 1,645). Among the hypomethylated probes, 7,263 (51.8%) were located in the genic region (promoter: 2,428, gene body: 5,154) (Fig. 2C and Supplementary Fig. 4B). To compare the regional distribution of DMPs in more detail, we calculated the odds ratio (OR) for the enrichment of hyper- and hypomethylated probes across various genomic annotations, including gene promoter regions, body regions, and islands or shores. In the promoter-like regions (TSS1500, TSS200, 5’ untranslated region [UTR], and first exon), the ORs of hypermethylated probes were 1.35, 3.90, 2.36, and 4.43, respectively (Fig. 2E, left; P < 0.0001). Similarly, hypermethylated probes were highly enriched in the CpG island region (Fig. 2E, right), with an OR of 15.98 (P < 0.0001).
In contrast, hypomethylated probes in tumor samples were predominantly located in open-sea regions (odds ratio: 4.69) (Fig. 2E, right), which are considerably distant from the CpG island regions. We further validated the methylation patterns through comparison with colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) from The Cancer Genome Atlas (TCGA) (Supplementary Fig. 5) as described in Supplementary Results.
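The enrichment odds ratios reported above come from 2x2 contingency tables of probe counts. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names, the choice of background (all remaining probes), the Woolf confidence interval, and the counts are illustrative assumptions, since the per-region counts and the exact test are not given in the text.

```python
import math


def odds_ratio(hit_in, hit_out, bg_in, bg_out):
    """Odds ratio for a 2x2 table.

    hit_in / hit_out: DMPs of one direction inside / outside the region.
    bg_in / bg_out:   background probes inside / outside the region.
    """
    return (hit_in * bg_out) / (hit_out * bg_in)


def woolf_ci(hit_in, hit_out, bg_in, bg_out, z=1.96):
    """Approximate 95% confidence interval on the log-odds scale (Woolf method)."""
    log_or = math.log(odds_ratio(hit_in, hit_out, bg_in, bg_out))
    se = math.sqrt(1 / hit_in + 1 / hit_out + 1 / bg_in + 1 / bg_out)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts for illustration only (not taken from the study).
hyper_island, hyper_elsewhere = 300, 200
rest_island, rest_elsewhere = 1000, 8500

or_island = odds_ratio(hyper_island, hyper_elsewhere, rest_island, rest_elsewhere)
ci_low, ci_high = woolf_ci(hyper_island, hyper_elsewhere, rest_island, rest_elsewhere)
print(f"OR = {or_island:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An OR above 1 indicates that the probes of interest are over-represented in the region relative to the background; a P value would typically come from a Fisher's exact or χ2 test on the same table.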
To assess the prevalence of CpG island methylator phenotype (CIMP), we used a CIMP probe set (4,322 probes) based on 258 previously identified CIMP gene markers. From the 4,322 probes, we selected 1,220 highly variable sites (standard deviation > 0.15). Using the K-means algorithm (100 iterations) with the methylation levels of CIMP marker probes, we divided the tumor samples into three clusters and categorized each group as CIMP-high (CIMP-H), CIMP-low (CIMP-L), or non-CIMP, based on the mean methylation level of each cluster (Fig. 3A, B). Based on these criteria, we identified 52 (30.2%) CIMP-H, 83 (48.2%) CIMP-L, and 37 (21.5%) non-CIMP tumors, and their mean CIMP marker probe methylation levels exhibited significant pairwise differences (analysis of variance [ANOVA] with a post-hoc test, P < 0.001) (Fig. 3B). With regard to microsatellite instability (MSI) status, 12 (24%) patients in the CIMP-H group displayed high MSI (MSI-H) status (Fig. 3C). In contrast, only 5% of the patients in each of the CIMP-L and non-CIMP groups exhibited MSI-H status. This result showed a significant enrichment of MSI-H patients within the CIMP-H group (χ2 test, P < 0.01). Moreover, a significantly higher prevalence of MLH1 silencing was observed in the CIMP-H group than in the CIMP-L and non-CIMP groups (χ2 test, P < 0.05) (Fig. 3D). CIMP status was also correlated with tumor location within the large intestine and with patient age. In the CIMP-H group, 22 tumors (42%) were found in the right colon, and 15 tumors each were found in the left colon and rectum. In the CIMP-L and non-CIMP groups, 18 (22%) and 9 (24%) tumors were found in the right colon, respectively. These findings revealed a higher enrichment of right colon tumor samples in the CIMP-H group (χ2 test, P < 0.01).
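The CIMP stratification described above (variance filter, K-means with three clusters, ranking clusters by mean methylation) can be sketched as follows. This is an illustrative, standard-library-only sketch under stated assumptions: the toy beta values, the function names, the use of the sample standard deviation, and the deterministic initialization (lowest, middle, and highest samples by mean methylation) are not from the study, which does not describe its initialization.

```python
import statistics


def select_variable_probes(beta, sd_cutoff=0.15):
    """Indices of probes (columns) whose SD across samples exceeds the cutoff."""
    return [j for j, col in enumerate(zip(*beta)) if statistics.stdev(col) > sd_cutoff]


def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k=3, n_iter=100):
    """Lloyd's K-means with a deterministic start (an assumption of this sketch)."""
    order = sorted(range(len(points)), key=lambda i: statistics.mean(points[i]))
    centers = [list(points[order[(len(points) - 1) * c // (k - 1)]]) for c in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(
                [statistics.mean(col) for col in zip(*members)] if members else centers[c]
            )
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


def name_clusters(labels, centers):
    """Rank clusters by mean methylation: lowest is non-CIMP, highest is CIMP-H."""
    ranked = sorted(range(len(centers)), key=lambda c: statistics.mean(centers[c]))
    names = dict(zip(ranked, ["non-CIMP", "CIMP-L", "CIMP-H"]))
    return [names[lab] for lab in labels]


# Toy beta values (rows: tumor samples, columns: CIMP marker probes); hypothetical.
beta = [
    [0.10, 0.12, 0.08, 0.80], [0.12, 0.10, 0.11, 0.81], [0.09, 0.11, 0.10, 0.79],
    [0.45, 0.50, 0.48, 0.80], [0.50, 0.47, 0.52, 0.82], [0.48, 0.52, 0.50, 0.80],
    [0.85, 0.90, 0.88, 0.81], [0.90, 0.86, 0.91, 0.80],
]
keep = select_variable_probes(beta)           # the invariant last probe is dropped
points = [[row[j] for j in keep] for row in beta]
labels, centers = kmeans(points)
calls = name_clusters(labels, centers)
print(keep, calls)
```

In practice a library implementation with multiple random restarts would be more robust than a single deterministic start; the point here is only the order of operations: filter probes, cluster samples, then name clusters by their mean methylation.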
When comparing age distribution with CIMP status, the mean age of CIMP-H patients was 61.8 years and that of CIMP-L patients was 60.8 years, slightly higher than that of the non-CIMP group (mean age: 55.8 years) (post-hoc test [CIMP-H vs. non-CIMP], P < 0.05) (Supplementary Fig. 6A). We then explored the correlation between various CRC characteristics and CIMP status. Our results revealed no significant associations of CIMP status with sex, tumor size, preoperative carcinoembryonic antigen (CEA) level, Borrmann type, or T and M stages. Within the non-CIMP group, a larger proportion exhibited perineural invasion (χ2 test, P = 0.08) and lymphovascular invasion, including instances of N2 stage (χ2 test, P < 0.05) (Supplementary Fig. 6). The differences in both overall and disease-free survival according to CIMP status were not statistically significant. However, we did observe a slight trend toward poorer prognosis in the CIMP-H group compared to the non-CIMP group (Supplementary Fig. 7).
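The χ2 comparisons of CIMP status against categorical features can be illustrated with the right-colon counts reported in the text (22 of 52 CIMP-H, 18 of 83 CIMP-L, and 9 of 37 non-CIMP tumors). The sketch below is an assumption-laden illustration: it pools the CIMP-L and non-CIMP groups into a single row and applies a Pearson χ2 test without continuity correction, whereas the layout of the table actually tested is not stated, so the result need not match the reported P value.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and P value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the text; pooling CIMP-L and non-CIMP is an assumption of this sketch.
cimp_h_right, cimp_h_total = 22, 52
other_right, other_total = 18 + 9, 83 + 37

chi2, p_value = chi2_2x2(
    cimp_h_right, cimp_h_total - cimp_h_right,
    other_right, other_total - other_right,
)
print(f"chi2 = {chi2:.3f}, P = {p_value:.4f}")
```

A table with more than two rows or columns (for example, three CIMP groups by three tumor locations) has more degrees of freedom and needs the general χ2 distribution rather than the one-degree-of-freedom shortcut used here.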
Here, we provide the DNA methylation profiles of a large cohort of Korean patients with CRC in conjunction with an additional cohort collected from another Korean hospital (16), offering valuable insights into the significance of aberrant DNA methylation in CRC. The 172 Korean CRC methylome profiles, together with 128 matched normal samples, confirmed and extended prior findings. First, we compared the overall methylation patterns of tumor and normal samples. We found that hypermethylation in CRC predominantly occurs in CpG islands and promoter regions. These findings are consistent with those of previous studies showing that DNA hypermethylation in promoter CpG islands leads to tumor suppressor gene silencing, thereby contributing to carcinogenesis (6, 7). Next, we compared the Korean CRC methylome with the CRC methylomes of other ethnic groups reported in previously published studies. Specifically, we assessed the overlap with a substantial pool of 450K probes from the TCGA CRC methylome. We observed a significant correlation between the overlapping probes in the TCGA and Korean CRC datasets. This correlation was particularly strong among the DMPs identified in this study. To further validate our findings, we considered previously identified diagnostic methylation markers for CRC, which comprised 15 genes (17-23). We found significant differences in promoter methylation between tumor and normal samples for 13 of these 15 marker genes in our Korean cohort, suggesting the necessity of identifying more specific Korean methylation markers with consideration of various CRC subtypes.
Recent studies have investigated the role of CIMP in various cancer types, including CRC. In CRC, CIMP has been linked with several molecular alterations, including BRAF mutations and microsatellite instability (24). CRC tumors with CIMP also exhibit distinct clinical features and have been associated with specific locations in the colon (e.g., proximal colon) and with specific histological features. Similarly, in gastric cancer, CIMP is associated with MSI-H with MLH1 silencing, and is significantly associated with the presence of Helicobacter pylori and Epstein-Barr virus infection (25, 26). In glioblastoma (GBM), CIMP has been associated with the IDH1 mutation and is observed in a subtype of tumors with generally better prognoses (27, 28). CIMP-positive GBM is associated with a distinct set of methylated genes and has a different clinical and molecular profile compared to CIMP-negative GBM. Although CIMP status has implications for prognosis and potentially for treatment, further research is required to fully understand the role of CIMP in these different types of cancer. In our Korean CRC cohort, the stratification of patients with CRC based on CIMP status reaffirmed the correlations between CIMP status and clinicopathological characteristics (10, 11, 13). CIMP-H tumors were significantly associated with MSI-H status, right-sided tumors, and older age. Furthermore, we investigated various cancer characteristics according to CIMP status and additionally confirmed that the CIMP-H group had more samples with epigenetically silenced MLH1, whereas the non-CIMP group had more perineural invasion (PNI)- and lymphovascular invasion (LVI)-positive samples. CIMP-positive tumors are often associated with methylation of the MLH1 gene promoter, which leads to the silencing of this important DNA mismatch repair (MMR) gene (29, 30).
This is a well-established characteristic of a subset of CRCs, leading to MSI-H and a distinct clinical and pathological profile. However, the association between CIMP status and PNI or LVI is not as well established and may vary across studies (31-34). Given that PNI and LVI are typically associated with more advanced and aggressive disease (35) and may indicate a more complex cancer status, their relationship with CIMP status warrants further investigation, including more comprehensive molecular analysis. Our rich CRC methylome profile improves the understanding of CIMP-H CRC as a distinct molecular subtype with specific clinical and molecular features. Through the development of efficient and accessible clinical assays, such as high-resolution melting curve analysis (36), we believe our findings have great potential to guide treatment strategies. Overall, this study broadens current knowledge of the CIMP-H CRC subtype, which allows for improved patient stratification.
Despite these notable findings, this study has several limitations. Although we included a large methylation cohort of Korean patients with CRC, further studies involving other ethnic groups are necessary for the generalization or Korean-specific characterization of the results with multi-omics profiles. Additionally, our CRC methylome focused on investigating the overall Korean CRC landscape from bulk tissue analysis, potentially leading to an oversimplified interpretation of the complex tumor system. Single-cell analysis would be a compelling approach, as it could unravel the heterogeneity within tumors and delineate specific methylation patterns associated with different cell types. Moreover, treatment strategies vary greatly between patients based on their individual health status, disease stage, and other clinical factors. Examination of post-treatment methylation profiles could be informative, but such analysis would require a comprehensive, longitudinally collected dataset to reflect temporal changes across various treatments.
In conclusion, the current study provides a DNA methylation landscape of CRC in the Korean population. Our CRC methylome profile revealed hyper- and hypomethylation events in specific genomic regions, emphasizing the vital role of aberrant DNA methylation in the development and progression of CRC. Furthermore, by stratifying patients based on their CIMP status, we note the potential of DNA methylation as an epigenetic hallmark for patient classification and personalized treatment strategies. Our findings highlight the need for further investigations into CRC management and pave the way for the development of more effective diagnostic, prognostic, and therapeutic interventions.
This study was approved by the Asan Medical Center Institutional Review Board (2017-1350) and the Yonsei University Institutional Review Board (approval number: 7001988-201910-BR-727-02), in accordance with the Declaration of Helsinki.
Detailed materials and methods for EPIC array generation and statistical analyses are provided in the Supplementary Methods.
This research was supported by the Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Ministry of Science & ICT (grant number: NRF-2017M3A9A7050614). It was additionally supported by a grant from the National Research Foundation of Korea (NRF-2020M3A9I6A01036057).
The authors have no conflicting interests.
Table 1. Clinical characteristics of the study participants
|Variables|No. of patients (%) or values as median (IQR) (n = 172)|
|---|---|
|Sex, female/male|74/98 (43.0/57.0)|
|Age at operation, years|59 (53-68)|
|Preoperative CEA level|2.5 (1.43-6.28)|
|Tumor size, cm|4.75 (3.5-6.5)|
|FL or capecitabine|39 (22.7)|
|FOLFOX or XELOX|69 (40.1)|
|Targeted agents with FOLFOX or FOLFIRI|15 (8.7)|
IQR, interquartile range; CEA, carcinoembryonic antigen; WD, well-differentiated; MD, moderately-differentiated; PD, poorly-differentiated; MSS, microsatellite stable; MSI-L, microsatellite instability low; MSI-H, microsatellite instability high; FL, 5-fluorouracil with leucovorin; FOLFOX, combination of 5-FU and oxaliplatin; FOLFIRI, combination of 5-FU and irinotecan.