Medicinal herbs have been used extensively for the treatment of various ailments in ancient medical traditions of China (traditional Chinese medicine, TCM), Korea (traditional Korean medicine), Japan (Kampo medicine), India (Ayurveda), Indonesia (Jamu), North America (phytotherapy), and Europe (herbalism) (1). Standardized herbal extracts (hereafter referred to as herbs) and herbal formulae that blend several herbs into a single formula are composed of a variety of bioactive chemical compounds. They provide fertile ground for modern drug development by supplying therapeutic leads. The antimalarials quinine and artemisinin, the antipyretic analgesic aspirin, and arsenic trioxide for leukemia are examples of modern drugs originally used in traditional medicine (2). In general, the pharmacological effects of herbal medicine are achieved by active ingredients that simultaneously modulate multiple biomolecules in the human body in an additive or synergistic manner. This multi-component nature of herbal medicine has been considered advantageous over single-target drugs for treating complex multifactorial disorders such as cancer and nervous system diseases (3, 4). Although the therapeutic effects of herbal medicine have been clinically verified in traditional settings for thousands of years, its unknown mode of action on the human body hinders its application and development.
Along with great progress in systems biology, systems pharmacology or network pharmacology approaches have been introduced to decipher the complex mechanisms of action (MOAs) of drugs in networks of biomolecules that interact with the drugs (5-7). These approaches have been extensively applied to explore, from a holistic perspective, the pharmacological effects of multi-component herbal medicine acting on multiple disease targets (8). A typical network pharmacology approach starts by constructing a network in which a node represents an herb, an herbal ingredient, or a target protein/gene, and an edge indicates either a link between an herb and its constituent compound or an interaction between a compound and its target (Fig. 1). By associating targets with biological functions or diseases, an herb-compound-target-function/disease network is constructed and leveraged to study the MOA of an herbal treatment. A protein-protein interaction (PPI) network can be further employed to interpret the synergistic effect of herbal medicine by analyzing interactions among target proteins (9, 10). Because this approach depends entirely on the constructed network, the reliability of the connections in the network (herb-compound, compound-target, and target-disease) is a significant factor influencing subsequent analysis. However, establishing reliable connections requires laborious tasks such as identifying herbal ingredients and their molecular targets (11). Thus, most network pharmacology studies only cover a fraction of herbal compounds quantified and rely on target annotations based on
One emerging alternative to address these issues is leveraging drug-induced transcriptome data. The gene expression profile induced by herbal treatment reflects the genome-wide effects of multi-component herbs in a given biological system, thereby providing comprehensive and reliable associations between herbs and genes (Fig. 1) (12). The systematic use of drug-induced transcriptome data was first introduced by the connectivity map (CMap) (13). CMap currently provides large-scale gene expression profiles before and after treatment with ∼33,000 small molecule compounds in 230 human cell lines and periodically releases additional data (14). It has been widely utilized to retrieve drug repositioning candidates and to elucidate the MOAs of drugs in modern and herbal medicine (15, 16). Following a similar concept, drug-induced transcriptome data from the MCF-7 breast cancer cell line treated with each of 102 TCM components (TCM102) have been published (17), enabling researchers to explore the activity of TCM ingredients at the molecular and cellular levels (18, 19). As more herbal medicine studies using high-throughput transcriptome profiling have been published, a growing number of gene expression data sets of herbs and ingredients generated in various experimental settings have accumulated. To integrate these data, Fang and colleagues collected transcriptome data sets of 20 herbs and 152 ingredients and built an organized database, HERB, a high-throughput experiment- and reference-guided database of TCM (20). As demonstrated by the increasing demand for these key resources, analyzing transcriptome data of herbal treatment can efficiently uncover novel associations between herbal medicine and modern drugs, genes, and diseases, which in turn can encourage the application and development of herbal medicine.
Systems pharmacology approaches using network-based methods or drug-induced transcriptome data are playing an increasingly pivotal role in herbal medicine research. In this review, we summarize key databases and computational methods used in systems pharmacology for studying herbal medicine. We also highlight recent applications of pharmaco-transcriptomics in herbal medicine research.
Systems pharmacology approaches for herbal medicine usually start by integrating current knowledge on different types of data (including herbs, compounds, targets, pathways, and diseases) and organizing it into a network. Typically, a node in the network represents an herb, compound, target, pathway, or disease, and an edge represents an interaction between nodes. Numerous databases provide information on the nodes or edges, each covering data of different scope and evidence level. These databases can be divided into four types: herb-related (HRDB), compound-related (CRDB), target-related (TRDB), and disease-related (DRDB) databases (Table 1). Information on herbal properties and herbal ingredients can be obtained from HRDBs, such as TCMID (21), TCMSP (22), BATMAN-TCM (23), and SYMMAP (24). Most HRDBs provide not only information on herbal ingredients, but also ingredient-related targets, pathways, and diseases. For this reason, HRDBs have been the main resource for network pharmacology analysis of herbal medicine. CRDBs include databases for compounds, compound-target interactions (CTIs), and compound-induced transcriptome data. PubChem (25) and SwissADME (26) contain information on physicochemical descriptors, pharmacokinetic properties, and ADME (absorption, distribution, metabolism, and excretion) parameters of compounds, which can be used to calculate drug-likeness values of herbal compounds. SwissTargetPrediction (27), STITCH (28), and Therapeutic Target Database (TTD) (29) provide known or predicted CTI information. Although high-throughput targeted assays have been developed to screen for drug targets, identifying binding targets on a genome-wide scale remains a fairly arduous task, particularly when hundreds of herbal ingredients must be screened. Therefore, some CRDBs additionally provide CTIs predicted by machine learning (30) or by similarity (31) based on the structures of compounds and targets.
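The network organization described above can be illustrated with a minimal sketch. It assumes that herb-compound and compound-target pairs have already been exported from an HRDB or CRDB as simple tuples; the herb, compound, and gene names, as well as the function names `build_network` and `herb_targets`, are hypothetical placeholders and do not correspond to any database record or API.

```python
from collections import defaultdict

# Hypothetical toy edges (placeholders, not taken from any database).
herb_compound = [("herb_A", "cpd_1"), ("herb_A", "cpd_2"), ("herb_B", "cpd_2")]
compound_target = [("cpd_1", "GENE1"), ("cpd_2", "GENE2"), ("cpd_2", "GENE3")]


def build_network(herb_compound, compound_target):
    """Build an undirected adjacency map whose nodes are (type, name) pairs."""
    adjacency = defaultdict(set)
    for herb, compound in herb_compound:
        adjacency[("herb", herb)].add(("compound", compound))
        adjacency[("compound", compound)].add(("herb", herb))
    for compound, target in compound_target:
        adjacency[("compound", compound)].add(("target", target))
        adjacency[("target", target)].add(("compound", compound))
    return adjacency


def herb_targets(adjacency, herb):
    """Collect targets reachable from an herb through its compounds."""
    targets = set()
    for compound_node in adjacency.get(("herb", herb), set()):
        for node_type, name in adjacency.get(compound_node, set()):
            if node_type == "target":
                targets.add(name)
    return targets


network = build_network(herb_compound, compound_target)
for herb in ("herb_A", "herb_B"):
    print(herb, sorted(herb_targets(network, herb)))
```

Typing each node as a `(type, name)` pair keeps herbs, compounds, and targets in one graph while avoiding name clashes between layers; pathway or disease layers could be added in the same way.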
Among TRDBs, UniProt (32) and GeneCards (33) provide information on the sequences and functional roles of proteins/genes. KEGG (34), Reactome (35), and Gene Ontology (36) provide sets of genes classified by their biological functions. These databases are utilized to perform functional enrichment analysis, such as GSEA (gene set enrichment analysis) (37). In addition, STRING (38) and Human Protein Reference Database (39) provide known PPI information. PPI information can be integrated with CTI information to construct a target network in which potential drug targets are identified as modules of interacting proteins (40). DRDBs provide collections of genes and variants associated with diseases. They are also widely used because one of the ultimate goals of systems pharmacology is to predict and evaluate the therapeutic effects of drugs by exploring the relationship between drugs and diseases.
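As a hedged illustration of functional enrichment, the sketch below implements a simple over-representation test based on the hypergeometric distribution. This is a deliberately simpler stand-in and not GSEA itself, which uses a rank-based running-sum statistic. The function names and the toy gene identifiers are assumptions made for illustration only.

```python
from math import comb


def hypergeom_tail(n_background, n_gene_set, n_targets, n_overlap):
    """P(X >= n_overlap) when n_targets genes are drawn without replacement
    from n_background genes, n_gene_set of which belong to the gene set."""
    total = comb(n_background, n_targets)
    upper = min(n_gene_set, n_targets)
    tail = sum(
        comb(n_gene_set, k) * comb(n_background - n_gene_set, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_p(targets, gene_set, background):
    """Over-representation p-value of a gene set among predicted targets."""
    targets = set(targets) & set(background)
    gene_set = set(gene_set) & set(background)
    overlap = len(targets & gene_set)
    return hypergeom_tail(len(background), len(gene_set), len(targets), overlap)


# Toy example with placeholder gene identifiers.
background = {f"G{i}" for i in range(10)}
pathway = {"G0", "G1", "G2", "G3", "G4"}
herb_target_genes = {"G0", "G1", "G2", "G3", "G4"}
print(enrichment_p(herb_target_genes, pathway, background))
```

In practice, the p-values obtained for many gene sets would need correction for multiple testing before interpretation.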
With the introduction of polypharmacology, the paradigm of drug research has shifted from single-target to multi-target strategies, revealing the potential of multicomponent herbal medicines to treat a variety of multifactorial disorders (44-46). In line with this, various computational approaches have been introduced to identify targets, indications, and/or synergistic combinations of herbal medicines (Table 2).
Network-based methods have been most widely applied to predict potential targets of herbs or herbal formulae. For example, Wang
Several machine learning methods have also been employed to predict herb-target interactions (49-51). Wang
Active compounds stemming from medicinal herbs are appealing in modern drug development due to their high efficacies and low toxicities (2, 54). However, new therapeutic opportunities for numerous herbal compounds are yet to be identified. In this section, we will review a few studies applying state-of-the-art machine learning methods to prioritize therapeutically effective herbal compounds for several diseases (55-58).
Synergy is one of the major advantages of multi-component herbal medicine. Network-based strategies enable us to efficiently explore herb combinations and to better infer the mechanisms of synergistic action of herbs or herbal compounds. For example, Li
As another example, Wang
Herbal prescriptions are combinations of herbal formulas for treatment, meaning that various ingredients in these formulas have the potential to affect multiple genes and biological pathways. In recent decades, systems pharmacology studies applying transcriptome analysis after treatment with herbal medicines
Si-Wu-Tang (SWT) (Samul-tang in Korean, Shimotsu-to in Japanese) is one of the most popular herbal prescriptions consisting of four herbs including
Tao-Hong-Si-Wu Decoction (THSWD), a traditional herbal medicine, is composed of six herbs (
Compound Kushen Injection (CKI) is an approved Chinese patented drug in adjuvant treatment for chemotherapy. It consists of extracts of two herbs, Kushen (
Feifukang (FFK) is a pulmonary rehabilitation mixture comprising eight herbs for protecting lung function:
These individual studies on specific herbal formulae or single herbs have laid the groundwork for developing efficient strategies to systematically infer the MOA of herbal medicines at the molecular level, which may ultimately help rationalize and modernize herbal medicine.
Another great advantage of obtaining drug-induced transcriptome data of herbs/ingredients is that novel indications of herbal medicine can be rapidly screened computationally by a systems-based approach. A systems-based approach involves modulating a list of abnormally expressed genes in disease, in contrast to a traditional target-based approach which involves modulating the molecular state of one single protein (108). This approach was first designed and introduced to the public by CMap to link drugs and diseases (13). It defines a set of abnormally expressed genes in a disease, termed a disease signature, and queries it in the CMap reference database. It then searches for drugs that inversely regulate the expression of the disease signature, that is, those that decrease the expression of upregulated disease genes and increase the expression of downregulated disease genes. These drugs are considered candidates for reversing the diseased state back to the normal state.
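The query logic described above can be sketched as follows. This is a simplified sign-agreement score written for illustration, not the rank-based connectivity score that CMap actually computes; the function name `reversal_score`, the gene symbols, and the fold-change values are all hypothetical.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def reversal_score(disease_up, disease_down, drug_profile):
    """Score in [-1, 1]; -1 means the drug fully reverses the disease signature.

    drug_profile maps gene -> log fold change after treatment.
    Genes missing from the drug profile are ignored.
    """
    total, n_genes = 0, 0
    for gene in disease_up:
        if gene in drug_profile:
            total += sign(drug_profile[gene])   # down-regulation lowers the score
            n_genes += 1
    for gene in disease_down:
        if gene in drug_profile:
            total -= sign(drug_profile[gene])   # up-regulation lowers the score
            n_genes += 1
    return total / n_genes if n_genes else 0.0


# Hypothetical disease signature and treatment-induced profiles.
disease_up = {"A", "B"}
disease_down = {"C", "D"}
profiles = {
    "herb_X": {"A": -1.2, "B": -0.5, "C": 0.8, "D": 1.1},
    "herb_Y": {"A": 1.0, "B": 0.3, "C": -0.7, "D": 0.2},
}

# Most negative score first: strongest candidates for reversing the signature.
ranking = sorted(
    profiles, key=lambda name: reversal_score(disease_up, disease_down, profiles[name])
)
print(ranking)
```

Using only the direction of change makes the sketch easy to follow but discards effect sizes; rank- or weight-based scores retain that information.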
Several studies using this approach have demonstrated its applicability to drug repositioning of herbal compounds (109, 110). Luo
The TCM102 database is also widely used to discover new indications of herbal compounds based on the systems-based approach (18, 19). For example, Li
These systems-based approaches have mainly been conducted by screening compounds of interest in well-organized databases, such as CMap and TCM102. However, since these databases only contain data on small molecule compounds, herbs and herbal formulae are inevitably excluded from the screening. An expanded database that includes a variety of medicinal herbs would offer clues for identifying evidence-based connections between herbs and diseases, thereby spurring the application and development of herbal medicines.
The systems pharmacology approach is increasingly adopted and developed across a wide range of modern drug development processes to better understand the molecular MOAs of drugs in the human body. Although this approach also lends itself to herbal medicine research, its practical application and development there have been relatively slow. The main reason is that data on herbs and herbal medicines themselves are insufficient for systems pharmacology approaches to use directly. For modern drugs, by contrast, CMap alone provides drug-induced transcriptome data for ∼40,000 small molecules, and the Library of Integrated Network-Based Cellular Signatures (LINCS) project is continuously generating drug-related multi-omics data sets, including proteome, epigenome, and metabolome data, for a comprehensive understanding of drug MOA. The advantage of such large-scale data is that they can be rapidly used to repurpose existing drugs in urgent situations. For example, several approved drugs have been proposed as candidates for clinical intervention to combat rapidly emerging diseases such as COVID-19 through
This work has been supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2021R1C1C1003988) and the research program of the Korea Institute of Oriental Medicine (KIOM) (KSN2023120 and KSN2022240).
The authors have no conflicting interests.
Table 1. Public databases widely used in herbal medicine research
|Type||Database||Numbers of available data||Website or reference|
|Herb-related database||TCMID (version 2.0)||46,929 TAM prescriptions; 8,159 herbs/43,413 compounds|
|TCMSP (version 2.3)||501 herbs/13,144 compounds; 3,311 targets/837 diseases; 29,384 HC pairs/84,260 CT pairs; 2,387 TD pairs|
|SYMMAP (version 2.0)||698 herbs/26,035 compounds; 20,965 targets/14,086 diseases; 2,518 TAM symptoms; 1,148 MM symptoms|
|Compound-related database||PubChem (2021)||111 million compounds; 278 million substances; 295 million bioactivities|
|STITCH (v5.0)||430,000 compounds|
|CMap||33,000 compounds/230 cell lines||https://clue.io/|
1 cell line (MCF-7)
|HERB||7,263 herbs/28,212 compounds; 12,933 targets/49,258 phenotypes; 6,164 gene expression profiles|
|Target-related database||UniProt (2020.04)||292,000 proteins (190 million sequences)||https://www.uniprot.org/|
|KEGG (2022.03.24)||551 biological pathways||https://www.genome.jp/kegg/|
|Gene Ontology (2022.03.22)||7,838,790 gene sets involved in biological process, molecular function, and cellular components||http://geneontology.org/|
|STRING (v11.5)||24,584,628 proteins|
|Disease-related database||DisGeNet (v7.0)||21,671 genes/30,170 diseases; 1,134,942 gene-disease associations|
|OMIM (2022.05.02)||16,730 genes/6,378 phenotypes||https://www.omim.org/ (42)|
|Human Phenotype Ontology (2022.04)||4,791 genes/10,274 phenotypes||https://hpo.jax.org (43)|
TAM, traditional Asian medicine; MM, modern medicine; HC, herb-compound; CT, compound-target; TD, target-disease.
Table 2. Computational approaches for studying herbal medicine
|Reference||Prediction type||Method||Data sources utilized|
||Herb-target interactions||Ligand-target interaction prediction (30)||TCMSP (22), DrugBank (64), PharmGKB (65), TTD (29)|
||Herb-target interactions||Ligand-target interaction prediction (30)||TCMSP, DrugBank, CMap|
||Herb-target interactions||node2vec, KNN, SVM, RF, LR, DT, GBDT||HIT (66), Chinese pharmacopoeia, SIDER (67), MalaCards (68), DrugBank, SemMedDB (69), Zhou
||Herb-target interactions||GNN (71), meta relation-based attention mechanism||HeNetRW (72), YaTCM (73), TCMIP (74)|
||Herb-target interactions||BLM (53), SVM||DrugBank, TCMID (21), TCM-ID (75), KTKP (http://www.koreantk.com), KAMPO (http://kampo.ca/), ChemSpider (http://www.chemspider.com/)|
||Indications of herbal compounds||RWR, hierarchical clustering||OMIM (42), KTKP, MeSH, TCMID, TCMSP, TCM@Taiwan (76), TCM-ID, KAMPO|
||Indications of herbal compounds||RWR, DNN||MeSH, OMIM, KTKP, TCMID, COCONUT (77), FooDB (http://foodb.ca/), DrugBank, CTD (78), MATADOR (79), STITCH (28), TTD, BioGrid (80)|
||Indications of herbal compounds||LR, RF, SVM||DrugBank, OMIM, SIDER, OFFSIDES (81), STITCH, UniProt (82), DGIdb (83), HPO (84), DisGeNet (85), KTKP, TCMID, TCM-ID, KAMPO, BindingDB (86), MATADOR|
||Effective combination of herbs||Ligand-target interaction prediction (30)||DrugBank, TTD, TCMSP|
||Synergistic MOA of herbs||Network proximity measure (87)||TCMID, STITCH, Cheng
KNN, k-nearest neighbor; SVM, support vector machine; RF, random forest; LR, logistic regression; DT, decision tree; GBDT, gradient boosting decision tree; GNN, graph neural network; BLM, bipartite local model; RWR, random walk with restart; DNN, deep neural network; GTB, gradient tree boosting; OMIM, Online Mendelian Inheritance in Man; MOA, mechanism of action.