Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published on May 13, 2022

An inductive graph neural network model for compound-protein interaction prediction based on a homogeneous graph.

Authors: Wan X, Wu X, Wang D, Tan X, Liu X, Fu Z, Jiang H, Zheng M, Li X

Abstract: Identifying the potential compound-protein interactions (CPIs) plays an essential role in drug development. The computational approaches for CPI prediction can reduce time and costs of experimental methods and have benefited from the continuously improved graph representation learning. However, most of the network-based methods use heterogeneous graphs, which are challenging to model because of their complex structures and heterogeneous attributes. Therefore, in this work, we transformed the compound-protein heterogeneous graph to a homogeneous graph by integrating the ligand-based protein representations and overall similarity associations. We then proposed an Inductive Graph AggrEgator-based framework, named CPI-IGAE, for CPI prediction. CPI-IGAE learns the low-dimensional representations of compounds and proteins from the homogeneous graph in an end-to-end manner. The results show that CPI-IGAE performs better than some state-of-the-art methods. Further ablation study and visualization of embeddings reveal the advantages of the model architecture and its role in feature extraction, and some of the top-ranked CPIs by CPI-IGAE have been validated by a review of recent literature. The data and source code are available at https://github.com/wanxiaozhe/CPI-IGAE.
Published on May 13, 2022

Design and application of a knowledge network for automatic prioritization of drug mechanisms.

Authors: Mayers M, Tu R, Steinecke D, Li TS, Queralt-Rosinach N, Su AI

Abstract: MOTIVATION: Drug repositioning is an attractive alternative to de novo drug discovery due to reduced time and costs to bring drugs to market. Computational repositioning methods, particularly non-black-box methods that can account for and predict a drug's mechanism, may provide great benefit for directing future development. By tuning both data and algorithm to utilize relationships important to drug mechanisms, a computational repositioning algorithm can be trained to both predict and explain mechanistically novel indications. RESULTS: In this work, we examined the 123 curated drug mechanism paths found in the drug mechanism database (DrugMechDB) and, after identifying the most important relationships, we integrated 18 data sources to produce a heterogeneous knowledge graph, MechRepoNet, capable of capturing the information in these paths. We applied the Rephetio repurposing algorithm to MechRepoNet using only a subset of relationships known to be mechanistic in nature and found adequate predictive ability on an evaluation set with an AUROC value of 0.83. The resulting repurposing model allowed us to prioritize paths in our knowledge graph to produce a predicted treatment mechanism. We found that DrugMechDB paths, when present in the network, were rated highly among predicted mechanisms. We then demonstrated MechRepoNet's ability to use mechanistic insight to identify a drug's mechanistic target, with a mean reciprocal rank of 0.525 on a test set of known drug-target interactions. Finally, we walked through repurposing examples of the anti-cancer drug imatinib for use in the treatment of asthma, and metolazone for use in the treatment of osteoporosis, to demonstrate this method's utility in providing mechanistic insight into the repurposing predictions it provides. AVAILABILITY AND IMPLEMENTATION: The Python code to reproduce the entirety of this analysis is available at: https://github.com/SuLab/MechRepoNet (archived at https://doi.org/10.5281/zenodo.6456335).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
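
Note on the metric: the abstract above reports a mean reciprocal rank (MRR) for target identification. As a rough illustration of how that metric is generally computed, here is a minimal sketch. It is not taken from the MechRepoNet code; the function names and the toy drug and target identifiers are hypothetical, and the convention of scoring a missed target as 0 is an assumption.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over all (ranking, known targets) pairs."""
    queries = list(queries)
    if not queries:
        raise ValueError("at least one query is required")
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


# Hypothetical toy data: one ranked target list per drug, plus its known target.
toy_queries = [
    (["T1", "T2", "T3"], {"T1"}),  # known target ranked 1st -> 1.0
    (["T4", "T5", "T6"], {"T5"}),  # known target ranked 2nd -> 0.5
    (["T7", "T8", "T9"], {"T0"}),  # known target not ranked -> 0.0
]
mrr = mean_reciprocal_rank(toy_queries)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Read this way, an MRR near 0.5 is what you would see if the known target sat at rank 2 for every drug, although many different rank distributions produce the same average.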
Published on May 13, 2022

Gefitinib and fostamatinib target EGFR and SYK to attenuate silicosis: a multi-omics study with drug exploration.

Authors: Wang M, Zhang Z, Liu J, Song M, Zhang T, Chen Y, Hu H, Yang P, Li B, Song X, Pang J, Xing Y, Cao Z, Guo W, Yang H, Wang J, Yang J, Wang C

Abstract: Silicosis is the most prevalent and fatal occupational disease with no effective therapeutics, and currently used drugs cannot reverse the disease progression. Worse still, there are still challenges to be addressed to fully decipher the intricate pathogenesis. Thus, specifying the essential mechanisms and targets in silicosis progression and then exploring anti-silicosis pharmaceuticals are desperately needed. In this work, a multi-omics atlas was constructed to depict the pivotal abnormalities of silicosis and develop targeted agents. By utilizing an unbiased and time-resolved analysis of the transcriptome, proteome and phosphoproteome of a silicosis mouse model, we have verified the significant differences in transcript, protein, kinase activity and signaling pathway level during silicosis progression, in which the importance of essential biological processes such as macrophage activation, chemotaxis, immune cell recruitment and chronic inflammation was emphasized. Notably, the phosphorylation of EGFR (p-EGFR) and SYK (p-SYK) were identified as potential therapeutic targets in the progression of silicosis. To inhibit and validate these targets, we tested fostamatinib (targeting SYK) and gefitinib (targeting EGFR), and both drugs effectively ameliorated pulmonary dysfunction and inhibited the progression of inflammation and fibrosis. Overall, our drug discovery with a multi-omics approach provides novel and viable therapeutic strategies for the treatment of silicosis.
Published on May 13, 2022

Identification of Carcinogenesis and Tumor Progression Processes in Pancreatic Ductal Adenocarcinoma Using High-Throughput Proteomics.

Authors: Trilla-Fuertes L, Gamez-Pozo A, Lumbreras-Herrera MI, Lopez-Vacas R, Heredia-Soto V, Ghanem I, Lopez-Camacho E, Zapater-Moros A, Miguel M, Pena-Burgos EM, Palacios E, De Uribe M, Guerra L, Dittmann A, Mendiola M, Fresno Vara JA, Feliu J

Abstract: Pancreatic ductal adenocarcinoma (PDAC) is an aggressive disease with an overall 5-year survival rate of just 5%. A better understanding of the carcinogenesis processes and the mechanisms of the progression of PDAC is mandatory. Fifty-two PDAC patients treated with surgery and adjuvant therapy, with available primary tumors, normal tissue, preneoplastic lesions (PanIN), and/or lymph node metastases, were selected for the study. Proteins were extracted from small punches and analyzed by LC-MS/MS using data-independent acquisition. Proteomics data were analyzed using probabilistic graphical models, allowing functional characterization. Comparisons between groups were made using linear mixed models. Three proteomic tumor subtypes were defined. T1 (32% of patients) was related to adhesion, T2 (34%) had metabolic features, and T3 (34%) presented high splicing and nucleoplasm activity. These proteomics subtypes were validated in the PDAC TCGA cohort. Relevant biological processes related to carcinogenesis and tumor progression were studied in each subtype. Carcinogenesis in the T1 subtype seems to be related to an increase of adhesion and complement activation node activity, whereas tumor progression seems to be related to nucleoplasm and translation nodes. Regarding the T2 subtype, it seems that metabolism and, especially, mitochondria act as the motor of cancer development. T3 analyses point out that nucleoplasm, mitochondria and metabolism, and extracellular matrix nodes could be involved in T3 tumor carcinogenesis. The identified processes were different among proteomics subtypes, suggesting that the molecular motor of the disease is different in each subtype. These differences can have implications for the development of future tailored therapeutic approaches for each PDAC proteomics subtype.
Published on May 13, 2022

MSPEDTI: Prediction of Drug-Target Interactions via Molecular Structure with Protein Evolutionary Information.

Authors: Wang L, Wong L, Chen ZH, Hu J, Sun XF, Li Y, You ZH

Abstract: The key to new drug discovery and development is first and foremost the search for molecular targets of drugs, thus advancing drug discovery and drug repositioning. However, traditional identification of drug-target interactions (DTIs) is a costly, lengthy, high-risk, and low-success-rate systems project. Therefore, more and more pharmaceutical companies are trying to use computational technologies to screen existing drug molecules and mine new drugs, thereby accelerating new drug development. In the current study, we designed a deep learning computational model, MSPEDTI, based on Molecular Structure and Protein Evolutionary information to predict potential DTIs. The model first fuses protein evolutionary information and drug structure information, then uses a deep learning convolutional neural network (CNN) to mine their hidden features, and finally predicts the associated DTIs with an extreme learning machine (ELM). In cross-validation experiments, MSPEDTI achieved 94.19%, 90.95%, 87.95%, and 86.11% prediction accuracy on the gold-standard datasets enzymes, ion channels, G-protein-coupled receptors (GPCRs), and nuclear receptors, respectively. MSPEDTI showed its competitive ability in ablation experiments and comparison with previous excellent methods. Additionally, 7 of 10 potential DTIs predicted by MSPEDTI were substantiated by the classical database. These excellent outcomes demonstrate the ability of MSPEDTI to provide reliable drug candidate targets and strongly facilitate the development of drug repositioning and drug development.
Published on May 13, 2022

Proteogenomic characterization of 2002 human cancers reveals pan-cancer molecular subtypes and associated pathways.

Authors: Zhang Y, Chen F, Chandrashekar DS, Varambally S, Creighton CJ

Abstract: Mass-spectrometry-based proteomic data on human tumors, combined with corresponding multi-omics data, present opportunities for systematic and pan-cancer proteogenomic analyses. Here, we assemble a compendium dataset of proteomics data of 2002 primary tumors from 14 cancer types and 17 studies. Protein expression of genes broadly correlates with corresponding mRNA levels or copy number alterations (CNAs) across tumors, but with notable exceptions. Based on unsupervised clustering, tumors separate into 11 distinct proteome-based subtypes spanning multiple tissue-based cancer types. Two subtypes are enriched for brain tumors, one subtype associating with MYC, Wnt, and Hippo pathways and high CNA burden, and another subtype associating with metabolic pathways and low CNA burden. Somatic alteration of genes in a pathway associates with higher pathway activity as inferred by proteome or transcriptome data. A substantial fraction of cancers shows high MYC pathway activity without MYC copy gain but with mutations in genes with noncanonical roles in MYC. Our proteogenomics survey reveals the interplay between genome and proteome across tumor lineages.
Published on May 13, 2022

RoFDT: Identification of Drug-Target Interactions from Protein Sequence and Drug Molecular Structure Using Rotation Forest.

Authors: Wang Y, Wang L, Wong L, Zhao B, Su X, Li Y, You Z

Abstract: As the basis for screening drug candidates, the identification of drug-target interactions (DTIs) plays a crucial role in innovative drug research. However, due to the inherent constraints of small-scale and time-consuming wet experiments, DTI recognition is usually difficult to carry out. In the present study, we developed a computational approach called RoFDT to predict DTIs by combining feature-weighted Rotation Forest (FwRF) with protein sequence information. In particular, we first encode protein sequences as numerical matrices by Position-Specific Score Matrix (PSSM), then extract their features using Pseudo Position-Specific Score Matrix (PsePSSM) and combine them with drug structure information (molecular fingerprints), and finally feed them into the FwRF classifier and validate the performance of RoFDT on the Enzyme, GPCR, Ion Channel and Nuclear Receptor datasets. On these datasets, RoFDT achieved 91.68%, 84.72%, 88.11% and 78.33% accuracy, respectively. RoFDT shows excellent performance in comparison with support vector machine models and previous superior approaches. Furthermore, 7 of the top 10 DTIs with RoFDT estimate scores were proven by the relevant database. These results demonstrate that RoFDT can be employed as a powerful predictive approach for DTIs to provide theoretical support for innovative drug discovery.
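
Note on the feature step: the abstract above mentions PsePSSM without spelling it out. As commonly defined, PsePSSM turns a variable-length PSSM (one row per residue, one column per amino-acid type) into a fixed-length vector made of the column means followed by, for each lag, the mean squared difference between rows that lie that many positions apart. The sketch below illustrates that general definition only; it is not the RoFDT implementation. The function name and the toy two-column matrix are hypothetical (a real PSSM has 20 columns), and the normalization that published variants usually apply before this step is omitted here as a simplifying assumption.

```python
from typing import List, Sequence


def pse_pssm(pssm: Sequence[Sequence[float]], max_lag: int) -> List[float]:
    """Fixed-length PsePSSM-style features from an L x W score matrix.

    Output layout: W column means, then W lag features for each lag
    1..max_lag, for W * (1 + max_lag) values in total.
    """
    length = len(pssm)
    if length <= max_lag:
        raise ValueError("sequence must be longer than the largest lag")
    width = len(pssm[0])

    # Part 1: average score of each column over the whole sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(width)]

    # Part 2: for each lag, the mean squared difference between rows
    # i and i + lag, which keeps some sequence-order information.
    for lag in range(1, max_lag + 1):
        for j in range(width):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2 for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Hypothetical toy matrix: 4 residues, 2 columns.
toy_pssm = [[1, 0], [3, 0], [1, 0], [3, 0]]
features = pse_pssm(toy_pssm, max_lag=2)
# Column means [2.0, 0.0]; lag 1 gives [4.0, 0.0]; lag 2 gives [0.0, 0.0].
```

Because the output length depends only on the column count and `max_lag`, proteins of different lengths map to vectors of the same size, which is what a fixed-input classifier needs.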
Published on May 13, 2022

SynLethDB 2.0: a web-based knowledge graph database on synthetic lethality for novel anticancer drug discovery.

Authors: Wang J, Wu M, Huang X, Wang L, Zhang S, Liu H, Zheng J

Abstract: Two genes are synthetic lethal if mutations in both genes result in impaired cell viability, while mutation of either gene does not affect the cell survival. The potential usage of synthetic lethality (SL) in anticancer therapeutics has attracted many researchers to identify synthetic lethal gene pairs. To include newly identified SLs and more related knowledge, we present a new version of the SynLethDB database to facilitate the discovery of clinically relevant SLs. We extended the first version of SynLethDB database significantly by including new SLs identified through Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) screening, a knowledge graph about human SLs, a new web interface, etc. Over 16 000 new SLs and 26 types of other relationships have been added, encompassing relationships among 14 100 genes, 53 cancers, 1898 drugs, etc. Moreover, a brand-new web interface has been developed to include modules such as SL query by disease or compound, SL partner gene set enrichment analysis and knowledge graph browsing through a dynamic graph viewer. The data can be downloaded directly from the website or through the RESTful Application Programming Interfaces (APIs). Database URL: https://synlethdb.sist.shanghaitech.edu.cn/v2.
Published on May 13, 2022

D3AI-CoV: a deep learning platform for predicting drug targets and for virtual screening against COVID-19.

Authors: Yang Y, Zhou D, Zhang X, Shi Y, Han J, Zhou L, Wu L, Ma M, Li J, Peng S, Xu Z, Zhu W

Abstract: Target prediction and virtual screening are two powerful tools of computer-aided drug design. Target identification is of great significance for hit discovery, lead optimization, drug repurposing and elucidation of the mechanism. Virtual screening can improve the hit rate of drug screening to shorten the cycle of drug discovery and development. Therefore, target prediction and virtual screening are of great importance for developing highly effective drugs against COVID-19. Here we present D3AI-CoV, a platform for target prediction and virtual screening for the discovery of anti-COVID-19 drugs. The platform is composed of three newly developed deep learning-based models, i.e., the MultiDTI, MPNNs-CNN and MPNNs-CNN-R models. To compare the predictive performance of D3AI-CoV with other methods, an external test set, named Test-78, was prepared, which consists of 39 newly published independent active compounds and 39 inactive compounds from DrugBank. For target prediction, the areas under the receiver operating characteristic curves (AUCs) of the MultiDTI and MPNNs-CNN models are 0.93 and 0.91, respectively, whereas the AUCs of the other reported approaches range from 0.51 to 0.74. For virtual screening, the hit rate of D3AI-CoV is also better than that of other methods. D3AI-CoV is available for free as a web application at http://www.d3pharma.com/D3Targets-2019-nCoV/D3AI-CoV/index.php, which can serve as a rapid online tool for predicting potential targets for active compounds and for identifying active molecules against a specific target protein for COVID-19 treatment.
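
Note on the metric: the AUC values quoted above are measured on a set of active and inactive compounds. One standard way to read an AUC is as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one, with ties counted as half. The sketch below computes it that way by direct pairwise comparison. It is an illustration of the metric, not of the D3AI-CoV evaluation code; the function name and the toy scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active, 0 for inactive. Tied scores count as half a win.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for a in actives:
        for n in inactives:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical toy scores: two actives and two inactives.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ordered correctly = 0.75
```

On a balanced set such as 39 actives and 39 inactives there are 39 × 39 = 1521 such pairs, so the quadratic loop is cheap; a rank-based formula would be the usual choice for much larger sets.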
Published on May 12, 2022

Hierarchical network analysis of co-occurring bioentities in literature.

Authors: Yang H, Lee N, Park B, Park J, Lee J, Jang HS, Yoo H

Abstract: Biomedical databases grow by more than a thousand new publications every day. The large volume of biomedical literature that is being published at an unprecedented rate hinders the discovery of relevant knowledge from keywords of interest to gather new insights and form hypotheses. A text-mining tool, PubTator, helps to automatically annotate bioentities, such as species, chemicals, genes, and diseases, from PubMed abstracts and full-text articles. However, the manual re-organization and analysis of bioentities is a non-trivial and highly time-consuming task. ChexMix was designed to extract the unique identifiers of bioentities from query results. Herein, ChexMix was used to construct a taxonomic tree with allied species among Korean native plants and to extract the medical subject headings unique identifier of the bioentities, which co-occurred with the keywords in the same literature. ChexMix discovered the allied species related to a keyword of interest and experimentally proved its usefulness for multi-species analysis.