Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published on July 2, 2020

SYNERGxDB: an integrative pharmacogenomic portal to identify synergistic drug combinations for precision oncology.

Authors: Seo H, Tkachuk D, Ho C, Mammoliti A, Rezaie A, Madani Tonekaboni SA, Haibe-Kains B

Abstract: Drug-combination data portals have recently been introduced to mine huge amounts of pharmacological data with the aim of improving current chemotherapy strategies. However, these portals have only been investigated for isolated datasets, and molecular profiles of cancer cell lines are lacking. Here we developed a cloud-based pharmacogenomics portal called SYNERGxDB that integrates multiple high-throughput drug-combination studies with molecular and pharmacological profiles of a large panel of cancer cell lines. This portal enables the identification of synergistic drug combinations through harmonization and unified computational analysis. We integrated nine of the largest drug combination datasets from both academic groups and pharmaceutical companies, resulting in 22 507 unique drug combinations (1977 unique compounds) screened against 151 cancer cell lines. This data compendium includes metabolomics, gene expression, copy number and mutation profiles of the cancer cell lines. In addition, SYNERGxDB provides analytical tools to discover effective therapeutic combinations and predictive biomarkers across cancer, including specific types. Combining molecular and pharmacological profiles, we systematically explored the large space of univariate predictors of drug synergism. SYNERGxDB constitutes a comprehensive resource that opens new avenues of research for exploring the mechanism of action for drug synergy with the potential of identifying new treatment strategies for cancer patients.
Published on July 2, 2020

ASAP 2020 update: an open, scalable and interactive web-based portal for (single-cell) omics analyses.

Authors: David FPA, Litovchenko M, Deplancke B, Gardeux V

Abstract: Single-cell omics enables researchers to dissect biological systems at a resolution that was unthinkable just 10 years ago. However, this analytical revolution also triggered new demands in 'big data' management, forcing researchers to stay up to speed with increasingly complex analytical processes and rapidly evolving methods. To render these processes and approaches more accessible, we developed the web-based, collaborative portal ASAP (Automated Single-cell Analysis Portal). Our primary goal is thereby to democratize single-cell omics data analyses (scRNA-seq and more recently scATAC-seq). By taking advantage of a Docker system to enhance reproducibility, and novel bioinformatics approaches that were recently developed for improving scalability, ASAP meets challenging requirements set by recent cell atlasing efforts such as the Human (HCA) and Fly (FCA) Cell Atlas Projects. Specifically, ASAP can now handle datasets containing millions of cells, integrating intuitive tools that allow researchers to collaborate on the same project synchronously. ASAP tools are versioned, and researchers can create unique access IDs for storing complete analyses that can be reproduced or completed by others. Finally, ASAP does not require any installation and provides a full and modular single-cell RNA-seq analysis pipeline. ASAP is freely available at
Published on July 2, 2020

ToxicoDB: an integrated database to mine and visualize large-scale toxicogenomic datasets.

Authors: Nair SK, Eeles C, Ho C, Beri G, Yoo E, Tkachuk D, Tang A, Nijrabi P, Smirnov P, Seo H, Jennen D, Haibe-Kains B

Abstract: In the past few decades, major initiatives have been launched around the world to address chemical safety testing. These efforts aim to innovate and improve the efficacy of existing methods with the long-term goal of developing new risk assessment paradigms. The transcriptomic and toxicological profiling of mammalian cells has resulted in the creation of multiple toxicogenomic datasets and corresponding tools for analysis. To enable easy access and analysis of these valuable toxicogenomic data, we have developed ToxicoDB, a free and open cloud-based platform integrating data from large in vitro toxicogenomic studies, including gene expression profiles of primary human and rat hepatocytes treated with 231 potential toxicants. To efficiently mine these complex toxicogenomic data, ToxicoDB provides users with harmonized chemical annotations, time- and dose-dependent plots of compounds across datasets, as well as the toxicity-related pathway analysis. The data in ToxicoDB have been generated using our open-source R package, ToxicoGx. Altogether, ToxicoDB provides a streamlined process for mining highly organized, curated, and accessible toxicogenomic data that can be ultimately applied to preclinical toxicity studies and further our understanding of adverse outcomes.
Published on July 2, 2020

SIB Literature Services: RESTful customizable search engines in biomedical literature, enriched with automatically mapped biomedical concepts.

Authors: Gobeill J, Caucheteur D, Michel PA, Mottin L, Pasche E, Ruch P

Abstract: Thanks to recent efforts by the text mining community, biocurators have now access to plenty of good tools and Web interfaces for identifying and visualizing biomedical entities in literature. Yet, many of these systems start with a PubMed query, which is limited by strong Boolean constraints. Some semantic search engines exploit entities for Information Retrieval, and/or deliver relevance-based ranked results. Yet, they are not designed for supporting a specific curation workflow, and allow very limited control on the search process. The Swiss Institute of Bioinformatics Literature Services (SIBiLS) provide personalized Information Retrieval in the biological literature. Indeed, SIBiLS allow fully customizable search in semantically enriched contents, based on keywords and/or mapped biomedical entities from a growing set of standardized and legacy vocabularies. The services have been used and favourably evaluated to assist the curation of genes and gene products, by delivering customized literature triage engines to different curation teams. SIBiLS are freely accessible via REST APIs and are ready to empower any curation workflow, built on modern technologies scalable with big data: MongoDB and Elasticsearch. They cover MEDLINE and PubMed Central Open Access enriched by nearly 2 billion of mapped biomedical entities, and are daily updated.
Published on July 2, 2020

A Network Medicine Approach to Investigation and Population-based Validation of Disease Manifestations and Drug Repurposing for COVID-19.

Authors: Zhou Y, Hou Y, Shen J, Kallianpur A, Zein J, Culver DA, Farha S, Comhair S, Fiocchi C, Gack MU, Mehra R, Stappenbeck T, Chan T, Eng C, Jung JU, Jehi L, Erzurum S, Cheng F

Abstract: The global Coronavirus Disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to unprecedented social and economic consequences. The risk of morbidity and mortality due to COVID-19 increases dramatically in the presence of co-existing medical conditions while the underlying mechanisms remain unclear. Furthermore, there are no proven effective therapies for COVID-19. This study aims to identify SARS-CoV-2 pathogenesis, diseases manifestations, and COVID-19 therapies using network medicine methodologies along with clinical and multi-omics observations. We incorporate SARS-CoV-2 virus-host protein-protein interactions, transcriptomics, and proteomics into the human interactome. Network proximity measure revealed underlying pathogenesis for broad COVID-19-associated manifestations. Multi-modal analyses of single-cell RNA-sequencing data showed that co-expression of ACE2 and TMPRSS2 was elevated in absorptive enterocytes from the inflamed ileal tissues of Crohn's disease patients compared to uninflamed tissues, revealing shared pathobiology by COVID-19 and inflammatory bowel disease. Integrative analyses of metabolomics and transcriptomics (bulk and single-cell) data from asthma patients indicated that COVID-19 shared intermediate inflammatory endophenotypes with asthma (including IRAK3 and ADRB2). To prioritize potential treatment, we combined network-based prediction and propensity score (PS) matching observational study of 18,118 patients from a COVID-19 registry. We identified that melatonin (odds ratio (OR) = 0.36, 95% confidence interval (CI) 0.22-0.59) was associated with 64% reduced likelihood of a positive laboratory test result for SARS-CoV-2. 
Using PS-matching user active comparator design, melatonin was associated with 54% reduced likelihood of SARS-CoV-2 positive test result compared to angiotensin II receptor blockers or angiotensin-converting enzyme inhibitors (OR = 0.46, 95% CI 0.24-0.86).
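The percentages quoted in the abstract above follow from the reported odds ratios: an odds ratio (OR) below 1 corresponds to a reduction in odds of (1 - OR) x 100%. A minimal sketch of that arithmetic follows. The helper name is hypothetical and not from the paper, and strictly speaking the result is a reduction in odds, which the abstract phrases as reduced likelihood.

```python
def percent_reduction_in_odds(odds_ratio: float) -> float:
    """Percent reduction in odds implied by an odds ratio (meaningful for OR < 1)."""
    if odds_ratio < 0:
        raise ValueError("an odds ratio cannot be negative")
    return (1.0 - odds_ratio) * 100.0


# Odds ratios as reported in the abstract; the labels are paraphrases.
reported = [
    ("melatonin, PS-matched analysis", 0.36),
    ("melatonin vs. ARB/ACE inhibitor users", 0.46),
]

for label, odds_ratio in reported:
    reduction = percent_reduction_in_odds(odds_ratio)
    print(f"{label}: OR = {odds_ratio:.2f}, about {reduction:.0f}% lower odds")
```

The same conversion applied to the confidence limits (for example 0.22 and 0.59) would give the corresponding range of reductions.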
Published on July 1, 2020

Identification of a druggable binding pocket in the spike protein reveals a key site for existing drugs potentially capable of combating Covid-19 infectivity.

Authors: Drew ED, Janes RW

Abstract: BACKGROUND: Following the recent outbreak of the new coronavirus pandemic (Covid-19), the rapid determination of the structure of the homo-trimeric spike glycoprotein has prompted the study reported here. The aims were to identify potential "druggable" binding pockets in the protein and, if located, to virtual screen pharmaceutical agents currently in use for predicted affinity to these pockets which might be useful to restrict, reduce, or inhibit the infectivity of the virion. RESULTS: Our analyses of this structure have revealed a key potentially druggable pocket where it might be viable to bind pharmaceutical agents to inhibit its ability to infect human cells. This pocket is found at the inter-chain interface that exists between two domains prior to the virion binding to human Angiotensin Converting Enzyme 2 (ACE2) protein. One of these domains is the highly mobile receptor binding domain, which must move into position to interact with ACE2, which is an essential feature for viral entry to the host cell. Virtual screening with a library of purchasable drug molecules has identified pharmaceuticals currently in use as prescription and over the counter medications that, in silico, readily bind into this pocket. CONCLUSIONS: This study highlights possible drugs already in use as pharmaceuticals that may act as agents to interfere with the movements of the domains within this protein essential for the infectivity processes and hence might slow, or even halt, the infection of host cells by this new coronavirus. As these are existing pharmaceuticals already approved for use in humans, this knowledge could accelerate their roll-out, through repurposing, for affected individuals and help guide the efforts of other researchers in finding effective treatments for the disease.
Published on July 1, 2020

PathWalks: identifying pathway communities using a disease-related map of integrated information.

Authors: Karatzas E, Zachariou M, Bourdakou MM, Minadakis G, Oulas A, Kolios G, Delis A, Spyrou GM

Abstract: MOTIVATION: Understanding the underlying biological mechanisms and respective interactions of a disease remains an elusive, time consuming and costly task. Computational methodologies that propose pathway/mechanism communities and reveal respective relationships can be of great value as they can help expedite the process of identifying how perturbations in a single pathway can affect other pathways. RESULTS: We present a random-walks-based methodology called PathWalks, where a walker crosses a pathway-to-pathway network under the guidance of a disease-related map. The latter is a gene network that we construct by integrating multi-source information regarding a specific disease. The most frequent trajectories highlight communities of pathways that are expected to be strongly related to the disease under study. We apply the PathWalks methodology on Alzheimer's disease and idiopathic pulmonary fibrosis and establish that it can highlight pathways that are also identified by other pathway analysis tools as well as are backed through bibliographic references. More importantly, PathWalks produces additional new pathways that are functionally connected with those already established, giving insight for further experimentation. AVAILABILITY AND IMPLEMENTATION: SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
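As a rough illustration of the idea described in the abstract above (a walker moving over a pathway-to-pathway network while being steered by disease-related information), the following is a minimal sketch and not the authors' PathWalks implementation. The toy network, the per-pathway guidance scores, and the function name are all hypothetical. In particular, collapsing the disease-related gene map into a single score per pathway is a simplifying assumption made here.

```python
import random
from collections import Counter

# Hypothetical pathway-to-pathway network: each pathway lists its neighbours.
PATHWAY_EDGES = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

# Hypothetical guidance scores standing in for the disease-related map:
# a higher score makes a pathway more likely to be visited next.
GUIDANCE = {"A": 5.0, "B": 1.0, "C": 4.0, "D": 0.5}


def guided_walk(edges, guidance, start, steps, seed=0):
    """Run a weighted random walk and count how often each edge is traversed."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter()
    current = start
    for _ in range(steps):
        neighbours = edges[current]
        weights = [guidance[n] for n in neighbours]
        nxt = rng.choices(neighbours, weights=weights, k=1)[0]
        # Treat edges as undirected when counting traversals.
        counts[frozenset((current, nxt))] += 1
        current = nxt
    return counts


edge_counts = guided_walk(PATHWAY_EDGES, GUIDANCE, start="A", steps=1000)
for edge, count in edge_counts.most_common():
    print(sorted(edge), count)
```

In this toy setting, the most frequently traversed edges play the role of the "most frequent trajectories" mentioned in the abstract; a real analysis would derive the weights from an integrated gene network rather than from hand-set scores.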
Published on July 1, 2020

The My Cancer Genome clinical trial data model and trial curation workflow.

Authors: Jain N, Mittendorf KF, Holt M, Lenoue-Newton M, Maurer I, Miller C, Stachowiak M, Botyrius M, Cole J, Micheel C, Levy M

Abstract: OBJECTIVE: As clinical trials evolve in complexity, clinical trial data models that can capture relevant trial data in meaningful, structured annotations and computable forms are needed to support accrual. MATERIAL AND METHODS: We have developed a clinical trial information model, curation information system, and a standard operating procedure for consistent and accurate annotation of cancer clinical trials. Clinical trial documents are pulled into the curation system from publicly available sources. Using a web-based interface, a curator creates structured assertions related to disease-biomarker eligibility criteria, therapeutic context, and treatment cohorts by leveraging our data model features. These structured assertions are published on the My Cancer Genome (MCG) website. RESULTS: To date, over 5000 oncology trials have been manually curated. All trial assertion data are available for public view on the MCG website. Querying our structured knowledge base, we performed a landscape analysis to assess the top diseases, biomarker alterations, and drugs featured across all cancer trials. DISCUSSION: Beyond curating commonly captured elements, such as disease and biomarker eligibility criteria, we have expanded our model to support the curation of trial interventions and therapeutic context (ie, neoadjuvant, metastatic, etc.), and the respective biomarker-disease treatment cohorts. To the best of our knowledge, this is the first effort to capture these fields in a structured format. CONCLUSION: This paper makes a significant contribution to the field of biomedical informatics and knowledge dissemination for precision oncology via the MCG website. KEY WORDS: knowledge representation, My Cancer Genome, precision oncology, knowledge curation, cancer informatics, clinical trial data model.
Published on July 1, 2020

Combining phenome-driven drug-target interaction prediction with patients' electronic health records-based clinical corroboration toward drug discovery.

Authors: Zhou M, Zheng C, Xu R

Abstract: MOTIVATION: Predicting drug-target interactions (DTIs) using human phenotypic data have the potential in eliminating the translational gap between animal experiments and clinical outcomes in humans. One challenge in human phenome-driven DTI predictions is integrating and modeling diverse drug and disease phenotypic relationships. Leveraging large amounts of clinical observed phenotypes of drugs and diseases and electronic health records (EHRs) of 72 million patients, we developed a novel integrated computational drug discovery approach by seamlessly combining DTI prediction and clinical corroboration. RESULTS: We developed a network-based DTI prediction system (TargetPredict) by modeling 855 904 phenotypic and genetic relationships among 1430 drugs, 4251 side effects, 1059 diseases and 17 860 genes. We systematically evaluated TargetPredict in de novo cross-validation and compared it to a state-of-the-art phenome-driven DTI prediction approach. We applied TargetPredict in identifying novel repositioned candidate drugs for Alzheimer's disease (AD), a disease affecting over 5.8 million people in the United States. We evaluated the clinical efficiency of top repositioned drug candidates using EHRs of over 72 million patients. The area under the receiver operating characteristic (ROC) curve was 0.97 in the de novo cross-validation when evaluated using 910 drugs. TargetPredict outperformed a state-of-the-art phenome-driven DTI prediction system as measured by precision-recall curves [measured by average precision (MAP): 0.28 versus 0.23, P-value < 0.0001]. The EHR-based case-control studies identified that the prescriptions of top-ranked repositioned drugs are significantly associated with lower odds of AD diagnosis. For example, we showed that the prescription of liraglutide, a type 2 diabetes drug, is significantly associated with decreased risk of AD diagnosis [adjusted odds ratios (AORs): 0.76; 95% confidence intervals (CI) (0.70, 0.82), P-value < 0.0001]. 
In summary, our integrated approach that seamlessly combines computational DTI prediction and large-scale patients' EHRs-based clinical corroboration has high potential in rapidly identifying novel drug targets and drug candidates for complex diseases. AVAILABILITY AND IMPLEMENTATION:
Published on July 1, 2020

Network-principled deep generative models for designing drug combinations as graph sets.

Authors: Karimi M, Hasanzadeh A, Shen Y

Abstract: MOTIVATION: Combination therapy has shown to improve therapeutic efficacy while reducing side effects. Importantly, it has become an indispensable strategy to overcome resistance in antibiotics, antimicrobials and anticancer drugs. Facing enormous chemical space and unclear design principles for small-molecule combinations, computational drug-combination design has not seen generative models to meet its potential to accelerate resistance-overcoming drug combination discovery. RESULTS: We have developed the first deep generative model for drug combination design, by jointly embedding graph-structured domain knowledge and iteratively training a reinforcement learning-based chemical graph-set designer. First, we have developed hierarchical variational graph auto-encoders trained end-to-end to jointly embed gene-gene, gene-disease and disease-disease networks. Novel attentional pooling is introduced here for learning disease representations from associated genes' representations. Second, targeting diseases in learned representations, we have recast the drug-combination design problem as graph-set generation and developed a deep learning-based model with novel rewards. Specifically, besides chemical validity rewards, we have introduced novel generative adversarial award, being generalized sliced Wasserstein, for chemically diverse molecules with distributions similar to known drugs. We have also designed a network principle-based reward for disease-specific drug combinations. Numerical results indicate that, compared to state-of-the-art graph embedding methods, hierarchical variational graph auto-encoder learns more informative and generalizable disease representations. Results also show that the deep generative models generate drug combinations following the principle across diseases. Case studies on four diseases show that network-principled drug combinations tend to have low toxicity. 
The generated drug combinations collectively cover the disease module similar to FDA-approved drug combinations and could potentially suggest novel systems pharmacology strategies. Our method allows for examining and following network-based principle or hypothesis to efficiently generate disease-specific drug combinations in a vast chemical combinatorial space. AVAILABILITY AND IMPLEMENTATION: SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.