Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published on April 13, 2017

Predicting neurological Adverse Drug Reactions based on biological, chemical and phenotypic properties of drugs using machine learning models.

Authors: Jamal S, Goyal S, Shanker A, Grover A

Abstract: Adverse drug reactions (ADRs) have become one of the primary reasons for the failure of drugs and a leading cause of deaths. Owing to the severe effects of ADRs, there is an urgent need for effective models that can accurately predict ADRs during the early stages of drug development by integrating various features of drugs. In the current study, we have focused on neurological ADRs and have used various properties of drugs, including biological properties (targets, transporters and enzymes), chemical properties (substructure fingerprints), phenotypic properties (side effects (SE) and therapeutic indications) and combinations of two and three levels of features. We employed a relief-based feature selection technique to identify relevant properties and used a machine learning approach to generate learned model systems that would predict neurological ADRs prior to preclinical testing. Additionally, in order to explain the efficiency and applicability of the models, we tested them to predict the ADRs for already existing anti-Alzheimer drugs and uncharacterized drugs, respectively, in the Side Effect Resource (SIDER) database. Our results showed that the models based on chemical properties (accuracy 93.20%), phenotypic properties (accuracy 92.41%) and the combination of three properties (accuracy 94.18%) were highly accurate, while the models based on biological properties (accuracy 82.11%) were highly informative.
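The abstract mentions relief-based feature selection without naming the exact variant. As a rough illustration only, the following sketch implements the basic Relief idea for a two-class problem: a feature gains weight when it differs between an instance and its nearest neighbour of the other class (nearest miss), and loses weight when it differs from its nearest neighbour of the same class (nearest hit). The function name `relief_weights`, the toy data, and the assumption that features are scaled to [0, 1] are all hypothetical and not taken from the paper.

```python
def relief_weights(X, y):
    """Basic Relief feature weights for features scaled to [0, 1].

    X: list of feature vectors; y: list of class labels.
    Illustrative sketch only, not the authors' implementation.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two feature vectors
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot score an instance without both neighbours
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward separation from the nearest miss, penalise it from the nearest hit
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return weights


# Hypothetical toy data: feature 0 tracks the label, feature 1 is noise
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
w = relief_weights(X, y)
```

Features with the highest weights would then be kept as input to a classifier; how many to keep is a separate tuning decision.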
Published on April 7, 2017

The Drug Repurposing Hub: a next-generation drug library and information resource.

Authors: Corsello SM, Bittker JA, Liu Z, Gould J, McCarren P, Hirschman JE, Johnston SE, Vrcic A, Wong B, Khan M, Asiedu J, Narayan R, Mader CC, Subramanian A, Golub TR

Abstract: 
Published on April 4, 2017

VLDL/LDL acts as a drug carrier and regulates the transport and metabolism of drugs in the body.

Authors: Yamamoto H, Takada T, Yamanashi Y, Ogura M, Masuo Y, Harada-Shiba M, Suzuki H

Abstract: Only free drugs have been believed to be carried into tissues through active or passive transport. However, considering that lipoproteins function as carriers of serum lipids such as cholesterol and triglycerides, we hypothesized that lipoproteins can associate with certain drugs and mediate their transport into tissues in lipid-associated form. Here, in vitro and in vivo studies with low density lipoprotein receptor (LDLR)-overexpressing or -knockdown cells and wild-type or LDLR-mutant mice were used to show the association of various drugs with lipoproteins and the uptake of lipoprotein-associated drugs through a lipoprotein receptor-mediated process. In clinical studies, investigation of the effect of lipoprotein apheresis on serum drug concentrations in patients with familial hypercholesterolemia demonstrated that lipoprotein-mediated drug transport occurs in humans as well as in mice. These findings represent a new concept regarding the transport and metabolism of drugs in the body and suggest that the role of lipoprotein-mediated drug transport should be considered when developing effective and safe pharmacotherapies.
Published in March 2017

Use of Graph Database for the Integration of Heterogeneous Biological Data.

Authors: Yoon BH, Kim SK, Kim SY

Abstract: Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.
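To make the contrast between multiple-join queries and native relationship traversal concrete, here is a small sketch. It does not use MySQL or Neo4j: the standard-library `sqlite3` module stands in for a relational store, and a plain adjacency dictionary stands in for a graph store. The edge data, relationship names, and function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical toy relationships (not from the paper)
EDGES = [
    ("drugA", "targets", "P1"),
    ("P1", "interacts_with", "P2"),
    ("P2", "associated_with", "diseaseX"),
    ("drugB", "targets", "P3"),
]


def diseases_via_sql(drug):
    """Relational style: each hop in the path needs another self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, rel TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", EDGES)
    rows = conn.execute(
        """
        SELECT e3.dst
        FROM edge e1
        JOIN edge e2 ON e2.src = e1.dst AND e2.rel = 'interacts_with'
        JOIN edge e3 ON e3.src = e2.dst AND e3.rel = 'associated_with'
        WHERE e1.src = ? AND e1.rel = 'targets'
        """,
        (drug,),
    ).fetchall()
    conn.close()
    return sorted(row[0] for row in rows)


def diseases_via_graph(drug):
    """Graph style: follow stored relationships hop by hop."""
    adj = defaultdict(list)
    for src, rel, dst in EDGES:
        adj[(src, rel)].append(dst)
    found = set()
    for protein in adj.get((drug, "targets"), []):
        for partner in adj.get((protein, "interacts_with"), []):
            found.update(adj.get((partner, "associated_with"), []))
    return sorted(found)
```

Both functions answer the same three-hop question (drug, target, interacting protein, disease). The sketch only shows the difference in query shape and says nothing about the performance figures reported in the abstract.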
Published in March 2017

CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice.

Authors: Soysal E, Lee HJ, Zhang Y, Huang LC, Chen X, Wei Q, Zheng W, Chang JT, Cohen T, Sun J, Xu H

Abstract: Despite the existence of various databases cataloging cancer drugs, there is an emerging need to support the development and application of personalized therapies, where an integrated understanding of the clinical factors and drug mechanism of action and its gene targets is necessary. We have developed CATTLE (CAncer Treatment Treasury with Linked Evidence), a comprehensive cancer drug knowledge base providing information across the complete spectrum of the drug life cycle. The CATTLE system collects relevant data from 22 heterogeneous databases, integrates them into a unified model centralized on drugs, and presents comprehensive drug information via an interactive web portal with a download function. A total of 2,323 unique cancer drugs are currently linked to rich information from these databases in CATTLE. Through two use cases, we demonstrate that CATTLE can be used in supporting both research and practice in personalized oncology.
Published in March 2017

Computational chemistry at Janssen.

Authors: van Vlijmen H, Desjarlais RL, Mirzadegan T

Abstract: Computer-aided drug discovery activities at Janssen are carried out by scientists in the Computational Chemistry group of the Discovery Sciences organization. This perspective gives an overview of the organizational and operational structure, the science, internal and external collaborations, and the impact of the group on Drug Discovery at Janssen.
Published in March 2017

Impact of germline and somatic missense variations on drug binding sites.

Authors: Yan C, Pattabiraman N, Goecks J, Lam P, Nayak A, Pan Y, Torcivia-Rodriguez J, Voskanian A, Wan Q, Mazumder R

Abstract: Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated in conjunction with exome or whole-genome sequencing can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from Protein Data Bank (PDB) followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of these alter protein-drug binding sites. Using this method we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding sites data with both germline and somatic nsSNVs data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. In the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. 
Information on protein-drug binding, predicted drug target proteins, and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine treatment. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.
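The mapping step described above, checking whether a patient's variants fall on residues that form a drug binding site, can be sketched as a simple lookup. The table contents, identifiers, and the function name `affected_drugs` are hypothetical; this is not the DrugVar schema or API.

```python
# Hypothetical binding-site table: (protein ID, residue position) -> drugs bound there
BINDING_SITES = {
    ("PROT1", 45): {"drugA"},
    ("PROT1", 102): {"drugA", "drugB"},
    ("PROT2", 7): {"drugC"},
}


def affected_drugs(variants):
    """Return the variants that hit a binding-site residue and the drugs bound there.

    variants: iterable of (protein_id, residue_position) pairs for one patient.
    """
    hits = {}
    for protein, position in variants:
        drugs = BINDING_SITES.get((protein, position))
        if drugs:
            hits[(protein, position)] = sorted(drugs)
    return hits


# Hypothetical patient with one variant inside a binding site and one outside
patient_variants = [("PROT1", 102), ("PROT1", 50)]
result = affected_drugs(patient_variants)
```

A real pipeline would also need to reconcile genomic coordinates with protein residue numbering before this lookup, which the sketch leaves out.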
Published in March 2017

IODNE: An integrated optimization method for identifying the deregulated subnetwork for precision medicine in cancer.

Authors: Mounika Inavolu S, Renbarger J, Radovich M, Vasudevaraja V, Kinnebrew GH, Zhang S, Cheng L

Abstract: Subnetwork analysis can explore complex patterns of entire molecular pathways for the purpose of drug target identification. In this article, the gene expression profiles of a cohort of patients with breast cancer are integrated with protein-protein interaction (PPI) networks using, simultaneously, both edge scoring and node scoring. A novel optimization algorithm, integrated optimization method to identify deregulated subnetwork (IODNE), is developed to search for the optimal dysregulated subnetwork of the merged gene and protein network. IODNE is applied to select subnetworks for Luminal-A breast cancer from The Cancer Genome Atlas (TCGA) data. A large fraction of cancer-related genes and the well-known clinical targets, ER1/PR and HER2, are found by IODNE. This validates the utility of IODNE. When applying IODNE to the triple-negative breast cancer (TNBC) subtype data, we identified subnetworks that contain genes such as ERBB2, HRAS, PGR, CAD, POLE, and SLC2A1.
Published in March 2017

Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

Authors: Biggs MB, Papin JA

Abstract: Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.
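The abstract does not state how predictions from the ensemble are combined. One plausible aggregation rule, shown here purely as an assumption, is a vote across the member networks: a gene is called essential when more than a chosen fraction of networks predict it as essential. The function name `ensemble_essential`, the `threshold` parameter, and the gene labels are hypothetical.

```python
def ensemble_essential(predictions, threshold=0.5):
    """Combine per-network essentiality calls by voting.

    predictions: list of sets, one per network in the ensemble, each holding
    the genes that network predicts to be essential.
    Returns the genes predicted essential by more than `threshold` of networks.
    """
    if not predictions:
        return set()
    counts = {}
    for genes in predictions:
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    n = len(predictions)
    return {gene for gene, count in counts.items() if count / n > threshold}


# Hypothetical calls from three alternative draft networks
calls = [{"g1", "g2"}, {"g1"}, {"g1", "g3"}]
consensus = ensemble_essential(calls)
```

Raising `threshold` makes the consensus call stricter, and lowering it makes the call more inclusive.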
Published on March 30, 2017

HIVed, a knowledgebase for differentially expressed human genes and proteins during HIV infection, replication and latency.

Authors: Li C, Ramarathinam SH, Revote J, Khoury G, Song J, Purcell AW

Abstract: Measuring the altered gene expression level and identifying differentially expressed genes/proteins during HIV infection, replication and latency is fundamental for broadening our understanding of the mechanisms of HIV infection and T-cell dysfunction. Such studies are crucial for developing effective strategies for virus eradication from the body. Inspired by the availability and enrichment of gene expression data during HIV infection, replication and latency, in this study, we proposed a novel compendium termed HIVed (HIV expression database; http://hivlatency.erc.monash.edu/) that harbours comprehensive functional annotations of proteins, whose genes have been shown to be dysregulated during HIV infection, replication and latency using different experimental designs and measurements. We manually curated a variety of third-party databases for structural and functional annotations of the protein entries in HIVed. With the goal of benefiting HIV related research, we collected a number of biological annotations for all the entries in HIVed besides their expression profile, including basic protein information, Gene Ontology terms, secondary structure, HIV-1 interaction and pathway information. We hope this comprehensive protein-centric knowledgebase can bridge the gap between the understanding of differentially expressed genes and the functions of their protein products, facilitating the generation of novel hypotheses and treatment strategies to fight against the HIV pandemic.