Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published in March - April 2014

Drug repurposing: mining protozoan proteomes for targets of known bioactive compounds.

Authors: Sateriale A, Bessoff K, Sarkar IN, Huston CD

Abstract: OBJECTIVE: To identify potential opportunities for drug repurposing by developing an automated approach to pre-screen the predicted proteomes of any organism against databases of known drug targets using only freely available resources. MATERIALS AND METHODS: We employed a combination of Ruby scripts that leverage data from the DrugBank and ChEMBL databases, MySQL, and BLAST to predict potential drugs and their targets from 13 published genomes. Results from a previous cell-based screen to identify inhibitors of Cryptosporidium parvum growth were used to validate our in-silico prediction method. RESULTS: In-vitro validation of these results, using a cell-based C parvum growth assay, showed that the predicted inhibitors were significantly more likely than expected by chance to have confirmed activity, with 8.9-15.6% of predicted inhibitors confirmed depending on the drug target database used. This method was then used to predict inhibitors for the following 13 disease-causing protozoan parasites: C parvum, Entamoeba histolytica, Giardia intestinalis, Leishmania braziliensis, Leishmania donovani, Leishmania major, Naegleria gruberi (as a proxy for Naegleria fowleri), Plasmodium falciparum, Plasmodium vivax, Toxoplasma gondii, Trichomonas vaginalis, Trypanosoma brucei and Trypanosoma cruzi. CONCLUSIONS: Although proteome-wide screens for drug targets have disadvantages, in-silico methods can be developed that are fast, broad, inexpensive, and effective. In-vitro validation of our results for C parvum indicates that the method presented here can be used to construct a library for more directed small molecule screening, or pipelined into structural modeling and docking programs to facilitate target-based drug development.
Published in March - April 2014

Mining clinical text for signals of adverse drug-drug interactions.

Authors: Iyer SV, Harpaz R, LePendu P, Bauer-Mehren A, Shah NH

Abstract: BACKGROUND AND OBJECTIVE: Electronic health records (EHRs) are increasingly being used to complement the FDA Adverse Event Reporting System (FAERS) and to enable active pharmacovigilance. Over 30% of all adverse drug reactions are caused by drug-drug interactions (DDIs) and result in significant morbidity every year, making their early identification vital. We present an approach for identifying DDI signals directly from the textual portion of EHRs. METHODS: We recognize mentions of drug and event concepts from over 50 million clinical notes from two sites to create a timeline of concept mentions for each patient. We then use adjusted disproportionality ratios to identify significant drug-drug-event associations among 1165 drugs and 14 adverse events. To validate our results, we evaluate our performance on a gold standard of 1698 DDIs curated from existing knowledge bases, as well as with signaling DDI associations directly from FAERS using established methods. RESULTS: Our method achieves good performance, as measured by our gold standard (area under the receiver operating characteristic (ROC) curve >80%), on two independent EHR datasets, and the performance is comparable to that of signaling DDIs from FAERS. We demonstrate the utility of our method for early detection of DDIs and for identifying alternatives for risky drug combinations. Finally, we publish a first-of-its-kind database of population event rates among patients on drug combinations based on an EHR corpus. CONCLUSIONS: It is feasible to identify DDI signals and estimate the rate of adverse events among patients on drug combinations, directly from clinical text; this could have utility in prioritizing drug interaction surveillance as well as in clinical decision support.
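Illustrative note: the abstract above does not say how its disproportionality ratios are adjusted, so the following is only a minimal sketch of the general idea, using a plain unadjusted odds ratio with a Wald confidence interval on a 2x2 table. The function name, the signal rule (lower bound above 1), and all counts are hypothetical and are not taken from the paper.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: patients on the drug pair who had the event
    b: patients on the drug pair who did not have the event
    c: patients not on the drug pair who had the event
    d: patients not on the drug pair who did not have the event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only
or_value, lower, upper = odds_ratio_with_ci(a=40, b=960, c=200, d=18800)

# Assumed signal rule: flag the association when the lower bound exceeds 1
is_signal = lower > 1.0
print(f"OR={or_value:.2f}, 95% CI=({lower:.2f}, {upper:.2f}), signal={is_signal}")
```

A real analysis would also have to adjust for confounders such as the individual drugs' own event rates, which this sketch deliberately leaves out.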
Published in March 2014

Semantic Modeling for SNPs Associated with Ethnic Disparities in HapMap Samples.

Authors: Kim H, Yoo WG, Park J, Kim H, Kang BC

Abstract: Single-nucleotide polymorphisms (SNPs) have been emerging out of the efforts to research human diseases and ethnic disparities. A semantic network is needed for in-depth understanding of the impacts of SNPs, because phenotypes are modulated by complex networks, including biochemical and physiological pathways. We identified ethnicity-specific SNPs by eliminating overlapped SNPs from HapMap samples, and the ethnicity-specific SNPs were mapped to the UCSC RefGene lists. Ethnicity-specific genes were identified as follows: 22 genes in the USA (CEU) individuals, 25 genes in the Japanese (JPT) individuals, and 332 genes in the African (YRI) individuals. To analyze the biologically functional implications for ethnicity-specific SNPs, we focused on constructing a semantic network model. Entities for the network, represented by "Gene," "Pathway," "Disease," "Chemical," "Drug," "ClinicalTrials," and "SNP," and the relationships between entities were obtained through curation. Our semantic modeling for ethnicity-specific SNPs showed interesting results in the three categories, including three diseases ("AIDS-associated nephropathy," "Hypertension," and "Pelvic infection"), one drug ("Methylphenidate"), and five pathways ("Hemostasis," "Systemic lupus erythematosus," "Prostate cancer," "Hepatitis C virus," and "Rheumatoid arthritis"). We found ethnicity-specific genes using the semantic modeling, and the majority of our findings were consistent with previous studies: an understanding of genetic variability explained ethnicity-specific disparities.
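Illustrative note: one way to read "eliminating overlapped SNPs" in the abstract above is as a set difference, keeping for each population only the SNPs that appear in no other population. That reading is an assumption, and the function name and SNP identifiers below are hypothetical placeholders, not HapMap data.

```python
def population_specific(snps_by_population):
    """Return, for each population, the SNP IDs found in no other population."""
    specific = {}
    for population, snps in snps_by_population.items():
        # Union of the SNP sets of every other population
        others = set().union(
            *(s for p, s in snps_by_population.items() if p != population)
        )
        specific[population] = set(snps) - others
    return specific

# Hypothetical SNP identifiers, for illustration only
snps_by_population = {
    "CEU": {"snp_1", "snp_2", "snp_3"},
    "JPT": {"snp_2", "snp_4"},
    "YRI": {"snp_3", "snp_5", "snp_6"},
}

result = population_specific(snps_by_population)
for population in sorted(result):
    print(population, sorted(result[population]))
```

The population-specific sets could then be mapped to genes, as the abstract describes doing with the UCSC RefGene lists.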
Published in March 2014

Multi-algorithm and multi-model based drug target prediction and web server.

Authors: Liu YT, Li Y, Huang ZF, Xu ZJ, Yang Z, Chen ZX, Chen KX, Shi JY, Zhu WL

Abstract: AIM: To develop a reliable computational approach for predicting potential drug targets based merely on protein sequence. METHODS: With drug target and non-target datasets prepared and 3 classification algorithms (Support Vector Machine, Neural Network and Decision Tree), a multi-algorithm and multi-model based strategy was employed for constructing models to predict potential drug targets. RESULTS: Twenty-one prediction models for each of the 3 algorithms were successfully developed. Our evaluation results showed that approximately 30% of human proteins were potential drug targets, and approximately 40% of putative targets for the drugs undergoing phase II clinical trials were probably non-targets. A public web server named D3TPredictor (http://www.d3pharma.com/d3tpredictor) was constructed to provide easy access. CONCLUSION: Reliable and robust drug target prediction based on protein sequences is achieved using the multi-algorithm and multi-model strategy.
Published in March - April 2014

Determining molecular predictors of adverse drug reactions with causality analysis based on structure learning.

Authors: Liu M, Cai R, Hu Y, Matheny ME, Sun J, Hu J, Xu H

Abstract: OBJECTIVE: Adverse drug reactions (ADRs) can have dire consequences. However, our current understanding of the causes of drug-induced toxicity is still limited. Hence it is of paramount importance to determine molecular factors of adverse drug responses so that safer therapies can be designed. METHODS: We propose a causality analysis model based on structure learning (CASTLE) for identifying factors that contribute significantly to ADRs from an integration of chemical and biological properties of drugs. This study aims to address two major limitations of the existing ADR prediction studies. First, ADR prediction is mostly performed by assessing the correlations between the input features and ADRs, and the identified associations may not indicate causal relations. Second, most predictive models lack biological interpretability. RESULTS: CASTLE was evaluated in terms of prediction accuracy on 12 organ-specific ADRs using 830 approved drugs. The prediction was carried out by first extracting causal features with structure learning and then applying them to a support vector machine (SVM) for classification. Through rigorous experimental analyses, we observed significant increases in both macro and micro F1 scores compared with the traditional SVM classifier, from 0.88 to 0.89 and 0.74 to 0.81, respectively. Most importantly, identified links between the biological factors and organ-specific drug toxicities were partially supported by evidence in Online Mendelian Inheritance in Man. CONCLUSIONS: The proposed CASTLE model not only performed better in prediction than the baseline SVM but also produced more interpretable results (ie, biological factors responsible for ADRs), which is critical to discovering molecular activators of ADRs.
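Illustrative note: the abstract above reports both macro and micro F1 scores. As a minimal sketch of how the two averages differ, the following computes them from per-label true positive, false positive and false negative counts. The label names and counts are hypothetical and do not reproduce the paper's data or its evaluation code.

```python
def f1(tp, fp, fn):
    """F1 score from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(counts):
    """counts maps each label to a (tp, fp, fn) tuple."""
    # Macro: average the per-label F1 scores, so every label weighs the same
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    # Micro: pool the counts first, so frequent labels dominate
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    return macro, micro

# Hypothetical counts for two organ-specific labels, for illustration only
counts = {
    "hepatic": (90, 10, 10),
    "renal": (5, 5, 15),
}

macro, micro = macro_micro_f1(counts)
print(f"macro F1={macro:.3f}, micro F1={micro:.3f}")
```

With these counts the rare, poorly predicted label pulls the macro average down much more than the micro average, which is why the two figures are usually reported together.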
Published on March 21, 2014

PeptiSite: a structural database of peptide binding sites in 4D.

Authors: Acharya C, Kufareva I, Ilatovskiy AV, Abagyan R

Abstract: We developed PeptiSite, a comprehensive and reliable database of biologically and structurally characterized peptide-binding sites, in which each site is represented by an ensemble of its complexes with protein, peptide and small molecule partners. The unique features of the database include: (1) the ensemble site representation that provides a fourth dimension to the otherwise three-dimensional data, (2) comprehensive characterization of the binding site architecture that may consist of a multimeric protein assembly with cofactors and metal ions and (3) analysis of consensus interaction motifs within the ensembles and identification of conserved determinants of these interactions. Currently, the database contains 585 proteins with 650 peptide-binding sites. The database at http://peptisite.ucsd.edu/ allows searching for the sites of interest and interactive visualization of the ensembles using the ActiveICM web-browser plugin. This structural database for protein-peptide interactions enables understanding of structural principles of these interactions and may assist the development of an efficient peptide docking benchmark.
Published on March 21, 2014

The characteristic direction: a geometrical approach to identify differentially expressed genes.

Authors: Clark NR, Hu KS, Feldmann AS, Kou Y, Chen EY, Duan Q, Ma'ayan A

Abstract: BACKGROUND: Identifying differentially expressed genes (DEG) is a fundamental step in studies that perform genome wide expression profiling. Typically, DEG are identified by univariate approaches such as Significance Analysis of Microarrays (SAM) or Linear Models for Microarray Data (LIMMA) for processing cDNA microarrays, and differential gene expression analysis based on the negative binomial distribution (DESeq) or Empirical analysis of Digital Gene Expression data in R (edgeR) for RNA-seq profiling. RESULTS: Here we present a new geometrical multivariate approach to identify DEG called the Characteristic Direction. We demonstrate that the Characteristic Direction method is significantly more sensitive than existing methods for identifying DEG in the context of transcription factor (TF) and drug perturbation responses over a large number of microarray experiments. We also benchmarked the Characteristic Direction method using synthetic data, as well as RNA-Seq data. A large collection of microarray expression data from TF perturbations (73 experiments) and drug perturbations (130 experiments) extracted from the Gene Expression Omnibus (GEO), as well as an RNA-Seq study that profiled genome-wide gene expression and STAT3 DNA binding in two subtypes of diffuse large B-cell Lymphoma, were used for benchmarking the method using real data. ChIP-Seq data identifying DNA binding sites of the perturbed TFs, as well as known drug targets of the perturbing drugs, were used as prior knowledge silver-standard for validation. In all cases the Characteristic Direction DEG calling method outperformed other methods. We find that when drugs are applied to cells in various contexts, the proteins that interact with the drug-targets are differentially expressed and more of the corresponding genes are discovered by the Characteristic Direction method. 
In addition, we show that the Characteristic Direction conceptualization can be used to perform improved gene set enrichment analyses when compared with the gene-set enrichment analysis (GSEA) and the hypergeometric test. CONCLUSIONS: The application of the Characteristic Direction method may shed new light on relevant biological mechanisms that would have remained undiscovered by the current state-of-the-art DEG methods. The method is freely accessible via various open source code implementations using four popular programming languages: R, Python, MATLAB and Mathematica, all available at: http://www.maayanlab.net/CD.
Published on March 15, 2014

The functional therapeutic chemical classification system.

Authors: Croset S, Overington JP, Rebholz-Schuhmann D

Abstract: MOTIVATION: Drug repositioning is the discovery of new indications for compounds that have already been approved and used in a clinical setting. Recently, some computational approaches have been suggested to unveil new opportunities in a systematic fashion, by taking into consideration gene expression signatures or chemical features for instance. We present here a novel method based on knowledge integration using semantic technologies, to capture the functional role of approved chemical compounds. RESULTS: In order to computationally generate repositioning hypotheses, we used the Web Ontology Language to formally define the semantics of over 20 000 terms with axioms to correctly denote various modes of action (MoA). Based on an integration of public data, we have automatically assigned over a thousand approved drugs into these MoA categories. The resulting new resource is called the Functional Therapeutic Chemical Classification System and was further evaluated against the content of the traditional Anatomical Therapeutic Chemical Classification System. We illustrate how the new classification can be used to generate drug repurposing hypotheses, using Alzheimer's disease as a use-case. AVAILABILITY: https://www.ebi.ac.uk/chembl/ftc; https://github.com/loopasam/ftc. CONTACT: croset@ebi.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Published on March 12, 2014

VNP: Interactive Visual Network Pharmacology of Diseases, Targets, and Drugs.

Authors: Hu QN, Deng Z, Tu W, Yang X, Meng ZB, Deng ZX, Liu J

Abstract: In drug discovery, promiscuous targets, multifactorial diseases, and "dirty" drugs construct complex network relationships. Network pharmacology description and analysis not only give a systems-level understanding of drug action and disease complexity but can also help to improve the efficiency of target selection and drug design. Visual network pharmacology (VNP) is developed to visualize network pharmacology of targets, diseases, and drugs with a graph network by using disease, target or drug names, chemical structures, or protein sequence. To our knowledge, VNP is the first free interactive VNP server that should be very helpful for systems pharmacology research. VNP is freely available at http://cadd.whu.edu.cn/ditad/vnpsearch.
Published on March 11, 2014

Drug2Gene: an exhaustive resource to explore effectively the drug-target relation network.

Authors: Roider HG, Pavlova N, Kirov I, Slavov S, Slavov T, Uzunov Z, Weiss B

Abstract: BACKGROUND: Information about drug-target relations is at the heart of drug discovery. There are now dozens of databases providing drug-target interaction data with varying scope and focus. Therefore, and due to the large chemical space, the overlap of the different data sets is surprisingly small. As searching through these sources manually is cumbersome, time-consuming and error-prone, integrating all the data is highly desirable. Despite a few attempts, integration has been hampered by the diversity of descriptions of compounds, and by the fact that the reported activity values, coming from different data sets, are not always directly comparable due to usage of different metrics or data formats. DESCRIPTION: We have built Drug2Gene, a knowledge base that combines the compound/drug-gene/protein information from 19 publicly available databases. A key feature is our rigorous unification and standardization process, which makes the data truly comparable on a large scale, allowing for the first time effective data mining in such a large knowledge corpus. As of version 3.2, Drug2Gene contains 4,372,290 unified relations between compounds and their targets, most of which include reported bioactivity data. We extend this set with putative (i.e. homology-inferred) relations where sufficient sequence homology between proteins suggests they may bind to similar compounds. Drug2Gene provides powerful search functionalities, very flexible export procedures, and a user-friendly web interface. CONCLUSIONS: Drug2Gene v3.2 has become a mature and comprehensive knowledge base providing unified, standardized drug-target related information gathered from publicly available data sources. It can be used to integrate proprietary data sets with publicly available data sets. Its main goal is to be a 'one-stop shop' to identify tool compounds targeting a given gene product or for finding all known targets of a drug.
Drug2Gene with its integrated data set of public compound-target relations is freely accessible without restrictions at http://www.drug2gene.com.