Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published in 2015

A comparison of conditional random fields and structured support vector machines for chemical entity recognition in biomedical literature.

Authors: Tang B, Feng Y, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, Xu H

Abstract: BACKGROUND: Chemical compounds and drugs (together called chemical entities) embedded in scientific articles are crucial for many information extraction tasks in the biomedical domain. However, only a very limited number of chemical entity recognition systems are publicly available, probably due to the lack of large manually annotated corpora. To accelerate the development of chemical entity recognition systems, the Spanish National Cancer Research Center (CNIO) and The University of Navarra organized a challenge on Chemical and Drug Named Entity Recognition (CHEMDNER). The CHEMDNER challenge contains two individual subtasks: 1) Chemical Entity Mention recognition (CEM); and 2) Chemical Document Indexing (CDI). Our study proposes machine learning-based systems for the CEM task. METHODS: The 2013 CHEMDNER challenge organizers provided 10,000 UTF8-encoded PubMed abstracts, manually annotated according to a predefined annotation guideline: a training set of 3,500 abstracts, a development set of 3,500 abstracts and a test set of 3,000 abstracts. We developed machine learning-based systems for the CEM task on this data set, based on conditional random fields (CRF) and structured support vector machines (SSVM), respectively. The effects of three types of word representation (WR) features, generated by Brown clustering, random indexing and skip-gram, on both machine learning-based systems were also investigated. The performance of our system was evaluated on the test set using scripts provided by the CHEMDNER challenge organizers. Primary evaluation measures were micro Precision, Recall, and F-measure. RESULTS: Our best system was among the top ranked systems with an official micro F-measure of 85.05%. Fixing a bug caused by inconsistent features marginally improved the performance of the system (micro F-measure of 85.20%). CONCLUSIONS: The SSVM-based CEM systems outperformed the CRF-based CEM systems when using the same features. Each type of WR feature was beneficial to the CEM task. Both the CRF-based and SSVM-based systems using all three types of WR features showed better performance than the systems using only one type of WR feature.
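
The micro-averaged Precision, Recall and F-measure named in this abstract can be illustrated with a minimal sketch. It is not the official CHEMDNER evaluation script; the function name, the data layout (document id mapped to a set of character spans) and the exact-match criterion are assumptions made for illustration.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F-measure over entity mentions.

    gold, predicted: dict mapping a document id to a set of (start, end)
    mention spans. This layout is hypothetical, chosen for the example.
    """
    tp = fp = fn = 0
    # Pool counts over all documents before dividing (micro-averaging).
    for doc_id in set(gold) | set(predicted):
        g = gold.get(doc_id, set())
        p = predicted.get(doc_id, set())
        tp += len(g & p)   # spans found in both (exact match)
        fp += len(p - g)   # predicted but not annotated
        fn += len(g - p)   # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data, invented for the example.
gold = {"a": {(0, 5), (10, 15)}, "b": {(3, 8)}}
pred = {"a": {(0, 5)}, "b": {(3, 8), (20, 25)}}
print(micro_prf(gold, pred))
```

Pooling the counts across documents before dividing is what distinguishes the micro average from a macro average, which would average per-document scores instead.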
Published in 2015

Analysis of pharmacogenomic variants associated with population differentiation.

Authors: Yeon B, Ahn E, Kim KI, Kim IW, Oh JM, Park T

Abstract: In the present study, we systematically investigated population differentiation of drug-related (DR) genes in order to identify common genetic features underlying population-specific responses to drugs. To do so, we used data from release 27 of the International HapMap Project and the Pharmacogenomics Knowledge Base (PharmGKB) database. First, we compared four measures for assessing population differentiation: the chi-square test, the analysis of variance (ANOVA) F-test, Fst, and the Nearest Shrunken Centroid Method (NSCM). Fst showed high sensitivity with stable specificity among varying sample sizes; thus, we selected Fst for determining population differentiation. Second, we divided DR genes from PharmGKB into two groups based on the degree of population differentiation as assessed by Fst: genes with a high level of differentiation (HD gene group) and genes with a low level of differentiation (LD gene group). Last, we conducted a gene ontology (GO) analysis and pathway analysis. Using all genes in the human genome as the background, the GO analysis and pathway analysis of the HD genes identified terms related to cell communication. "Cell communication" and "cell-cell signaling" had the lowest Benjamini-Hochberg q-values (0.0002 and 0.0006, respectively), and "drug binding" was highly enriched (16.51) despite its relatively high q-value (0.0142). Among the 17 genes related to cell communication identified in the HD gene group, five genes (STX4, PPARD, DCK, GRIK4, and DRD3) contained single nucleotide polymorphisms with Fst values greater than 0.5. Specifically, the Fst values for rs10871454, rs6922548, rs3775289, rs1954787, and rs167771 were 0.682, 0.620, 0.573, 0.531, and 0.510, respectively. In the analysis using DR genes as the background, the HD gene group contained six significant terms. Five were related to reproduction, and one was "Wnt signaling pathway," which has been implicated in cancer. Our analysis suggests that the HD gene group from PharmGKB is associated with cell communication and drug binding.
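
As a rough illustration of the Fst measure used in this abstract, the sketch below computes Wright's Fst for one biallelic SNP from per-population allele frequencies. The abstract does not say which Fst estimator the authors used, so the unweighted heterozygosity-based form here, and the function name, are assumptions.

```python
def fst(allele_freqs):
    """Wright's Fst for one biallelic SNP.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally (an assumption; real estimators
    usually correct for sample size).
    """
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # allele fixed everywhere: no differentiation to measure
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / k
    return (h_total - h_within) / h_total


# Invented frequencies: identical populations, then strongly diverged ones.
print(fst([0.5, 0.5]))
print(fst([0.1, 0.9]))
```

Under this definition, values near 0 mean the populations share allele frequencies and values near 1 mean they are fixed for different alleles, which is why a cutoff such as 0.5 marks strongly differentiated SNPs.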
Published in 2015

ONRLDB--manually curated database of experimentally validated ligands for orphan nuclear receptors: insights into new drug discovery.

Authors: Nanduri R, Bhutani I, Somavarapu AK, Mahajan S, Parkesh R, Gupta P

Abstract: Orphan nuclear receptors are potential therapeutic targets. The Orphan Nuclear Receptor Ligand Binding Database (ONRLDB) is an interactive, comprehensive and manually curated database of small molecule ligands targeting orphan nuclear receptors. Currently, ONRLDB consists of approximately 11,000 ligands, of which approximately 6,500 are unique. All entries include information for the ligand, such as EC50 and IC50, number of aromatic rings and rotatable bonds, XlogP, hydrogen donor and acceptor count, molecular weight (MW) and structure. ONRLDB is a cross-platform database, where either the cognate small molecule modulators of a receptor or the cognate receptors to a ligand can be searched. The database can be searched using three methods: text search, advanced search or similarity search. Substructure search, cataloguing tools, and clustering tools can be used to perform advanced analysis of the ligand based on chemical similarity fingerprints, hierarchical clustering, binning partition and multidimensional scaling. These tools, together with the Tree function provided, deliver an interactive platform and a comprehensive resource for identification of common and unique scaffolds. As demonstrated, ONRLDB is designed to allow selection of ligands based on various properties, design of novel ligands and improvement of existing ones. Database URL: http://www.onrldb.org/.
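
A fingerprint-based similarity search of the kind mentioned in this abstract is commonly built on the Tanimoto coefficient. The sketch below is a generic illustration only: the abstract does not state which fingerprint or similarity metric ONRLDB uses, and the function names, the threshold and the toy fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query_fp, library, threshold=0.7):
    """Return (name, score) pairs at or above the threshold, best match first.

    library: dict mapping a ligand name to its fingerprint (hypothetical layout).
    """
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints, invented for the example.
library = {"ligand_A": {1, 2, 3, 4}, "ligand_B": {2, 3, 9}, "ligand_C": {7, 8}}
print(similarity_search({1, 2, 3}, library, threshold=0.4))
```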
Published in 2015

Demonstration of Therapeutic Equivalence of Fluconazole Generic Products in the Neutropenic Mouse Model of Disseminated Candidiasis.

Authors: Gonzalez JM, Rodriguez CA, Zuluaga AF, Agudelo M, Vesga O

Abstract: Some generics of antibacterials fail therapeutic equivalence despite being pharmaceutical equivalents of their innovators, but data are scarce with antifungals. We used the neutropenic mouse model of disseminated candidiasis to challenge the therapeutic equivalence of three generic products of fluconazole compared with the innovator in terms of concentration of the active pharmaceutical ingredient, analytical chemistry (liquid chromatography/mass spectrometry), in vitro susceptibility testing, single-dose serum pharmacokinetics in infected mice, and in vivo pharmacodynamics. Neutropenic, five-week-old, murine-pathogen-free male mice of the strain Udea:ICR(CD-2) were injected in the tail vein with Candida albicans GRP-0144 (MIC = 0.25 mg/L) or Candida albicans CIB-19177 (MIC = 4 mg/L). Subcutaneous therapy with fluconazole (generics or innovator) and sterile saline (untreated controls) started 2 h after infection and ended 24 h later, with doses ranging from no effect to maximal effect (1 to 128 mg/kg per day) divided every 3 or 6 hours. Hill's model was fitted to the data by nonlinear regression, and results from each group were compared by curve fitting analysis. All products were identical in terms of concentration, chromatographic and spectrographic profiles, MICs, mouse pharmacokinetics, and in vivo pharmacodynamic parameters. In conclusion, the generic products studied were pharmaceutically and therapeutically equivalent to the innovator of fluconazole.
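
The Hill model mentioned in this abstract is usually written as a sigmoid dose-response curve. The sketch below only evaluates that curve over the dose range given in the abstract; it does not perform the nonlinear regression, and the parameter values (maximal effect, ED50, Hill slope) are invented for illustration, not taken from the study.

```python
def hill_effect(dose, e_max, ed50, n):
    """Sigmoid Emax (Hill) model: effect = e_max * dose^n / (ed50^n + dose^n)."""
    if dose <= 0:
        return 0.0
    return e_max * dose ** n / (ed50 ** n + dose ** n)


# Doses span the range reported in the abstract (1 to 128 mg/kg per day).
# e_max, ed50 and n below are hypothetical values chosen for the example.
doses = [1, 2, 4, 8, 16, 32, 64, 128]
for dose in doses:
    print(dose, round(hill_effect(dose, e_max=4.0, ed50=8.0, n=2.0), 3))
```

Comparing products then amounts to fitting one such curve per product and testing whether the fitted parameters differ, which is the curve fitting analysis the abstract refers to.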
Published in 2015

Physicochemical characteristics of structurally determined metabolite-protein and drug-protein binding events with respect to binding specificity.

Authors: Korkuc P, Walther D

Abstract: To better understand and ultimately predict both the metabolic activities and the signaling functions of metabolites, a detailed understanding of the physical interactions of metabolites with proteins is highly desirable. Focusing in particular on protein binding specificity vs. promiscuity, we performed a comprehensive analysis of the physicochemical properties of compound-protein binding events as reported in the Protein Data Bank (PDB). We compared the molecular and structural characteristics obtained for metabolites to those of the well-studied interactions of drug compounds with proteins. Promiscuously binding metabolites and drugs are characterized by low molecular weight and high structural flexibility. Unlike what has been reported for drug compounds, low rather than high hydrophobicity appears associated, albeit weakly, with promiscuous binding for the metabolite set investigated in this study. Across several physicochemical properties, drug compounds exhibit characteristic binding propensities that are distinguishable from those associated with metabolites. Prediction of target diversity and compound promiscuity using physicochemical properties was possible at modest accuracy levels only, but was consistently better for drugs than for metabolites. Compound properties capturing structural flexibility and hydrogen-bond formation descriptors proved most informative in PLS-based prediction models. With regard to diversity of enzymatic activities of the respective metabolite target enzymes, the metabolites benzylsuccinate, hypoxanthine, trimethylamine N-oxide, oleoylglycerol, and resorcinol showed very narrow process involvement, while glycine, imidazole, tryptophan, succinate, and glutathione were found to possess broad enzymatic reaction scopes. Promiscuous metabolites were found to mainly serve as general energy currency compounds, but were also found to be involved in signaling processes and to appear in diverse organismal systems (digestive and nervous system), suggesting specific molecular and physiological roles of promiscuous metabolites.
Published in 2015

Optimizing drug-target interaction prediction based on random walk on heterogeneous networks.

Authors: Seal A, Ahn YY, Wild DJ

Abstract: BACKGROUND: Predicting novel drug-target associations is important not only for developing new drugs, but also for furthering biological knowledge by understanding how drugs work and their modes of action. As more data about drugs, targets, and their interactions becomes available, computational approaches have become an indispensable part of drug target association discovery. In this paper we apply the random walk with restart (RWR) method to a heterogeneous network of drugs and targets compiled from the DrugBank database and investigate the performance of the method under parameter variation and choice of chemical fingerprint methods. RESULTS: We show that choice of chemical fingerprint does not affect the performance of the method when the parameters are tuned to optimal values. We use a subset of the ChEMBL15 dataset that contains 2,763 associations between 544 drugs and 467 target proteins to evaluate our method, and we extracted datasets of bioactivity ≤1 and ≤10 µM activity cutoff. For 1 µM bioactivity cutoff, we find that our method can correctly predict nearly 47, 55, 60% of the given drug-target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 µM dataset in top 50 rank positions. For 10 µM bioactivity cutoff, we find that our method can correctly predict nearly 32.4, 34.8, 35.3% of the given drug-target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 µM dataset in top 50 rank positions. We further examine the associations between 110 popular top selling drugs in 2012 and 3,519 targets and find the top ten targets for each drug. CONCLUSIONS: We demonstrate the effectiveness and promise of the approach (RWR on heterogeneous networks using chemical features) for identifying novel drug target interactions and investigate the performance.
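
The random walk with restart named in this abstract can be sketched on a small graph as follows. This is a simplified, single-network version for illustration; the paper works on a heterogeneous drug-target network with its own transition probabilities, which the abstract does not specify. The function name, the restart probability and the toy adjacency matrix are assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj: symmetric adjacency matrix as a list of lists (toy input).
    seed: index of the node the walker restarts from.
    W is adj with each column normalised to sum to 1.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy graph: nodes 0 and 1 stand for drugs, nodes 2 and 3 for targets.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
scores = random_walk_with_restart(adj, seed=0)
print(scores)
```

Ranking the target nodes by their steady-state score for a given seed drug yields the kind of top-ranked candidate list that the abstract evaluates at the top 50 positions.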
Published in 2015

Cancer based pharmacogenomics network supported with scientific evidences: from the view of drug repurposing.

Authors: Wang L, Liu H, Chute CG, Zhu Q

Abstract: BACKGROUND: Pharmacogenomics (PGx), as an emerging field, is poised to change the way we practice medicine and deliver health care by customizing drug therapies on the basis of each patient's genetic makeup. A large volume of PGx data, including information on drugs, genes, and single nucleotide polymorphisms (SNPs), has been accumulated. Normalized and integrated PGx information could help reveal hidden relationships among drug treatments, genomic variations, and phenotype traits to better support drug discovery and the next generation of treatment. METHODS: In this study, we generated a normalized cancer-based PGx network (CPN) supported by scientific evidence, integrating cancer-related PGx information from multiple well-known PGx resources including the Pharmacogenomics Knowledge Base (PharmGKB), the FDA PGx Biomarkers in Drug Labeling, and the Catalog of Published Genome-Wide Association Studies (GWAS). We successfully demonstrated the capability of the CPN for drug repurposing by conducting two case studies. CONCLUSIONS: The CPN established in this study offers comprehensive cancer-based PGx information to support cancer-oriented research, especially for drug repurposing.
Published in 2015

Computational drug repositioning for peripheral arterial disease: prediction of anti-inflammatory and pro-angiogenic therapeutics.

Authors: Chu LH, Annex BH, Popel AS

Abstract: Peripheral arterial disease (PAD) results from atherosclerosis that leads to blocked arteries and reduced blood flow, most commonly in the arteries of the legs. PAD clinical trials conducted in the last decade to induce angiogenesis and improve blood flow have not succeeded. We have recently constructed PADPIN, a protein-protein interaction network (PIN) of PAD, and here we combine it with drug-target relations to identify potential drug targets for PAD. Specifically, the proteins in the PADPIN were classified as belonging to the angiome, immunome, and arteriome, characterizing the processes of angiogenesis, immune response/inflammation, and arteriogenesis, respectively. Using the network-based approach, we predict candidate drugs for repositioning that have potential applications to PAD. By compiling the drug information in two drug databases, DrugBank and PharmGKB, we predict FDA-approved drugs whose targets are the proteins annotated as anti-angiogenic and pro-inflammatory, respectively. Examples of pro-angiogenic drugs are carvedilol and urokinase. Examples of anti-inflammatory drugs are ACE inhibitors and maraviroc. This is the first computational drug repositioning study for PAD.
Published in 2015

Assessing the Need of Discourse-Level Analysis in Identifying Evidence of Drug-Disease Relations in Scientific Literature.

Authors: Rastegar-Mojarad M, Komandur Elayavilli R, Li D, Liu H

Abstract: Relation extraction typically involves the extraction of relations between two or more entities occurring within a single sentence or across multiple sentences. In this study, we investigated the significance of extracting information from multiple sentences, specifically in the context of drug-disease relation discovery. We used multiple resources such as Semantic Medline, a literature-based resource, and Medline search (for filtering spurious results) and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5%) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in the Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75% (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the need for discourse-level analysis in extracting relations from biomedical literature.
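
The two proportions quoted in this abstract can be recomputed directly from the counts it reports. The variable names below are made up for the example; the counts are those given in the abstract.

```python
# Counts as reported in the abstract.
inferred_pairs = 8772          # potential drug-disease pairs
not_single_sentence = 6450     # pairs not found within a single sentence
gold_matches = 537             # pairs matching the CTD gold standard
gold_multi_sentence = 407      # gold-standard pairs spanning multiple sentences

share_all = 100 * not_single_sentence / inferred_pairs
share_gold = 100 * gold_multi_sentence / gold_matches
print(round(share_all, 1), round(share_gold, 1))
```

The first ratio reproduces the stated 73.5%, and the second comes out slightly above the "nearly 75%" quoted for the gold-standard subset.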
Published in 2015

A functional biological network centered on XRCC3: a new possible marker of chemoradiotherapy resistance in rectal cancer patients.

Authors: Agostini M, Zangrando A, Pastrello C, D'Angelo E, Romano G, Giovannoni R, Giordan M, Maretto I, Bedin C, Zanon C, Digito M, Esposito G, Mescoli C, Lavitrano M, Rizzolio F, Jurisica I, Giordano A, Pucciarelli S, Nitti D

Abstract: Preoperative chemoradiotherapy is widely used to improve local control of disease, sphincter preservation and survival in patients with locally advanced rectal cancer. Patients enrolled in the present study underwent preoperative chemoradiotherapy, followed by surgical excision. Response to chemoradiotherapy was evaluated according to Mandard's Tumor Regression Grade (TRG). TRG 3, 4 and 5 were considered as partial or no response, while TRG 1 and 2 were considered as complete response. Of the pretherapeutic biopsies of 84 locally advanced rectal carcinomas available for the analysis, only 42 showed at least 70% cancer cellularity. In the gene expression profiles, responders and non-responders showed significantly different expression levels for 19 genes (P < 0.001). We fitted a logistic model selected with a stepwise procedure optimizing the Akaike Information Criterion (AIC) and then validated it by means of leave-one-out cross-validation (LOOCV, accuracy = 95%). Four genes were retained in the final model: ZNF160, XRCC3, HFM1 and ASXL2. Real-time PCR confirmed that XRCC3 is overexpressed in the responder group, and HFM1 and ASXL2 showed a positive trend. Finally, in vitro tests on colon cancer cells resistant or susceptible to chemoradiotherapy proved that XRCC3 deregulation is extensively involved in chemoresistance mechanisms. Protein-protein interaction (PPI) analysis involving the predictive classifier revealed a network of 45 interacting nodes (proteins), with the TRAF6 gene playing a keystone role in the network. The present study confirmed the possibility that gene expression profiling combined with integrative computational biology is useful to predict complete responses to preoperative chemoradiotherapy in patients with advanced rectal cancer.
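
The leave-one-out cross-validation mentioned in this abstract can be sketched generically as below. To stay dependency-free, the example plugs in a nearest-mean classifier on a single invented feature; this is a stand-in, not the stepwise-selected logistic model the authors fitted, and all function names and data are hypothetical.

```python
def loocv_accuracy(samples, labels, fit, predict):
    """Leave-one-out cross-validation: train on all but one sample, test on it, repeat."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def fit_nearest_mean(xs, ys):
    """Stand-in classifier: store the mean feature value of each class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    """Assign the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


# Invented expression values for one gene; 1 = responder, 0 = non-responder.
expression = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
response = [0, 0, 0, 1, 1, 1]
print(loocv_accuracy(expression, response, fit_nearest_mean, predict_nearest_mean))
```

Because each sample is scored by a model that never saw it, the resulting accuracy is a less optimistic estimate than accuracy on the training data, which matters with a cohort as small as the 42 biopsies analysed here.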