Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published in 2013

Gathering and exploring scientific knowledge in pharmacovigilance.

Authors: Lopes P, Nunes T, Campos D, Furlong LI, Bauer-Mehren A, Sanz F, Carrascosa MC, Mestres J, Kors J, Singh B, van Mulligen E, Van der Lei J, Diallo G, Avillach P, Ahlberg E, Boyer S, Diaz C, Oliveira JL

Abstract: Pharmacovigilance plays a key role in the healthcare domain through the assessment, monitoring and discovery of interactions amongst drugs and their effects in the human organism. However, technological advances in this field have been slowing down over the last decade due to miscellaneous legal, ethical and methodological constraints. Pharmaceutical companies started to realize that collaborative and integrative approaches boost current drug research and development processes. Hence, new strategies are required to connect researchers, datasets, biomedical knowledge and analysis algorithms, allowing them to fully exploit the true value behind state-of-the-art pharmacovigilance efforts. This manuscript introduces a new platform directed towards pharmacovigilance knowledge providers. This system, based on a service-oriented architecture, adopts a plugin-based approach to solve fundamental pharmacovigilance software challenges. With the wealth of collected clinical and pharmaceutical data, it is now possible to connect knowledge providers' analysis and exploration algorithms with real data. As a result, new strategies allow a faster identification of high-risk interactions between marketed drugs and adverse events, and enable the automated uncovering of scientific evidence behind them. With this architecture, the pharmacovigilance field has a new platform to coordinate large-scale drug evaluation efforts in a unique ecosystem, publicly available at http://bioinformatics.ua.pt/euadr/.
Published in December 2013

PREDOSE: a semantic web platform for drug abuse epidemiology using social media.

Authors: Cameron D, Smith GA, Daniulaityte R, Sheth AP, Dave D, Chen L, Anand G, Carlson R, Watkins KZ, Falck R

Abstract: OBJECTIVES: The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel semantic web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO--pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC), through combination of lexical, pattern-based and semantics-based techniques. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. METHODS: Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. 
The DAO is also used to help recognize three types of data, namely: (1) entities, (2) relationships and (3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information, which facilitate search, trend analysis and overall content analysis using social media on prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. RESULTS: A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. CONCLUSION: A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. 
PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future.
Published in 2013

Predicting Drug-Target Interactions for New Drug Compounds Using a Weighted Nearest Neighbor Profile.

Authors: van Laarhoven T, Marchiori E

Abstract: In silico discovery of interactions between drug compounds and target proteins is of core importance for improving the efficiency of the laborious and costly experimental determination of drug-target interaction. Drug-target interaction data are available for many classes of pharmaceutically useful target proteins including enzymes, ion channels, GPCRs and nuclear receptors. However, current drug-target interaction databases contain a small number of drug-target pairs which are experimentally validated interactions. In particular, for some drug compounds (or targets) there is no available interaction. This motivates the need for developing methods that predict interacting pairs with high accuracy also for these 'new' drug compounds (or targets). We show that a simple weighted nearest neighbor procedure is highly effective for this task. We integrate this procedure into a recent machine learning method for drug-target interaction we developed in previous work. Results of experiments indicate that the resulting method predicts true interactions with high accuracy also for new drug compounds and achieves results comparable to or better than those of recent state-of-the-art algorithms. Software is publicly available at http://cs.ru.nl/~tvanlaarhoven/drugtarget2013/.
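
Illustration (not from the paper): the abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one way a weighted nearest neighbor profile could be formed for a new drug with no known interactions. The function name `weighted_profile`, the geometric `decay` weights, and the toy data are all assumptions made for this example.

```python
def weighted_profile(similarities, profiles, decay=0.5):
    """Infer target scores for a new drug from known drugs.

    similarities: similarity of the new drug to each known drug.
    profiles: one 0/1 interaction list per known drug (same target order).
    decay: assumed geometric weight; the k-th nearest neighbor gets decay**k.
    """
    # Rank known drugs from most to least similar to the new drug.
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    n_targets = len(profiles[0])
    scores = [0.0] * n_targets
    for rank, i in enumerate(order):
        weight = decay ** rank
        for t in range(n_targets):
            scores[t] += weight * profiles[i][t]
    return scores


# Toy data: three known drugs, two targets.
example_scores = weighted_profile(
    similarities=[0.2, 0.9, 0.5],
    profiles=[[1, 0], [0, 1], [1, 1]],
)
```

In this toy case the second target scores higher because the most similar known drug interacts with it; a real pipeline would feed such scores into a downstream predictor rather than use them directly.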
Published in 2013

Compensating for literature annotation bias when predicting novel drug-disease relationships through Medical Subject Heading Over-representation Profile (MeSHOP) similarity.

Authors: Cheung WA, Ouellette BF, Wasserman WW

Abstract: BACKGROUND: Using annotations to the articles in MEDLINE(R)/PubMed(R), over six thousand chemical compounds with pharmacological actions have been tracked since 1996. Medical Subject Heading Over-representation Profiles (MeSHOPs) quantitatively leverage the literature associated with biological entities such as diseases or drugs, providing the opportunity to reposition known compounds towards novel disease applications. METHODS: A MeSHOP is constructed by counting the number of times each medical subject term is assigned to an entity-related research publication in the MEDLINE database and calculating the significance of the count by comparing against the count of the term in a background set of publications. Based on the expectation that drugs suitable for treatment of a disease (or disease symptom) will have similar annotation properties to the disease, we successfully predict drug-disease associations by comparing MeSHOPs of diseases and drugs. RESULTS: The MeSHOP comparison approach delivers an 11% improvement over bibliometric baselines. However, novel drug-disease associations are observed to be biased towards drugs and diseases with more publications. To account for the annotation biases, a correction procedure is introduced and evaluated. CONCLUSIONS: By explicitly accounting for the annotation bias, unexpectedly similar drug-disease pairs are highlighted as candidates for drug repositioning research. MeSHOPs are shown to provide a literature-supported perspective for discovery of new links between drugs and diseases based on pre-existing knowledge.
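
Illustration (not from the paper): the METHODS description above, counting how often each subject term is assigned to an entity's publications and scoring that count against a background set, can be sketched as follows. The abstract does not name the significance test; a one-sided hypergeometric tail is assumed here, and `hypergeom_tail`, `build_meshop`, and the toy publication sets are hypothetical.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n publications are drawn from N, K of which carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def build_meshop(entity_pubs, background_pubs):
    """Map each term to (count in entity publications, over-representation p-value).

    Each publication is a set of subject terms. The entity publications are
    assumed to be a subset of the background publications.
    """
    N = len(background_pubs)
    n = len(entity_pubs)
    terms = set().union(*entity_pubs) if entity_pubs else set()
    profile = {}
    for term in terms:
        k = sum(term in pub for pub in entity_pubs)
        K = sum(term in pub for pub in background_pubs)
        profile[term] = (k, hypergeom_tail(k, N, K, n))
    return profile


# Toy data: 10 background publications, 3 of them about the entity.
background = [{"A", "B"}] * 3 + [{"B"}] * 7
entity = background[:3]
profile = build_meshop(entity, background)
```

Term "A" appears only in the entity's publications and gets a small p-value, while "B" appears everywhere and gets a p-value of 1; comparing such profiles between a drug and a disease is the step the abstract describes next.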
Published in 2013

Systematic identification of proteins that elicit drug side effects.

Authors: Kuhn M, Al Banchaabouchi M, Campillos M, Jensen LJ, Gross C, Gavin AC, Bork P

Abstract: Side effect similarities of drugs have recently been employed to predict new drug targets, and networks of side effects and targets have been used to better understand the mechanism of action of drugs. Here, we report a large-scale analysis to systematically predict and characterize proteins that cause drug side effects. We integrated phenotypic data obtained during clinical trials with known drug-target relations to identify overrepresented protein-side effect combinations. Using independent data, we confirm that most of these overrepresentations point to proteins which, when perturbed, cause side effects. Of 1428 side effects studied, 732 were predicted to be predominantly caused by individual proteins, at least 137 of them backed by existing pharmacological or phenotypic data. We prove this concept in vivo by confirming our prediction that activation of the serotonin 7 receptor (HTR7) is responsible for hyperesthesia in mice, which, in turn, can be prevented by a drug that selectively inhibits HTR7. Taken together, we show that a large fraction of complex drug side effects are mediated by individual proteins and create a reference for such relations.
Published in 2013

Remodeling the proteostasis network to rescue glucocerebrosidase variants by inhibiting ER-associated degradation and enhancing ER folding.

Authors: Wang F, Segatori L

Abstract: Gaucher's disease (GD) is characterized by loss of lysosomal glucocerebrosidase (GC) activity. Mutations in the gene encoding GC destabilize the protein's native folding leading to ER-associated degradation (ERAD) of the misfolded enzyme. Enhancing the cellular folding capacity by remodeling the proteostasis network promotes native folding and lysosomal activity of mutated GC variants. However, proteostasis modulators reported so far, including ERAD inhibitors, trigger cellular stress and lead to induction of apoptosis. We show herein that lacidipine, an L-type Ca(2+) channel blocker that also inhibits ryanodine receptors on the ER membrane, enhances folding, trafficking and lysosomal activity of the most severely destabilized GC variant achieved via ERAD inhibition in fibroblasts derived from patients with GD. Interestingly, reprogramming the proteostasis network by combining modulation of Ca(2+) homeostasis and ERAD inhibition remodels the unfolded protein response and dramatically lowers apoptosis induction typically associated with ERAD inhibition.
Published in 2013

Using empirically constructed lexical resources for named entity recognition.

Authors: Jonnalagadda S, Cohen T, Wu S, Liu H, Gonzalez G

Abstract: Because of privacy concerns and the expense involved in creating an annotated corpus, the existing small-annotated corpora might not have sufficient examples for learning to statistically extract all the named-entities precisely. In this work, we evaluate what value may lie in automatically generated features based on distributional semantics when using machine-learning named entity recognition (NER). The features we generated and experimented with include n-nearest words, support vector machine (SVM)-regions, and term clustering, all of which are considered distributional semantic features. The addition of the n-nearest words feature resulted in a greater increase in F-score than by using a manually constructed lexicon to a baseline system. Although the need for relatively small-annotated corpora for retraining is not obviated, lexicons empirically derived from unannotated text can not only supplement manually created lexicons, but also replace them. This phenomenon is observed in extracting concepts from both biomedical literature and clinical notes.
Published in 2013

Chemical structure identification in metabolomics: computational modeling of experimental features.

Authors: Menikarachchi LC, Hamdalla MA, Hill DW, Grant DF

Abstract: The identification of compounds in complex mixtures remains challenging despite recent advances in analytical techniques. At present, no single method can detect and quantify the vast array of compounds that might be of potential interest in metabolomics studies. High performance liquid chromatography/mass spectrometry (HPLC/MS) is often considered the analytical method of choice for analysis of biofluids. The positive identification of an unknown involves matching at least two orthogonal HPLC/MS measurements (exact mass, retention index, drift time etc.) against an authentic standard. However, due to the limited availability of authentic standards, an alternative approach involves matching known and measured features of the unknown compound with computationally predicted features for a set of candidate compounds downloaded from a chemical database. Computationally predicted features include retention index, ECOM50 (energy required to decompose 50% of a selected precursor ion in a collision induced dissociation cell), drift time, whether the unknown compound is biological or synthetic and a collision induced dissociation (CID) spectrum. Computational predictions are used to filter the initial "bin" of candidate compounds. The final output is a ranked list of candidates that best match the known and measured features. In this mini review, we discuss cheminformatics methods underlying this database search-filter identification approach.
Published in 2013

KCF-S: KEGG Chemical Function and Substructure for improved interpretability and prediction in chemical bioinformatics.

Authors: Kotera M, Tabei Y, Yamanishi Y, Moriya Y, Tokimatsu T, Kanehisa M, Goto S

Abstract: BACKGROUND: In order to develop hypotheses on unknown metabolic pathways, biochemists frequently rely on literature that uses a free-text format to describe functional groups or substructures. In computational chemistry or cheminformatics, molecules are typically represented by chemical descriptors, i.e., vectors that summarize information on their various properties. However, it is difficult to interpret these chemical descriptors since they are not directly linked to the terminology of functional groups or substructures that biochemists use. METHODS: In this study, we used the KEGG Chemical Function (KCF) format to computationally describe biochemical substructures in seven attributes that resemble biochemists' way of dealing with substructures. RESULTS: We established the KCF-S (KCF-and-Substructures) format as additional structural information for KCF. Applying KCF-S to various datasets of molecules revealed substructures that appear specifically in each dataset and characterize it. Structure-based clustering of molecules using KCF-S resulted in clusters in which molecular weights and structures were less diverse than those obtained by conventional chemical fingerprints. We further applied KCF-S to find pairs of molecules that are possibly converted to each other in enzymatic reactions, and KCF-S clearly improved predictive performance over that presented previously. CONCLUSIONS: KCF-S defines biochemical substructures while keeping interpretability, suggesting its potential for application in further studies in chemical bioinformatics. KCF and KCF-S can be automatically converted from Molfile format, making it possible to handle molecules from any data source.
Published in 2013

Analysis of schizophrenia and hepatocellular carcinoma genetic network with corresponding modularity and pathways: novel insights to the immune system.

Authors: Huang KC, Yang KC, Lin H, Tsao Tsun-Hui T, Lee WK, Lee SA, Kao CY

Abstract: BACKGROUND: Schizophrenic patients show lower incidences of cancer, implying that schizophrenia may be a protective factor against cancer. To study the genetic correlation between the two diseases, a specific PPI network was constructed with candidate genes of both schizophrenia and hepatocellular carcinoma. The network, designated schizophrenia-hepatocellular carcinoma network (SHCN), was analysed and cliques were identified as potential functional modules or complexes. The findings were compared with information from pathway databases such as KEGG, Reactome, PID and ConsensusPathDB. RESULTS: The functions of mediator genes from SHCN show that immune system and cell cycle regulation have important roles in the etiology mechanism of schizophrenia. For example, the over-expressing schizophrenia candidate genes, SIRPB1, SYK and LCK, are responsible for signal transduction in cytokine production; immune responses involving IL-2 and TREM-1/DAP12 pathways are relevant for the etiology mechanism of schizophrenia. Novel treatments were proposed by searching the target genes of FDA approved drugs with genes in potential protein complexes and pathways. It was found that Vitamin A, retinoic acid and a few other immune response agents modulated by RARA and LCK genes may be potential treatments for both schizophrenia and hepatocellular carcinoma. CONCLUSIONS: This is the first study showing specific mediator genes in the SHCN which may suppress tumors. We also show that the schizophrenic protein interactions and modulation with cancer implicate the importance of the immune system for the etiology of schizophrenia.
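
Illustration (not from the paper): the abstract says cliques in the network were identified as potential functional modules but does not say how. A minimal sketch, assuming the basic Bron-Kerbosch algorithm on an undirected graph, might look like this; the function name `maximal_cliques` and the node labels `g1` to `g4` are hypothetical placeholders, not genes or interactions from the study.

```python
def maximal_cliques(adj):
    """Return all maximal cliques of an undirected graph.

    adj: dict mapping each node to the set of its neighbors.
    Uses the basic Bron-Kerbosch recursion (no pivoting).
    """
    cliques = []

    def expand(current, candidates, excluded):
        # current is maximal when nothing can extend it.
        if not candidates and not excluded:
            cliques.append(current)
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adj[node],
                   excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return cliques


# Toy network: a triangle g1-g2-g3 plus one extra edge g3-g4.
toy_network = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}
found = sorted(sorted(c) for c in maximal_cliques(toy_network))
```

On the toy network this yields the triangle and the single extra edge as the two maximal cliques; for large protein interaction networks a pivoting variant or a graph library would be the more practical choice.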