Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published on August 6, 2009

Human disease-drug network based on genomic expression profiles.

Authors: Hu G, Agarwal P

Abstract: BACKGROUND: Drug repositioning offers the possibility of faster development times and reduced risks in drug discovery. With the rapid development of high-throughput technologies and ever-increasing accumulation of whole genome-level datasets, an increasing number of diseases and drugs can be comprehensively characterized by the changes they induce in gene expression, protein, metabolites and phenotypes. METHODOLOGY/PRINCIPAL FINDINGS: We performed a systematic, large-scale analysis of genomic expression profiles of human diseases and drugs to create a disease-drug network. A network of 170,027 significant interactions was extracted from the approximately 24.5 million comparisons between approximately 7,000 publicly available transcriptomic profiles. The network includes 645 disease-disease, 5,008 disease-drug, and 164,374 drug-drug relationships. At least 60% of the disease-disease pairs were in the same disease area as determined by the Medical Subject Headings (MeSH) disease classification tree. The remaining can drive a molecular level nosology by discovering relationships between seemingly unrelated diseases, such as a connection between bipolar disorder and hereditary spastic paraplegia, and a connection between actinic keratosis and cancer. Among the 5,008 disease-drug links, connections with negative scores suggest new indications for existing drugs, such as the use of some antimalaria drugs for Crohn's disease, and a variety of existing drugs for Huntington's disease; while the positive scoring connections can aid in drug side effect identification, such as tamoxifen's undesired carcinogenic property. From the approximately 37K drug-drug relationships, we discover relationships that aid in target and pathway deconvolution, such as 1) KCNMA1 as a potential molecular target of lobeline, and 2) both apoptotic DNA fragmentation and G2/M DNA damage checkpoint regulation as potential pathway targets of daunorubicin. 
CONCLUSIONS/SIGNIFICANCE: We have automatically generated thousands of disease and drug expression profiles using GEO datasets, and constructed a large scale disease-drug network for effective and efficient drug repositioning as well as drug target/pathway identification.
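The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "approximately 24.5 million comparisons" are all unordered pairs among roughly 7,000 profiles; the abstract does not state this explicitly, so treat that pairing rule as an assumption.

```python
from math import comb

# Figures quoted in the abstract above.
n_profiles = 7_000          # "approximately 7,000" transcriptomic profiles
disease_disease = 645
disease_drug = 5_008
drug_drug = 164_374

# Assumption: every profile is compared with every other profile once,
# giving n * (n - 1) / 2 unordered pairs.
pairwise_comparisons = comb(n_profiles, 2)

# The three relationship classes should add up to the reported network size.
total_interactions = disease_disease + disease_drug + drug_drug

print(pairwise_comparisons)  # 24496500, i.e. about 24.5 million
print(total_interactions)    # 170027, matching the reported 170,027
```

Under that assumption, both the comparison count and the network size agree with the figures reported in the abstract.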
Published in July 2009

Drug discovery using chemical systems biology: repositioning the safe medicine Comtan to treat multi-drug and extensively drug resistant tuberculosis.

Authors: Kinnings SL, Liu N, Buchmeier N, Tonge PJ, Xie L, Bourne PE

Abstract: The rise of multi-drug resistant (MDR) and extensively drug resistant (XDR) tuberculosis around the world, including in industrialized nations, poses a great threat to human health and defines a need to develop new, effective and inexpensive anti-tubercular agents. Previously we developed a chemical systems biology approach to identify off-targets of major pharmaceuticals on a proteome-wide scale. In this paper we further demonstrate the value of this approach through the discovery that existing commercially available drugs, prescribed for the treatment of Parkinson's disease, have the potential to treat MDR and XDR tuberculosis. These drugs, entacapone and tolcapone, are predicted to bind to the enzyme InhA and directly inhibit substrate binding. The prediction is validated by in vitro and InhA kinetic assays using tablets of Comtan, whose active component is entacapone. The minimal inhibition concentration (MIC(99)) of entacapone for Mycobacterium tuberculosis (M. tuberculosis) is approximately 260.0 microM, well below the toxicity concentration determined by an in vitro cytotoxicity model using a human neuroblastoma cell line. Moreover, kinetic assays indicate that Comtan inhibits InhA activity by 47.0% at an entacapone concentration of approximately 80 microM. Thus the active component in Comtan represents a promising lead compound for developing a new class of anti-tubercular therapeutics with excellent safety profiles. More generally, the protocol described in this paper can be included in a drug discovery pipeline in an effort to discover novel drug leads with desired safety profiles, and therefore accelerate the development of new drugs.
Published in July 2009

Harvesting candidate genes responsible for serious adverse drug reactions from a chemical-protein interactome.

Authors: Yang L, Chen J, He L

Abstract: Identifying genetic factors responsible for serious adverse drug reaction (SADR) is of critical importance to personalized medicine. However, genome-wide association studies are hampered due to the lack of case-control samples, and the selection of candidate genes is limited by the lack of understanding of the underlying mechanisms of SADRs. We hypothesize that drugs causing the same type of SADR might share a common mechanism by targeting unexpectedly the same SADR-mediating protein. Hence we propose an approach of identifying the common SADR-targets through constructing and mining an in silico chemical-protein interactome (CPI), a matrix of binding strengths among 162 drug molecules known to cause at least one type of SADR and 845 proteins. Drugs sharing the same SADR outcome were also found to possess similarities in their CPI profiles towards this 845 protein set. This methodology identified the candidate gene of sulfonamide-induced toxic epidermal necrolysis (TEN): all nine sulfonamides that cause TEN were found to bind strongly to MHC I (Cw*4), whereas none of the 17 control drugs that do not cause TEN were found to bind to it. Through an insight into the CPI, we found the Y116S substitution of MHC I (B*5703) enhances the unexpected binding of abacavir to its antigen presentation groove, which explains why B*5701, not B*5703, is the risk allele of abacavir-induced hypersensitivity. In conclusion, SADR targets and the patient-specific off-targets could be identified through a systematic investigation of the CPI, generating important hypotheses for prospective experimental validation of the candidate genes.
Published in July 2009

Building disease-specific drug-protein connectivity maps from molecular interaction networks and PubMed abstracts.

Authors: Li J, Zhu X, Chen JY

Abstract: The recently proposed concept of molecular connectivity maps enables researchers to integrate experimental measurements of genes, proteins, metabolites, and drug compounds under similar biological conditions. The study of these maps provides opportunities for future toxicogenomics and drug discovery applications. We developed a computational framework to build disease-specific drug-protein connectivity maps. We integrated gene/protein and drug connectivity information based on protein interaction networks and literature mining, without requiring gene expression profile information derived from drug perturbation experiments on disease samples. We described the development and application of this computational framework using Alzheimer's Disease (AD) as a primary example in three steps. First, molecular interaction networks were incorporated to reduce bias and improve relevance of AD seed proteins. Second, PubMed abstracts were used to retrieve enriched drug terms that are indirectly associated with AD through molecular mechanistic studies. Third and lastly, a comprehensive AD connectivity map was created by relating enriched drugs and related proteins in literature. We showed that this molecular connectivity map development approach outperformed both curated drug target databases and conventional information retrieval systems. Our initial explorations of the AD connectivity map yielded a new hypothesis that diltiazem and quinidine may be investigated as candidate drugs for AD treatment. Molecular connectivity maps derived computationally can help study molecular signature differences between different classes of drugs in specific disease contexts. To achieve overall good data coverage and quality, a series of statistical methods have been developed to overcome high levels of data noise in biological networks and literature mining results. 
Further development of computational molecular connectivity maps to cover major disease areas will likely set up a new model for drug development, in which therapeutic/toxicological profiles of candidate drugs can be checked computationally before costly clinical trials begin.
Published on July 30, 2009

Discovery: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria.

Authors: Joubert F, Harrison CM, Koegelenberg RJ, Odendaal CJ, de Beer TA

Abstract: BACKGROUND: Up to half a billion human clinical cases of malaria are reported each year, resulting in about 2.7 million deaths, most of which occur in sub-Saharan Africa. Due to the overuse and misuse of anti-malarials, widespread resistance to all the known drugs is increasing at an alarming rate. Rational methods to select new drug target proteins and lead compounds are urgently needed. The Discovery system provides data mining functionality on extensive annotations of five malaria species together with the human and mosquito hosts, enabling the selection of new targets based on multiple protein and ligand properties. METHODS: A web-based system was developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein features used include: domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, protein-ligand interactions and host-pathogen interactions among others. Searching by chemical structure is also available. RESULTS: An in silico system for the selection of putative drug targets and lead compounds is presented, together with an example study on the bifunctional DHFR-TS from Plasmodium falciparum. CONCLUSION: The Discovery system allows for the identification of putative drug targets and lead compounds in Plasmodium species based on the filtering of protein and chemical properties.
Published on July 27, 2009

CanGeneBase (CGB)--a database on cancer related genes.

Authors: Kumar GR, Subazini TK, Subha K, Rajadurai CP, Prabakar L

Abstract: The advent of genomic and proteomic technologies in this post-genomic era has prompted researchers to develop novel strategies against cancer that target human genes, with the aim of identifying more promising treatments and developing accurate early diagnosis. To harness the power of cancer genetic information towards better treatment we have developed a cancer gene database called CanGeneBase (CGB). It is a comprehensive collection of cancer-related genes intended to give researchers a single platform for information on the genes of their interest. According to the Cancer Gene Data Curation Project, about 4,700 genes have been identified as being related to cancer. The present CanGeneBase covers about 12 different types of cancer and includes 190 unique gene entries. Each entry encompasses about 33 useful parameters to provide detailed information about a specific gene. CanGeneBase can be accessed either by gene symbol or by type of cancer. AVAILABILITY: The database is freely available at http://122.165.25.137/bioinfo/cancerdb/
Published on July 10, 2009

Genomes2Drugs: identifies target proteins and lead drugs from proteome data.

Authors: Toomey D, Hoppe HC, Brennan MP, Nolan KB, Chubb AJ

Abstract: BACKGROUND: Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography are impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. METHODOLOGY/PRINCIPAL FINDINGS: To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. CONCLUSIONS/SIGNIFICANCE: Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under 'change-of-application' patents.
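The three criteria in this abstract (homologous to a crystallized protein, or to a known drug target, but not to a human protein) amount to a filter followed by a ranking. The sketch below is only an illustration of that idea: the `Hit` record, the score fields, the `human_cutoff` threshold, and the choice to rank by the better of the two scores are all hypothetical and are not taken from Genomes2Drugs itself.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical per-protein summary of sequence-similarity searches."""
    protein_id: str
    pdb_score: float     # best score against crystallized proteins
    target_score: float  # best score against known drug targets
    human_score: float   # best score against human proteins


def rank_candidates(hits, human_cutoff=50.0):
    """Drop proteins with a close human homolog, then rank the rest.

    The cutoff and the ranking key are illustrative assumptions.
    """
    kept = [h for h in hits if h.human_score < human_cutoff]
    return sorted(
        kept,
        key=lambda h: max(h.pdb_score, h.target_score),
        reverse=True,
    )


# Toy data, invented for this example.
hits = [
    Hit("protA", pdb_score=210.0, target_score=180.0, human_score=12.0),
    Hit("protB", pdb_score=95.0, target_score=0.0, human_score=140.0),
    Hit("protC", pdb_score=0.0, target_score=75.0, human_score=30.0),
]

ranked = [h.protein_id for h in rank_candidates(hits)]
print(ranked)  # ['protA', 'protC']; protB is removed by the human filter
```

Applying the human-homology filter before ranking reflects the abstract's point that close human homologs risk off-target toxicity, however well they score otherwise.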
Published on July 9, 2009

Hmrbase: a database of hormones and their receptors.

Authors: Rashid M, Singla D, Sharma A, Kumar M, Raghava GP

Abstract: BACKGROUND: Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. DESCRIPTION: This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about approximately 2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about approximately 3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about approximately 4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. CONCLUSION: Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. 
The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, DrugPedia linkage etc. Hmrbase is publicly available online at http://crdd.osdd.net/raghava/hmrbase/.
Published on July 6, 2009

Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds.

Authors: Southan C, Varkonyi P, Muresan S

Abstract: BACKGROUND: Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets. RESULTS: Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem, 1.2 million did not. CONCLUSION: On the basis of chemical structure content per se public sources have covered an increasing proportion of commercial databases over the last two years. 
However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration.
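A pairwise overlap matrix of the kind described in this abstract can be sketched with plain set operations. The source names and compound identifiers below are invented for illustration, and the sketch assumes structures have already been standardised to comparable identifiers, a step the study performed separately.

```python
from itertools import combinations

# Toy compound identifiers; names and contents are hypothetical.
sources = {
    "public_a": {"c1", "c2", "c3", "c4"},
    "commercial_b": {"c3", "c4", "c5"},
    "commercial_c": {"c4", "c6"},
}

# Number of shared compounds for every unordered pair of sources.
overlaps = {
    (a, b): len(sources[a] & sources[b])
    for a, b in combinations(sources, 2)
}

# Number of compounds that appear in exactly one source.
unique = {
    name: len(
        members
        - set().union(*(s for n, s in sources.items() if n != name))
    )
    for name, members in sources.items()
}

print(overlaps)
print(unique)
```

Reporting both the shared and the unique counts mirrors the abstract's observation that the comparison matrix shows overlap as well as content found in only one source.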
Published in June 2009

Integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening.

Authors: Nagamine N, Shirakawa T, Minato Y, Torii K, Kobayashi H, Imoto M, Sakakibara Y

Abstract: Predictions of interactions between target proteins and potential leads are of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds, which requires only protein sequence data and chemical structure data and utilizes the statistical learning method of support vector machines. In order to realize reasonable comprehensive predictions which can involve many false positives, we propose two approaches for reduction of false positives: (i) efficient use of multiple statistical prediction models in the framework of two-layer SVM and (ii) reasonable design of the negative data to construct statistical prediction models. In two-layer SVM, outputs produced by the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classifications, are utilized as inputs to the second-layer SVM. In order to design negative data which produce fewer false positive predictions, we iteratively construct SVM models or classification boundaries from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, in order to fully utilize the advantages of statistical learning methods, we propose a strategy to effectively feedback experimental results to computational predictions with consideration of biological effects of interest. We show the usefulness of our approach in predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. Moreover, we utilize this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate these predictions again. 
This efficient procedure, which iterates in silico prediction and in vitro or in vivo experimental verification with sufficient feedback, enabled us to identify novel ligand candidates that were distant from known ligands in chemical space.