Publications Search
Explore how scientists all over the world use DrugBank in their research.
Published in December 2014

Cancer in silico drug discovery: a systems biology tool for identifying candidate drugs to target specific molecular tumor subtypes.

Authors: San Lucas FA, Fowler J, Chang K, Kopetz S, Vilar E, Scheet P

Abstract: Large-scale cancer datasets such as The Cancer Genome Atlas (TCGA) allow researchers to profile tumors based on a wide range of clinical and molecular characteristics. Subsequently, TCGA-derived gene expression profiles can be analyzed with the Connectivity Map (CMap) to find candidate drugs to target tumors with specific clinical phenotypes or molecular characteristics. This represents a powerful computational approach for candidate drug identification, but due to the complexity of TCGA and technology differences between CMap and TCGA experiments, such analyses are challenging to conduct and reproduce. We present Cancer in silico Drug Discovery (CiDD; scheet.org/software), a computational drug discovery platform that addresses these challenges. CiDD integrates data from TCGA, CMap, and Cancer Cell Line Encyclopedia (CCLE) to perform computational drug discovery experiments, generating hypotheses for the following three general problems: (i) determining whether specific clinical phenotypes or molecular characteristics are associated with unique gene expression signatures; (ii) finding candidate drugs to repress these expression signatures; and (iii) identifying cell lines that resemble the tumors being studied for subsequent in vitro experiments. The primary input to CiDD is a clinical or molecular characteristic. The output is a biologically annotated list of candidate drugs and a list of cell lines for in vitro experimentation. We applied CiDD to identify candidate drugs to treat colorectal cancers harboring mutations in BRAF. CiDD identified EGFR and proteasome inhibitors, while proposing five cell lines for in vitro testing. CiDD facilitates phenotype-driven, systematic drug discovery based on clinical and molecular data from TCGA.
Published in 2014

Network-based analysis reveals distinct association patterns in a semantic MEDLINE-based drug-disease-gene network.

Authors: Zhang Y, Tao C, Jiang G, Nair AA, Su J, Chute CG, Liu H

Abstract: BACKGROUND: A huge number of associations among different biological entities (e.g., disease, drug, and gene) are scattered across millions of biomedical articles. Systematic analysis of such heterogeneous data can infer novel associations among different biological entities in the context of personalized medicine and translational research. Recently, network-based computational approaches have gained popularity in investigating such heterogeneous data, proposing novel therapeutic targets and deciphering disease mechanisms. However, little effort has been devoted to investigating associations among drugs, diseases, and genes in an integrative manner. RESULTS: We propose a novel network-based computational framework to identify statistically over-expressed subnetwork patterns, called network motifs, in an integrated disease-drug-gene network extracted from Semantic MEDLINE. The framework consists of two steps. The first step is to construct an association network by extracting pair-wise associations between diseases, drugs and genes in Semantic MEDLINE using a domain pattern driven strategy. A Resource Description Framework (RDF)-linked data approach is used to re-organize the data to increase the flexibility of data integration, the interoperability within domain ontologies, and the efficiency of data storage. Unique associations among drugs, diseases, and genes are extracted for downstream network-based analysis. The second step is to apply a network-based approach to mine the local network structure of this heterogeneous network. Significant network motifs are then identified as the backbone of the network. A simplified network based on those significant motifs is then constructed to facilitate discovery. We implemented our computational framework and identified five network motifs, each of which corresponds to a specific biological meaning. Three case studies demonstrate that novel associations are derived from the network topology analysis of reconstructed networks of significant network motifs, further validated by expert knowledge and functional enrichment analyses. CONCLUSIONS: We have developed a novel network-based computational approach to investigate the heterogeneous drug-gene-disease network extracted from Semantic MEDLINE. We demonstrate the power of this approach by prioritizing candidate disease genes, inferring potential disease relationships, and proposing novel drug targets, within the context of the entire knowledge. The results indicate that such an approach will facilitate the formulation of novel research hypotheses, which is critical for translational medicine research and personalized medicine.
Published in 2014

A knowledge base for the discovery of function, diagnostic potential and drug effects on cellular and extracellular miRNAs.

Authors: Russo F, Di Bella S, Bonnici V, Lagana A, Rainaldi G, Pellegrini M, Pulvirenti A, Giugno R, Ferro A

Abstract: BACKGROUND: MicroRNAs (miRNAs) are small noncoding RNAs that play an important role in the regulation of various biological processes through their interaction with cellular mRNAs. A significant number of miRNAs have been found in extracellular human body fluids (e.g. plasma and serum), and some circulating miRNAs in the blood have been successfully identified as biomarkers for diseases including cardiovascular diseases and cancer. Released miRNAs do not necessarily reflect the abundance of miRNAs in the cell of origin. It is claimed that release of miRNAs from cells into blood and ductal fluids is selective and that the selection of released miRNAs may correlate with malignancy. Moreover, miRNAs play a significant role in pharmacogenomics by down-regulating genes that are important for drug function. In particular, drug use should be taken into consideration when analyzing plasma miRNA levels, as drug treatment may impair their use as biomarkers. DESCRIPTION: We enriched our manually curated extracellular/circulating microRNAs database, miRandola, by providing (i) a systematic comparison of expression profiles of cellular and extracellular miRNAs, (ii) a miRNA targets enrichment analysis procedure, and (iii) information on drugs and their effect on miRNA expression, obtained by applying a natural language processing algorithm to abstracts obtained from PubMed. CONCLUSIONS: This allows users to improve their knowledge about the function, diagnostic potential, and drug effects on cellular and circulating miRNAs.
Published in December 2014

Identifying plausible adverse drug reactions using knowledge extracted from the literature.

Authors: Shang N, Xu H, Rindflesch TC, Cohen T

Abstract: Pharmacovigilance involves continually monitoring drug safety after drugs are put to market. To aid this process, algorithms for the identification of strongly correlated drug/adverse drug reaction (ADR) pairs from data sources such as adverse event reporting systems or Electronic Health Records have been developed. These methods are generally statistical in nature, and do not draw upon the large volumes of knowledge embedded in the biomedical literature. In this paper, we investigate the ability of scalable Literature Based Discovery (LBD) methods to identify side effects of pharmaceutical agents. The advantage of LBD methods is that they can provide evidence from the literature to support the plausibility of a drug/ADR association, thereby assisting human review to validate the signal, which is an essential component of pharmacovigilance. To do so, we draw upon vast repositories of knowledge that has been extracted from the biomedical literature by two Natural Language Processing tools, MetaMap and SemRep. We evaluate two LBD methods that scale comfortably to the volume of knowledge available in these repositories. Specifically, we evaluate Reflective Random Indexing (RRI), a model based on concept-level co-occurrence, and Predication-based Semantic Indexing (PSI), a model that encodes the nature of the relationship between concepts to support reasoning analogically about drug-effect relationships. An evaluation set was constructed from the Side Effect Resource 2 (SIDER2), which contains known drug/ADR relations, and models were evaluated for their ability to "rediscover" these relations. We demonstrate that both RRI and PSI can recover known drug-adverse event associations. However, PSI performed better overall, and has the additional advantage of being able to recover the literature underlying the reasoning pathways it used to make its predictions.
Published in 2014

On InChI and evaluating the quality of cross-reference links.

Authors: Galgonek J, Vondrasek J

Abstract: BACKGROUND: There are many databases of small molecules focused on different aspects of research and its applications. Some tasks may require integration of information from various databases. However, determining which entries from different databases represent the same compound is not straightforward. Integration can be based, for example, on automatically generated cross-reference links between entries. Another approach is to use the manually curated links stored directly in databases. This study employs well-established InChI identifiers to measure the consistency and completeness of the manually curated links by comparing them with the automatically generated ones. RESULTS: We used two different tools to generate InChI identifiers and observed some ambiguities in their outputs. In part, these ambiguities were caused by indistinctness in interpretation of the structural data used. InChI identifiers were used successfully to find duplicate entries in databases. We found that the InChI inconsistencies in the manually curated links are very high (28.85% in the worst case). Even using a weaker definition of consistency, the measured values were very high in general. The completeness of the manually curated links was also very poor (only 93.8% in the best case) compared with that of the automatically generated links. CONCLUSIONS: We observed several problems with the InChI tools and the files used as their inputs. There are large gaps in the consistency and completeness of manually curated links if they are measured using InChI identifiers. However, inconsistency can be caused both by errors in manually curated links and the inherent limitations of the InChI method.
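The consistency and completeness measures described in this abstract can be illustrated with a minimal Python sketch. The definitions used here are assumptions made for illustration, not the authors' exact method: a manually curated link counts as inconsistent when the two linked entries have different InChI strings, and completeness is the share of automatically generated links (entry pairs with identical InChI) that also appear among the manual links. The function name `link_quality` and the toy entry identifiers are hypothetical.

```python
def link_quality(manual_links, inchi_a, inchi_b):
    """Return (inconsistency, completeness) of manually curated links.

    manual_links: set of (id_a, id_b) pairs curated by hand.
    inchi_a, inchi_b: dicts mapping entry id -> InChI string per database.
    Definitions are illustrative assumptions, not the paper's exact ones.
    """
    # Index database B by InChI so automatic links can be generated quickly.
    by_inchi = {}
    for id_b, inchi in inchi_b.items():
        by_inchi.setdefault(inchi, set()).add(id_b)

    # Automatically generated links: every pair sharing an identical InChI.
    auto_links = {
        (id_a, id_b)
        for id_a, inchi in inchi_a.items()
        for id_b in by_inchi.get(inchi, ())
    }

    # Only links whose two ends both have an InChI can be checked.
    checkable = [(a, b) for a, b in manual_links if a in inchi_a and b in inchi_b]
    inconsistent = [(a, b) for a, b in checkable if inchi_a[a] != inchi_b[b]]

    inconsistency = len(inconsistent) / len(checkable) if checkable else 0.0
    completeness = len(auto_links & manual_links) / len(auto_links) if auto_links else 1.0
    return inconsistency, completeness


# Toy data (hypothetical entry ids; InChI strings for methane, water, CO2).
inchi_a = {"A1": "InChI=1S/CH4/h1H4", "A2": "InChI=1S/H2O/h1H2", "A3": "InChI=1S/CO2/c2-1-3"}
inchi_b = {"B1": "InChI=1S/CH4/h1H4", "B2": "InChI=1S/CO2/c2-1-3", "B3": "InChI=1S/H2O/h1H2"}
manual = {("A1", "B1"), ("A2", "B2")}  # the second link joins water to CO2

inconsistency, completeness = link_quality(manual, inchi_a, inchi_b)
```

In this toy case one of the two manual links is inconsistent, and only one of the three automatically generated links is present in the manual set.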
Published in 2014

Identification of levothyroxine antichagasic activity through computer-aided drug repurposing.

Authors: Bellera CL, Balcazar DE, Alberca L, Labriola CA, Talevi A, Carrillo C

Abstract: Cruzipain (Cz) is the major cysteine protease of the protozoan Trypanosoma cruzi, the etiological agent of Chagas disease. A conformation-independent classifier capable of identifying Cz inhibitors was derived from a 163-compound dataset and later applied in a virtual screening campaign on the DrugBank database, which compiles FDA-approved and investigational drugs. Fifty-four approved drugs were selected as candidates, 3 of which were acquired and tested on Cz and on T. cruzi epimastigote proliferation. Among them, levothyroxine, traditionally used in hormone replacement therapy in patients with hypothyroidism, showed dose-dependent inhibition of Cz and antiproliferative activity on the parasite.
Published in 2014

Virtual screening of natural inhibitors to the predicted HBx protein structure of Hepatitis B Virus using molecular docking for identification of potential lead molecules for liver cancer.

Authors: Pathak RK, Baunthiyal M, Taj G, Kumar A

Abstract: The HBx protein in Hepatitis B Virus (HBV) is a potential target for anti-liver cancer molecules. Therefore, it is of interest to screen known natural compounds against the HBx protein using molecular docking. However, the structure of HBx is not yet known. A structure of HBx predicted by threading in LOMET was therefore docked against plant-derived natural compounds (curcumin, oleanolic acid, resveratrol, bilobetin, luteolin, ellagic acid, betulinic acid and rutin) using Molegro Virtual Docker. The screening identified rutin, with a binding energy of -161.65 kcal/mol. Twenty derivatives of rutin were then designed and screened against HBx. These in silico experiments identified compounds rutin01 (-163.16 kcal/mol) and rutin08 (-165.76 kcal/mol) for further consideration and downstream validation.
Published in 2014

TCMSP: a database of systems pharmacology for drug discovery from herbal medicines.

Authors: Ru J, Li P, Wang J, Zhou W, Li B, Huang C, Li P, Guo Z, Tao W, Yang Y, Xu X, Li Y, Wang Y, Yang L

Abstract: BACKGROUND: Modern medicine often clashes with traditional medicine such as Chinese herbal medicine because of the limited understanding of the underlying mechanisms of action of the herbs. In an effort to promote integration of both sides and to accelerate drug discovery from herbal medicines, an efficient systems pharmacology platform that represents ideal information convergence of pharmacochemistry, ADME properties, drug-likeness, drug targets, associated diseases and interaction networks is urgently needed. DESCRIPTION: The traditional Chinese medicine systems pharmacology database and analysis platform (TCMSP) was built based on the framework of systems pharmacology for herbal medicines. It consists of all the 499 Chinese herbs registered in the Chinese pharmacopoeia with 29,384 ingredients, 3,311 targets and 837 associated diseases. Twelve important ADME-related properties like human oral bioavailability, half-life, drug-likeness, Caco-2 permeability, blood-brain barrier and Lipinski's rule of five are provided for drug screening and evaluation. TCMSP also provides drug targets and diseases of each active compound, which can automatically establish the compound-target and target-disease networks that let users view and analyze the drug action mechanisms. It is designed to fuel the development of herbal medicines and to promote integration of modern medicine and traditional medicine for drug discovery and development. CONCLUSIONS: The particular strengths of TCMSP are the large number of herbal entries and the ability to identify drug-target networks and drug-disease networks, which will help reveal the mechanisms of action of Chinese herbs, uncover the nature of TCM theory and develop new herb-oriented drugs. TCMSP is freely available at http://sm.nwsuaf.edu.cn/lsp/tcmsp.php.
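Of the ADME-related properties listed in this abstract, Lipinski's rule of five is simple enough to sketch in a few lines of Python. The thresholds below are the standard ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). Allowing one violation follows the common reading of the rule. The function names and the example descriptor values are hypothetical and do not come from TCMSP.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of Lipinski's four criteria a compound violates."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        log_p > 5,          # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors, max_violations=1):
    """Treat a compound as drug-like if it has at most `max_violations` violations."""
    return lipinski_violations(mol_weight, log_p, h_donors, h_acceptors) <= max_violations


# Made-up descriptor values for two hypothetical ingredients.
small_ingredient = dict(mol_weight=350.0, log_p=2.1, h_donors=2, h_acceptors=5)
bulky_ingredient = dict(mol_weight=720.0, log_p=6.3, h_donors=6, h_acceptors=12)

small_ok = passes_rule_of_five(**small_ingredient)   # True: no violations
bulky_ok = passes_rule_of_five(**bulky_ingredient)   # False: four violations
```

A filter of this kind is typically applied alongside the other listed properties rather than on its own.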
Published in 2014

Automated detection of off-label drug use.

Authors: Jung K, LePendu P, Chen WS, Iyer SV, Readhead B, Dudley JT, Shah NH

Abstract: Off-label drug use, defined as use of a drug in a manner that deviates from its approved use defined by the drug's FDA label, is problematic because such uses have not been evaluated for safety and efficacy. Studies estimate that 21% of prescriptions are off-label, and only 27% of those have evidence of safety and efficacy. We describe a data-mining approach for systematically identifying off-label usages using features derived from free text clinical notes and features extracted from two databases on known usage (Medi-Span and DrugBank). We trained a highly accurate predictive model that detects novel off-label uses among 1,602 unique drugs and 1,472 unique indications. We validated 403 predicted uses across independent data sources. Finally, we prioritize well-supported novel usages for further investigation on the basis of drug safety and cost.
Published in 2014

Prediction of cancer drugs by chemical-chemical interactions.

Authors: Lu J, Huang G, Li HP, Feng KY, Chen L, Zheng MY, Cai YD

Abstract: Cancer, which is a leading cause of death worldwide, places a heavy burden on health-care systems. In this study, an order-prediction model was built to predict a series of cancer drug indications based on chemical-chemical interactions. According to the confidence scores of their interactions, the order from the most likely cancer to the least likely one was obtained for each query drug. The first-order prediction accuracy on the training dataset was 55.93%, evaluated by the jackknife test, while it was 55.56% and 59.09% on a validation test dataset and an independent test dataset, respectively. The proposed method outperformed a popular method based on molecular descriptors. Moreover, it was verified that some drugs were effective for the 'wrong' predicted indications, indicating that some 'wrong' drug indications were actually correct indications. Given these promising results, the method may become a useful tool for the prediction of drug indications.
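The order-prediction idea and its jackknife evaluation can be sketched in Python under stated assumptions. Here each candidate indication is scored by the highest confidence of any interaction between the query drug and a drug already known to have that indication; this aggregation rule is an assumption, since the abstract does not say how scores are combined. The jackknife test is implemented as leave-one-out: each drug is ranked using only the other drugs' labels. All names and the toy data are hypothetical.

```python
def rank_indications(query, interactions, known_indications):
    """Order indications from most to least likely for `query`.

    interactions: dict mapping (drug_a, drug_b) -> confidence score.
    known_indications: dict mapping drug -> set of indications.
    Scoring by maximum confidence is an illustrative assumption.
    """
    scores = {}
    for (a, b), conf in interactions.items():
        if query not in (a, b) or a == b:
            continue
        other = b if a == query else a
        # Only the interacting partner's labels are used, never the query's own.
        for indication in known_indications.get(other, ()):
            scores[indication] = max(scores.get(indication, 0.0), conf)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda ind: (-scores[ind], ind))


def jackknife_first_order_accuracy(interactions, known_indications):
    """Share of drugs whose top-ranked indication is one of their true ones."""
    hits = 0
    for drug, true_indications in known_indications.items():
        ranked = rank_indications(drug, interactions, known_indications)
        if ranked and ranked[0] in true_indications:
            hits += 1
    return hits / len(known_indications)


# Toy data: four hypothetical drugs and two hypothetical indications.
interactions = {("d1", "d2"): 0.9, ("d1", "d3"): 0.4, ("d2", "d3"): 0.7, ("d3", "d4"): 0.5}
known = {"d1": {"lung"}, "d2": {"lung"}, "d3": {"breast"}, "d4": {"breast"}}

order_d3 = rank_indications("d3", interactions, known)      # ["lung", "breast"]
accuracy = jackknife_first_order_accuracy(interactions, known)  # 0.75
```

In the toy data, drug `d3` is misranked because its strongest interaction is with a drug labeled for the other indication, so three of four drugs are predicted correctly at first order.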