Published on October 31, 2019

The actinobacterium Tsukamurella paurometabola has a functionally divergent arylamine N-acetyltransferase (NAT) homolog.

Authors: Garefalaki V, Kontomina E, Ioannidis C, Savvidou O, Vagena-Pantoula C, Papavergi MG, Olbasalis I, Patriarcheas D, Fylaktakidou KC, Felfoldi T, Marialigeti K, Fakis G, Boukouvala S

Abstract: Actinobacteria in the Tsukamurella genus are aerobic, high-GC, Gram-positive mycolata, considered as opportunistic pathogens and isolated from various environmental sources, including sites contaminated with oil, urban or industrial waste and pesticides. Although studies look into xenobiotic biotransformation by Tsukamurella isolates, the relevant enzymes remain uncharacterized. We investigated the arylamine N-acetyltransferase (NAT) enzyme family, known for its role in the xenobiotic metabolism of prokaryotes and eukaryotes. Xenobiotic sensitivity of Tsukamurella paurometabola type strain DSM 20162(T) was assessed, followed by cloning, recombinant expression and functional characterization of its single NAT homolog (TSUPD)NAT1. The bacterium appeared quite robust against chloroanilines, but more sensitive to 4-anisidine and 2-aminophenol. However, metabolic activity was not evident towards those compounds, presumably due to mechanisms protecting cells from xenobiotic entry. Of the pharmaceutical arylhydrazines tested, hydralazine was toxic, but the bacterium was less sensitive to isoniazid, a drug targeting mycolic acid biosynthesis in mycobacteria. Although (TSUPD)NAT1 protein has an atypical Cys-His-Glu (instead of the expected Cys-His-Asp) catalytic triad, it is enzymatically active, suggesting that this deviation is likely due to evolutionary adaptation potentially serving a different function. The protein was indeed found to use malonyl-CoA, instead of the archetypal acetyl-CoA, as its preferred donor substrate. Malonyl-CoA is important for microbial biosynthesis of fatty acids (including mycolic acids) and polyketide chains, and the corresponding enzymatic systems have common evolutionary histories, also linked to xenobiotic metabolism. This study adds to accumulating evidence suggesting broad phylogenetic and functional divergence of microbial NAT enzymes that goes beyond xenobiotic metabolism and merits investigation.
Published in October 2019

DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening.

Authors: Wan F, Zhu Y, Hu H, Dai A, Cai X, Chen L, Gong H, Xia T, Yang D, Wang MW, Zeng J

Abstract: Accurate identification of compound-protein interactions (CPIs) in silico may deepen our understanding of the underlying mechanisms of drug action and thus remarkably facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from currently available large-scale unlabeled compound and protein data and often limit their usage to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel general and scalable computational framework that combines effective feature embedding (a technique of representation learning) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns the implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations of the measured CPIs in large-scale databases, such as ChEMBL and BindingDB, as well as of the known drug-target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions among small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) predicted using DeepCPI were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from
Published in October 2019

ABCD: Alzheimer's disease Biomarkers Comprehensive Database.

Authors: Kumar A, Bansal A, Singh TR

Abstract: Alzheimer's disease (AD) is an age-related, non-reversible, and progressive brain disorder. Memory loss, confusion, and personality changes are the major symptoms. AD ultimately leads to a severe loss of mental function. Owing to the lack of effective biomarkers, no effective medication is available for the complete treatment of AD. There is a need to provide all essential AD-related information to the scientific community. Our resource, the Alzheimer's disease Biomarkers Comprehensive Database (ABCD), is being planned to accomplish this objective. ABCD is a large collection of AD-related molecular marker data. The web interface contains information concerning the proteins, genes, transcription factors, SNPs, miRNAs, mitochondrial genes, and expressed genes implicated in AD pathogenesis. In addition to the molecular-level data, the database has information on animal models, medicinal candidates and pathways involved in AD, as well as some image data for AD patients. ABCD is coupled with some major external resources where the user can retrieve additional general information about the disease. The database was designed so that users can extract meaningful information through gene-, protein-, pathway-, and regulatory element-based search options. This database is unique in that it is completely dedicated to a specific neurological disorder, i.e., AD. Further advanced options, such as AD-affected brain image data of patients and structural compound-level information, add value to our database. Features of this database enable users to extract, analyze and display information related to a disease in many different ways. The database is available for academic purpose and accessible at
Published in October 2019

Gvoke HypoPen: An Auto-Injector Containing an Innovative, Liquid-Stable Glucagon Formulation for Use in Severe Acute Hypoglycemia.

Authors: Brand-Eubanks D

Published on October 29, 2019

SL-BioDP: Multi-Cancer Interactive Tool for Prediction of Synthetic Lethality and Response to Cancer Treatment.

Authors: Deng X, Das S, Valdez K, Camphausen K, Shankavaram U

Abstract: Synthetic lethality exploits the phenomenon that a mutation in a cancer gene is often associated with new vulnerability which can be uniquely targeted therapeutically, leading to a significant increase in favorable outcome. DNA damage and survival pathways are among the most commonly mutated networks in human cancers. Recent data suggest that synthetic lethal interactions between a tumor defect and a DNA repair pathway can be used to preferentially kill tumor cells. We recently published a method, DiscoverSL, using multi-omic cancer data, that can predict synthetic lethal interactions of potential clinical relevance. Here, we apply the generality of our models in a comprehensive web tool called Synthetic Lethality Bio Discovery Portal (SL-BioDP) and extend the cancer types to 18 cancer genome atlas cohorts. SL-BioDP enables a data-driven computational approach to predict synthetic lethal interactions from hallmark cancer pathways by mining cancer's genomic and chemical interactions. Our tool provides queries and visualizations for exploring potentially targetable synthetic lethal interactions, shows Kaplan-Meier plots of clinical relevance, and provides in silico validation using short hairpin RNA (shRNA) and drug efficacy data. Our method would thus shed light on mechanisms of synthetic lethal interactions and lead to the discovery of novel anticancer drugs.
Published on October 28, 2019

Network inference with ensembles of bi-clustering trees.

Authors: Pliakos K, Vens C

Abstract: BACKGROUND: Network inference is crucial for biomedicine and systems biology. Biological entities and their associations are often modeled as interaction networks. Examples include drug-protein interaction or gene regulatory networks. Studying and elucidating such networks can lead to the comprehension of complex biological processes. However, usually we have only partial knowledge of those networks and the experimental identification of all the existing associations between biological entities is very time-consuming and particularly expensive. Many computational approaches have been proposed over the years for network inference; nonetheless, efficiency and accuracy remain open problems. Here, we propose bi-clustering tree ensembles as a new machine learning method for network inference, extending the traditional tree-ensemble models to the global network setting. The proposed approach addresses the network inference problem as a multi-label classification task. More specifically, the nodes of a network (e.g., drugs or proteins in a drug-protein interaction network) are modelled as samples described by features (e.g., chemical structure similarities or protein sequence similarities). The labels in our setting represent the presence or absence of links connecting the nodes of the interaction network (e.g., drug-protein interactions in a drug-protein interaction network). RESULTS: We extended traditional tree-ensemble methods, such as extremely randomized trees (ERT) and random forests (RF), to ensembles of bi-clustering trees, integrating background information from both node sets of a heterogeneous network into the same learning framework. We performed an empirical evaluation, comparing the proposed approach to currently used tree-ensemble-based approaches as well as other approaches from the literature. We demonstrated the effectiveness of our approach in different interaction prediction (network inference) settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein and gene regulatory networks. We also applied our proposed method to two versions of a chemical-protein association network extracted from the STITCH database, demonstrating the potential of our model in predicting non-reported interactions. CONCLUSIONS: Bi-clustering trees outperform existing tree-based strategies as well as machine learning methods based on other algorithms. Since our approach is based on tree ensembles, it inherits the advantages of tree-ensemble learning, such as handling of missing values, scalability and interpretability.
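Illustrative note: the multi-label formulation described in the abstract above can be sketched on toy data. This is a minimal sketch under stated assumptions, not the authors' implementation: the node names, feature values, and the helper functions `build_label_matrix` and `global_pairs` are hypothetical, and concatenating the features of both node sets is just one simple way to expose both kinds of background information to a single learner.

```python
from typing import Iterable, List, Tuple


def build_label_matrix(
    row_nodes: List[str],
    col_nodes: List[str],
    interactions: Iterable[Tuple[str, str]],
) -> List[List[int]]:
    """Binary label matrix: Y[i][j] == 1 iff (row_nodes[i], col_nodes[j]) is a known link."""
    row_index = {r: i for i, r in enumerate(row_nodes)}
    col_index = {c: j for j, c in enumerate(col_nodes)}
    Y = [[0] * len(col_nodes) for _ in row_nodes]
    for r, c in interactions:
        Y[row_index[r]][col_index[c]] = 1
    return Y


def global_pairs(
    row_features: List[List[float]],
    col_features: List[List[float]],
    Y: List[List[int]],
) -> List[Tuple[List[float], int]]:
    """Flatten the network into (pair features, label) samples.

    Each sample describes one node pair by the concatenated features of
    both nodes, so a learner can split on either node set.
    """
    samples = []
    for i, rf in enumerate(row_features):
        for j, cf in enumerate(col_features):
            samples.append((rf + cf, Y[i][j]))
    return samples


# Hypothetical toy network: two drugs, three proteins, two known interactions.
drugs = ["d1", "d2"]
proteins = ["p1", "p2", "p3"]
known = [("d1", "p2"), ("d2", "p3")]

# Hypothetical feature vectors (e.g., similarity scores); values are made up.
drug_features = [[0.1], [0.9]]
protein_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

Y = build_label_matrix(drugs, proteins, known)
samples = global_pairs(drug_features, protein_features, Y)
```

In this view each row of `Y` is the label vector of one drug (one label per protein), and zeros mean "not reported" rather than "known absent", which is why predicting non-reported interactions is meaningful.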
Published on October 23, 2019

Multi-tissue network analysis for drug prioritization in knee osteoarthritis.

Authors: Neidlin M, Dimitrakopoulou S, Alexopoulos LG

Abstract: Knee osteoarthritis (OA) is a joint disease that affects several tissues: cartilage, synovium, meniscus and subchondral bone. The pathophysiology of this complex disease is still not completely understood and existing pharmaceutical strategies are limited to pain relief treatments. Therefore, a computational method was developed considering the diverse mechanisms and the multi-tissue nature of OA in order to suggest pharmaceutical compounds. Specifically, weighted gene co-expression network analysis (WGCNA) was utilized to identify gene modules that were preserved across four joint tissues. The driver genes of these modules were selected as an input for a network-based drug discovery approach. WGCNA identified two preserved modules that described functions related to extracellular matrix physiology and immune system responses. Compounds that affected various anti-inflammatory pathways and drugs targeted at coagulation pathways were suggested. Nine of the top 10 compounds had a proven association with OA, and the method significantly outperformed randomized approaches not including WGCNA. The method presented herein is a viable strategy to identify overlapping molecular mechanisms in multi-tissue diseases such as OA and employ this information for drug discovery and compound prioritization.
Published on October 12, 2019

Biological Network Approaches and Applications in Rare Disease Studies.

Authors: Zhang P, Itan Y

Abstract: Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases.
Published on October 11, 2019

Drug Side-Effect Prediction Via Random Walk on the Signed Heterogeneous Drug Network.

Authors: Hu B, Wang H, Yu Z

Abstract: Drug side-effects have become a major public health concern as they are the underlying cause of over a million serious injuries and deaths each year. Therefore, it is of critical importance to detect side-effects as early as possible. Existing computational methods mainly utilize the drug chemical profile and the drug biological profile to predict the side-effects of a drug. Within the drug biological profile, they focus only on drug-target interactions and neglect the modes of action of drugs on target proteins. In this paper, we develop a new method for predicting potential side-effects of drugs based on more comprehensive drug information in which the modes of action of drugs on target proteins are integrated. Drug information of multiple types is modeled as a signed heterogeneous information network. We propose a signed heterogeneous information network embedding framework for learning drug embeddings and predicting side-effects of drugs. We use two biased random walk procedures to obtain drug sequences and train a Skip-gram model to learn drug embeddings. We experimentally demonstrate the performance of the proposed method by comparison with state-of-the-art methods. Furthermore, the results of a case study support our hypothesis that modes of action of drugs on target proteins are meaningful in side-effect prediction.
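Illustrative note: the idea of obtaining drug sequences from biased random walks on a signed network can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure from the paper: the abstract does not specify how the walks are biased, so the sign-dependent weights (`pos_weight`, `neg_weight`), the toy graph, and the helper names `biased_walk` and `drug_sequences` are all hypothetical. The resulting sequences are the kind of input a Skip-gram model would be trained on.

```python
import random
from typing import Dict, List, Optional, Tuple

# Signed graph: node -> list of (neighbor, sign); sign is +1 or -1
# (e.g., activating vs. inhibiting mode of action; an assumption here).
Graph = Dict[str, List[Tuple[str, int]]]


def biased_walk(
    graph: Graph,
    start: str,
    length: int,
    pos_weight: float = 2.0,
    neg_weight: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Walk up to `length` nodes, favoring positive edges by pos_weight/neg_weight."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        weights = [pos_weight if sign > 0 else neg_weight for _, sign in neighbors]
        nodes = [node for node, _ in neighbors]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def drug_sequences(
    graph: Graph,
    drugs: List[str],
    walks_per_drug: int,
    length: int,
    seed: int = 0,
) -> List[List[str]]:
    """Run several walks from every drug and keep only the drug nodes visited."""
    rng = random.Random(seed)
    drug_set = set(drugs)
    sequences = []
    for drug in drugs:
        for _ in range(walks_per_drug):
            walk = biased_walk(graph, drug, length, rng=rng)
            sequences.append([node for node in walk if node in drug_set])
    return sequences


# Hypothetical toy network: two drugs acting on one target with opposite signs.
GRAPH: Graph = {
    "drugA": [("targetX", +1)],
    "drugB": [("targetX", -1)],
    "targetX": [("drugA", +1), ("drugB", -1)],
}
DRUGS = ["drugA", "drugB"]

sequences = drug_sequences(GRAPH, DRUGS, walks_per_drug=3, length=5)
```

Each entry of `sequences` plays the role of a "sentence" of drug names; a Skip-gram implementation could then learn embeddings in which drugs that co-occur in walks end up close together.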
Published on October 10, 2019

Comparison of Large Chemical Spaces.

Authors: Lessel U, Lemmen C

Abstract: Chemical libraries are commonplace in computer-aided drug discovery, and assessing their overlap/complementarity is a routine task. For this purpose, different techniques are applied, ranging from exact matching to comparing physicochemical properties. However, these techniques are applicable only if the compound sets are not too big. Particularly for chemical spaces, containing billions of compounds, alternative ways of assessment are required. Random subsets could be enumerated and compared one-to-one, but given the vast sizes of the chemical spaces assessed here, such samples can at best provide a rough estimate of any overlap. Here we describe a novel way to compare chemical spaces utilizing a panel of query compounds. We applied this technique to three different types of spaces and obtained insight into their structural overlap, their coverage of the chemical universe, and their density. As chemical feasibility of virtual compounds is particularly important, we included related in silico predictions in our assessment.