Understanding life at the molecular level using data science
Our group uses multidisciplinary approaches in biomedical data science to address fundamental questions with translational impact. Specifically, we focus on G-protein coupled receptors (GPCRs) and intrinsically disordered proteins. 1) GPCRs are the largest family of protein receptors encoded in the human genome. They play essential roles in the immune, hormone, cardiac, and respiratory systems. Variations in genetic sequence can alter the structure and dynamics of these proteins, giving rise to changes in the signaling messages they convey. 2) Over 40% of the human proteome does not adopt a defined structure; these segments are referred to as "intrinsically disordered" regions (IDRs). They are poorly understood because they are not amenable to classical structural biology approaches. IDRs hold untapped potential for new therapeutic interventions. In both these areas, we apply a combination of systems approaches, big-data analysis, machine learning, and experimental validation to establish the roles of GPCRs and IDRs in biology and disease.
Our laboratory specializes in integrating 3D information from protein structures with human genetics and biochemical data. We aim to understand how changes in gene sequence alter protein function and cellular signaling. The goal is to understand the molecular basis of life using data science. If we can understand this, we can interpret why mutations cause disease and infer mechanisms that may allow us to treat or cure these disorders. Data science enables us to view biology using a different lens and address fundamental questions of a different magnitude and kind, complementing existing approaches.
We work to interrogate groups of proteins that are medically relevant or have unusual properties and focus on two major classes: G-protein coupled receptors (GPCRs) and intrinsically disordered protein regions (IDRs).
G-Protein Coupled Receptors
GPCRs form a unique family of proteins in that what we learn from one member can often be extrapolated to others, which accelerates discovery. We are particularly interested in how GPCRs are regulated and how selectivity in signaling is achieved. GPCRs are excellent drug targets because of their physical location on the cell surface, but surprisingly few FDA-approved drugs targeting GPCRs have had a clinical impact in oncology. This is likely because slight alterations in the genome can cause significant variation in GPCR signaling, leading small molecules to fail at the preclinical stage due to lack of efficacy or off-target effects. Can we create a map that allows us to infer the impact of mutations and find better therapeutics? Each step of GPCR regulation (from transcription to isoform splicing to post-translational modification and degradation) is a potential intervention point. We are working to understand the role of GPCRs in pediatric cancer and define new therapeutic vulnerabilities.
Intrinsically Disordered Protein Regions
The canonical structure-function paradigm was transformational for drug discovery, but we now recognize that only 60% of protein sequence space is highly structured. The remainder consists of intrinsically disordered protein regions (IDRs). IDRs are found in several clinically relevant protein families, including kinases, zinc fingers, and GPCRs. Our laboratory is developing new experimental and computational methods to determine which parts of IDRs are critical for function and subcellular compartmentalization. New classes of cancer therapeutics are being designed (LYTACs, PROTACs, etc.), and a more thorough understanding of IDRs may contribute to the optimization and improvement of these approaches.
Another focus of our lab has direct relevance for pediatric cancer: the construction of cancer dependency maps for pediatric tumors. In collaboration with Dr. Charles Roberts and an interdisciplinary team of scientists at St. Jude and beyond, we are working to identify the molecular basis of cancer vulnerabilities. This work has the potential to open up new therapeutic opportunities by identifying and understanding which otherwise dispensable genes are uniquely critical in pediatric cancers. In this manner, we hope to identify druggable targets.
Dr. Babu is one of the pioneers in establishing data science-based approaches to reveal principles of biological systems. He was recruited to St. Jude in 2020 as the Endowed Chair in Biological Data Science in the Department of Structural Biology and Director of the Center of Excellence for Data-Driven Discovery. Dr. Babu was awarded the 2019 EMBO Gold Medal for his contributions to the field of computational biology and was elected a fellow of the Academy of Medical Sciences, UK, in 2021. He leads a prolific and innovative research team leveraging computational and experimental methods to study biological systems at different scales of complexity.
The team brings together computational biologists and experimentalists from all over the world, with expertise in areas including GPCR signaling, gene regulation, disordered proteins, cancer biology, machine learning, and evolution.
M. Madan Babu, PhD, FRSC
St. Jude Children's Research Hospital