In Search of Hidden Gems: Data-Driven Discovery

With the world’s top talent and technology in place, St. Jude is using data-driven discovery to better understand childhood cancer and to identify promising drug targets.

By Jane Langille; Photo by MRC Laboratory of Molecular Biology

Imagine searching for a sparkling diamond amid a beach of glittering sea glass. Scientists today face a similar task as they comb vast amounts of data in pursuit of lifesaving discoveries.

The data generated by a single scientific project can be mind-boggling, exceeding the capacity of multiple computers. How does a researcher find one elusive gem among terabytes of data?

“What fascinates me the most about big data is that we now have more information than any one person can really think about,” says M. Madan Babu, PhD, who holds the endowed chair in Biological Data Science in the Department of Structural Biology and directs the new Center of Excellence for Data-Driven Discovery at St. Jude Children’s Research Hospital.

“The key to navigate this complex landscape of data is to ask the right scientific questions and bring multiple disciplines together,” he says. “With clinical scientists, chemists, physicists, statisticians, biologists, structural biologists and computer scientists united under the same mission, there’s no better place than St. Jude to take on a task of this magnitude.”

As a boy growing up in India, Babu was captivated by computer science and biotechnology. He went on to build a distinguished career in computational biology and bioinformatics. In fact, he was among the pioneers who established data science–based approaches to reveal the basic principles of biological systems.

Babu was inspired to use his knowledge and skills to change the lives of children with catastrophic diseases, including cancer. So, in July 2020, he accepted a position where his work might reveal deep biological insights that could one day lead to new treatments and cures.

Shape determines function

Most of the work inside our cells is performed by proteins. The shape of a protein determines how it functions. Some proteins change shape in response to messages received from other molecules through receptors. These receptors, a group of proteins that sit on the cell membrane, change shape themselves in response to messages received from outside of the cell, and they transmit those messages to make changes inside the cell.

G protein-coupled receptors (GPCRs) are the largest family of protein receptors encoded in the human genome. More than 800 GPCRs are expressed throughout the body. They play essential roles in our immune, hormone, cardiac and respiratory systems, in our senses of smell and taste, and in other functions.

Each GPCR binds to one or very few specific molecules, fitting together like a lock and key. Scientists must determine the shape of these receptors and learn how other molecules interact with them. Only then can the scientists design drugs that alter the commands affecting the proteins’ shapes and functions.

GPCRs are excellent drug targets because they sit on cell membranes. One-third of all approved drugs target GPCRs, but scientists have studied only about 150 of the 800 that are known.

Surprisingly, just a few of the FDA-approved cancer drugs target GPCRs. Why are they not targeted more often?

The answer, Babu says, is that the role of GPCRs in cancer has been studied only in the last few years. He and his colleagues are using data-driven approaches to reveal how GPCRs drive cancer.

A closer look

St. Jude structural biologists can see farther into cells than ever before. They use the world’s most advanced tools to reveal once-unknown molecular structures. Those tools include X-ray crystallography, single-molecule imaging, mass spectrometry and high-resolution nuclear magnetic resonance spectroscopy.

Now, Babu brings data science approaches to St. Jude, providing new insights into variations in molecular structures.

He and his colleagues have integrated information on protein structures with human genetics and transcriptomics data. They found multiple GPCR gene variants that give rise to slightly altered shapes of the same receptors — changing the nature of the signaling messages. The findings may explain why side effects occur with currently available targeted therapies that address only one variant. The findings may also explain why some drugs work well in cell models but fail in animal models — hence, never making their way to patients.

At St. Jude, Babu aims to understand how different families of GPCRs may play a role in pediatric cancer, because many GPCRs are key players in the immune system.

He is investigating a family of GPCRs called chemokine receptors. They regulate how immune cells travel to cancer cells and interact with them. Targeting chemokine receptors with drugs could help the immune system better identify and attack the cancer cells.

Diamonds and dependencies

“Initiatives such as the Pediatric Cancer Genome Project [led by James R. Downing, MD, president and CEO of St. Jude] have provided an unparalleled view of the landscape of where things have gone wrong with cancer,” Babu says.

St. Jude researchers are capturing these — and other — massive amounts of data and are making that data available to researchers across the world through the St. Jude Cloud.

“This presents unprecedented opportunities to discover mechanisms and events that lead to altered behavior at the molecular level, eventually paving the way to fix them,” Babu says.

That’s where big data science comes in. Babu and his team develop new methods to analyze and mine data from many sources, including DNA sequences, gene expression profiles, proteomics, protein structures and details about the shapes of compounds that could lead to drug candidates.

Virtual drug discovery

St. Jude researchers are using 3-D computer modeling techniques to screen billions of chemical compounds to see if one might lock into proteins of interest. Recent advances in computing power have made it possible to whittle that list to a manageable size.

“Once you know the shape of the molecule, you want to discover a compound that will bind and perturb its function. That compound will eventually become a drug,” Babu explains. “But the chemical space is vast. There are many different compounds, and we don’t know which one is going to be able to bind to it.

“We can’t do an experiment to test a billion compounds,” he continues. “How could we even test that many?”

That’s where, again, computers come in handy.

“At St. Jude, we have the technological expertise and computing power to do just that,” he says.

Babu is particularly interested in revealing the unique biological weak points in cancer cells that are not present in normal cells. These weaknesses are known as dependencies. His team strives to discover mechanisms involving previously unstudied GPCRs, also called orphan receptors, that are dependencies in pediatric cancers.

“We don’t know yet what messages many of these orphan receptors respond to,” Babu says. “We aim to identify which ones play important roles in cancer. For this, we use our analysis methods in combination with the thoughtfully set-up core facilities and other resources within St. Jude to investigate them further.”

Babu says he hopes his discoveries about GPCRs that act as dependencies can drive virtual screening efforts to reveal compounds that target them.

Data science approaches can also identify drugs, already approved for other health conditions, that act on the same dependencies cancer cells rely on to survive. That means it may be possible to repurpose those drugs to treat pediatric cancers that share those dependencies.

Collaborative culture

Babu says he’s excited to be working alongside other world-class experts from diverse fields, united under the hospital’s inspiring mission. St. Jude is paving the way for world-leading research in data science.

Scientists from the departments of Biostatistics, Chemical Biology and Therapeutics, Structural Biology, Cell and Molecular Biology, Developmental Neurobiology, Immunology, Infectious Diseases, Pharmaceutical Sciences, Cancer Biology, Genetics, Computational Biology, and Bioinformatics are uniting to share their knowledge and define the data science culture.

“This culture that breaks down scientific silos is remarkably rare,” Babu says. “It creates a diverse and unique intellectual ecosystem that enables us to tackle some of the most fundamental problems in biomedical data science.”

Babu says he recognized St. Jude was exceptional during his first visit in 2012, when he delivered the Danny Thomas Lecture, a prestigious lecture series named after the hospital’s founder.

“Its world-class core facilities, shared resources and commitment to cutting-edge science foster collaboration across multiple departments,” he says.

“Together, we are taking big steps to advance the field of pediatric cancer — working on long-term, complex problems that can’t be solved with quick fixes. There is no better place than St. Jude to do this.”

