
Sharing Data Worldwide

St. Jude uses sophisticated technology to understand disease and share data with the global community.

By Mike O’Kelly

Scholars have shared their findings in academic journals for more than three centuries. The first journal was a 12-page pamphlet published in France in 1665. Since that time, scientific findings have filled leather-bound books and glossy magazines with a common goal of documenting and inspiring research.

Today’s scientists have a wealth of information available at their fingertips—a data revolution that is changing the landscape of medicine and research.

St. Jude Children’s Research Hospital is helping drive this new age of information-sharing, providing a roadmap for investigators complete with data, tools and resources.

In the Cloud

In early 2020, the institution marks the 10th anniversary of the St. Jude–Washington University Pediatric Cancer Genome Project (PCGP). This landmark initiative set the stage for huge leaps in our understanding of the origins of disease. Researchers sequenced the genomes of more than 800 patients to find the genetic factors behind the most common pediatric cancers.

Recognizing the need for a more sophisticated resource, the institution launched St. Jude Cloud in 2018. This resource allows users to study the world’s largest whole genome-based pediatric cancer repository and to analyze genomics data with advanced tools and visualization methods. The portal houses more than 10,000 whole genomes.

“Our idea was to enable users by developing tools and make it easier for institutions that do not have huge computing infrastructures to analyze data,” says Jinghui Zhang, PhD, St. Jude Computational Biology chair.

ProteinPaint and PeCanPIE

One of these tools, a web application called ProteinPaint, provides snapshots of gene mutations from pediatric cancer that alter the genetic instructions for encoding proteins. The tool gave researchers their first glimpse of these details.

St. Jude scientists also developed a free, online system to search the millions of variations in a patient’s genome. The portal—Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE for short)—can find mutations linked to inherited disorders. Just like St. Jude Cloud, the data is freely available.


The success of St. Jude Cloud has spawned other sharing tools targeted at specific diseases and patient research. Last year, the hospital unveiled PROPEL—one of the world’s largest collections of leukemia samples from children and adults. PROPEL samples are available at no cost to researchers with no obligation to collaborate. St. Jude sends investigators the samples along with the data.

The hospital teamed with the Howard Hughes Medical Institute in 2017 to create the Childhood Solid Tumor Network, the world’s largest collection of pediatric tumor samples. This free resource includes samples and data aimed at advancing treatment and research of pediatric solid tumors.

Survivorship and Sickle Cell

St. Jude Cloud also hosts two new portals for scientists studying childhood cancer survivorship and sickle cell disease.

The survivorship portal offers clinical and genomic data from thousands of childhood cancer survivors in the St. Jude Lifetime Cohort study. The portal includes de-identified data from more than 3,000 study participants, covering genomic information, demographics, diagnoses, treatments and outcomes.

“Our goal is to accelerate the rate of discovery in pediatric cancer survivorship research,” says Leslie Robison, PhD, St. Jude Epidemiology and Cancer Control chair. “We’re convinced the best way to achieve this is by using St. Jude Cloud to make the data from the St. Jude Lifetime Cohort Study available to the global research community.”

St. Jude has sequenced the DNA of about 500 patients with sickle cell disease to help determine why complications vary among individuals. This disorder is caused by the mutation of a single gene, but genetic modifiers influence outcomes such that some patients are sicker than others.

“We are examining potential associations between the sequencing data and specific clinical complications of our sickle cell disease patients to identify and study genes that affect their outcomes. Ultimately, understanding sickle cell disease modifier genes should lead to new, individualized therapies,” says Mitch Weiss, MD, PhD, St. Jude Hematology chair. “The portal is making some of that information available publicly so that other investigators can use the information to do similar work.”

Future Destinations

As technology and data sequencing evolve, St. Jude will continue to develop ways to understand the biology of disease while providing information freely to the world.

“We are on the path of increasing the dimensionality of our genomic data resources,” Zhang says. “In addition to a data repository, St. Jude Cloud will be more of an integration portal for all different types of data relevant to pediatric cancer.”
