St. Jude researchers develop powerful interactive tool to mine data from cancer genome

St. Jude Children’s Research Hospital has developed a web-based application to advance pediatric cancer research, collaboration and clinical care through enhanced exploration of the pediatric cancer genome

Memphis, Tennessee, December 29, 2015

Jinghui Zhang, Ph.D. and Xin Zhou, Ph.D. share ideas about web-based application ProteinPaint.

St. Jude Children’s Research Hospital scientists have developed a web application and data set that give researchers worldwide a powerful interactive tool to advance understanding of the mutations that lead to and fuel pediatric cancer. The freely available tool, called ProteinPaint, is described in today’s issue of the scientific journal Nature Genetics.

ProteinPaint provides users with a gene-by-gene snapshot of pediatric cancer mutations that alter the genetic instructions for encoding proteins. The application provides critical information unavailable with existing visualization tools. For example, ProteinPaint shows whether mutations are present at diagnosis or just at relapse, or whether mutations occur in almost every cell (germline) or just cancer cells (somatic).

ProteinPaint’s novel interactive infographics also let researchers see at a glance all mutations in individual genes and their corresponding proteins, including detailed information about mutation type, frequency in cancer subtype and location in the protein domain. That information provides clues about how a change might contribute to cancer’s start, progression or relapse.

“Each day brings new information about mutations that drive human cancer. Novel tools are essential to help scientists use this wealth of genomic data to advance research and find new cures,” said corresponding author Jinghui Zhang, Ph.D., chair of the St. Jude Department of Computational Biology. “We developed ProteinPaint as an intuitive tool any scientist can easily use to explore the vast amount of information now available on cancer genomics.”

There are multiple types of mutations that disrupt the structure of protein-coding genes and lead to cancer. ProteinPaint integrates mutation information from multiple data sets, which boosts its power as a research tool. The application incorporates findings from the St. Jude Children’s Research Hospital—Washington University Pediatric Cancer Genome Project, the National Cancer Institute’s Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative and other published pediatric cancer studies.

ProteinPaint currently includes information on almost 27,500 mutations discovered in more than 1,000 pediatric patients with 21 cancer subtypes. The data will be updated as new information is published.

The application’s developers use the curated data to “paint” or overlay detailed, annotated information about each mutation on the affected protein. First author Xin Zhou, Ph.D., a St. Jude senior bioinformatics research scientist, developed the infographics to display the range of genomic information in an intuitive and interactive format. A click of the mouse gives users additional details about the mutations, including the pediatric cancer subtype where the change has been validated, and a link to the publication.

The application also “paints” RNA-sequencing data from 928 pediatric tumors from 36 subtypes to track how mutations affect gene expression. While whole genome sequencing reveals the complete DNA makeup of an organism, RNA sequencing provides a snapshot of how instructions encoded in DNA are transcribed into RNA molecules. The information is essential for developing and delivering individualized cancer therapies.

“ProteinPaint’s focus on pediatric cancer and presentation of mutations at the gene level complements existing cancer genome data portals,” Zhang said. “For St. Jude, the application is the foundation for developing a global reference database for information about pediatric cancer.”

Zhou added that the ProteinPaint software has the potential to help researchers studying other disorders, including sickle cell disease, that involve a mutation that affects protein function.

ProteinPaint is available at no cost to academic researchers, who are also free to use the tool to analyze their own data. The application also lets researchers compare information about pediatric and adult cancer genomes by providing a parallel view of data from COSMIC, the world’s largest database of somatic mutations, primarily from adult cancer. Such comparisons can help researchers understand and interpret the significance of rare mutations.

The tool was used to study the role played by germline mutations in pediatric cancer. The research appeared in the November 19 edition of the New England Journal of Medicine. More information and a ProteinPaint demonstration are available on the St. Jude PeCan Data Portal.

The other study authors are Michael Edmonson, Mark Wilkinson, Aman Patel, Gang Wu, Yu Liu, Yongjin Li, Zhaojie Zhang, Michael Rusch, Jared Becksfort and James Downing, all of St. Jude; and Matthew Parker, formerly of St. Jude.

The research was funded in part by the Pediatric Cancer Genome Project, including Kay Jewelers, a lead sponsor; a grant (CA021765) from the National Cancer Institute at the National Institutes of Health; and ALSAC.

About the Pediatric Cancer Genome Project and TARGET

Since its launch in 2010, the Pediatric Cancer Genome Project has sequenced the complete cancer and normal genomes of more than 800 pediatric cancer patients. The project has produced groundbreaking discoveries in a variety of pediatric cancers and the degenerative disease commonly known as Lou Gehrig’s disease. It has also produced new computational tools that have benefited the broader field of genomic medicine. The TARGET initiative is the largest federally funded effort to understand the genetic changes that underlie pediatric cancer.

St. Jude Children's Research Hospital

St. Jude Children's Research Hospital is leading the way the world understands, treats and cures childhood cancer and other life-threatening diseases. It is the only National Cancer Institute-designated Comprehensive Cancer Center devoted solely to children. Treatments developed at St. Jude have helped push the overall childhood cancer survival rate from 20% to 80% since the hospital opened more than 50 years ago. St. Jude shares the discoveries it makes, and every child saved at St. Jude means doctors and scientists worldwide can use that knowledge to save thousands more children. To learn more, visit or follow St. Jude on social media at @stjuderesearch.