Leveraging computational methods to understand, define, and predict the recurrent landscape of pediatric solid tumors
Significant progress has been made in the treatment of pediatric cancer, but outcomes have improved much more dramatically for hematological malignancies than for solid tumors. One of the most challenging aspects of pediatric solid tumors is their propensity to relapse after initial therapy, and the poor clinical outcomes that follow relapse. Our lab is interested in determining what causes relapse and what we can do to prevent or delay it. We develop and apply computational tools and machine learning to decipher the underlying causes of relapse and to identify potential vulnerabilities.
Understanding the driving forces behind pediatric solid tumor relapse is essential to improving therapies and outcomes. We are particularly interested in understanding whether there is a specific subpopulation of tumor cells responsible for relapse and what controls their behavior. Using whole-genome sequencing, we have already found that minor subpopulations of malignant cells often harbor few mutations compared to the bulk tumor, and that these populations seem to survive chemotherapy and expand into mega-clones upon relapse. We use 10x single-cell RNA-seq technology to interrogate these subpopulations. By adapting techniques from information retrieval to single-cell analysis, we can estimate and cluster subpopulations accurately, which allows us to reveal biological cellular states.
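To make the information-retrieval analogy concrete, here is a minimal sketch that treats each cell as a "document" and each gene as a "term". The text above does not say which retrieval technique is adapted, so the TF-IDF weighting and cosine similarity used here are assumptions chosen for illustration, not a description of our method. The function names and the toy count matrix are likewise hypothetical.

```python
import math

def tfidf(counts):
    """Weight a cells x genes count matrix (list of lists) by TF-IDF.

    Assumption for illustration: cells play the role of documents and
    genes the role of terms, as in text retrieval.
    """
    n_cells = len(counts)
    n_genes = len(counts[0])
    # Document frequency: number of cells in which each gene is detected.
    df = [sum(1 for cell in counts if cell[g] > 0) for g in range(n_genes)]
    # Smoothed inverse document frequency; genes seen in every cell get weight 1.
    idf = [math.log((1 + n_cells) / (1 + d)) + 1.0 for d in df]
    weighted = []
    for cell in counts:
        total = sum(cell)
        # Term frequency: each gene's share of the cell's total counts.
        weighted.append([(c / total if total else 0.0) * w
                         for c, w in zip(cell, idf)])
    return weighted

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy data: three cells, four genes.
toy_counts = [
    [10, 0, 5, 1],  # cell 0
    [8, 0, 6, 1],   # cell 1: expression profile close to cell 0
    [0, 9, 5, 1],   # cell 2: dominated by a different gene
]
weights = tfidf(toy_counts)
sim_01 = cosine(weights[0], weights[1])
sim_02 = cosine(weights[0], weights[2])
```

In this toy example, cells 0 and 1 come out more similar to each other than cells 0 and 2, which is the kind of signal a clustering step could then use to group cells into subpopulations.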
Translating our findings into practice will require a standard clinical measurement that identifies which patients are more likely to relapse. Our collaborative work suggests that immature cell subpopulations are responsible for relapse, and that patients with these subpopulations are at greater risk of disease recurrence. So, how do we detect and describe these populations? Single-cell approaches are powerful, but not scalable to individual patients. Our group has developed a method that uses clinically available DNA methylation data and a deep-learning approach to infer gene promoter activity and expression levels across the transcriptome. Ultimately, we hope to develop novel biomarkers based on these methylation patterns to inform clinical applications.
DNA methylation is an important mechanism for establishing a cell’s developmental identity. We developed a recursive partitioning approach that is robust to technical artifacts from single-nucleotide polymorphisms at CpG dinucleotides (CpG-SNPs) and is a powerful method for revealing co-methylation blocks from whole-genome bisulfite sequencing data.
Whole genome sequencing (WGS) is increasingly used in both research and clinical settings. The Variant Call Format (VCF) specification is a widely adopted file format for genetic variation data.
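For readers unfamiliar with the format, a VCF file is tab-delimited text: header lines start with `#`, and each data line begins with eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The following is a minimal sketch of reading those fixed columns using only the Python standard library. The function names and the example record are hypothetical, and the sketch ignores per-sample genotype columns and the value escaping that a production parser would need to handle.

```python
def parse_vcf_record(line):
    """Split one VCF data line into its eight fixed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    if info != ".":
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            info_dict[key] = value if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),                              # 1-based position
        "id": None if vid == "." else vid,            # "." means missing
        "ref": ref,
        "alt": [] if alt == "." else alt.split(","),  # comma-separated alleles
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }

def read_vcf(lines):
    """Yield parsed records, skipping header (#) and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_record(line)

# Hypothetical example input: two header lines and one variant record.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t12345\t.\tA\tG,T\t50.0\tPASS\tDP=30;SOMATIC\n",
]
records = list(read_vcf(example))
```

The same generator also accepts an open file handle, since iterating over a text file yields its lines one at a time.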
Xiang Chen, PhD
Associate Member, Computational Biology
Department of Computational Biology
MS 1135, Room IA6043
St. Jude Children's Research Hospital