Use Case

Findable, Accessible, Interoperable Childhood Cancer Data

Overview

When the National Cancer Institute (NCI) launched the Childhood Cancer Data Initiative (CCDI) to speed progress for children, adolescents, and young adults (AYA) with cancer, ESI answered the call. We rallied behind NCI’s aim to increase childhood and AYA cancer data sharing by developing scientific portals and pipelines to collect, harmonize, and make this rare and precious cancer data more readily available.

The Challenge

Childhood cancer was the leading cause of death by disease after infancy in the United States in 2024, yet it remains rare relative to the total number of cancer cases nationwide. As a result, the data for childhood and AYA cancer is limited. Without a centralized infrastructure, the existing data was also siloed, making it harder for researchers to find and access. NCI needed to make this critical research data more available to advance the diagnosis and treatment of childhood cancer.

In 2019, NCI launched the CCDI with the aim of establishing more efficient ways to share and use childhood cancer data. In his first blog post about the effort, former NCI Acting Director Dr. Douglas Lowy shared that “[t]he CCDI will be the first of its kind in terms of the quality, size, and scope of data accessibility, making it an immensely valuable asset for young patients with cancer, their families, clinicians, and researchers. The initiative has the potential to chart a path that could change the course of how cancer in young people is prevented, diagnosed, and treated.”

To align efforts with this vision, CCDI released 24 recommendations to bridge the gaps in childhood cancer data sharing, highlighting the need to:

  • create infrastructure to better collect data from patients.
  • make available data more findable and accessible.
  • standardize and integrate the data to show holistic progression of the disease and its treatment options.

To meet its goals, NCI needed a data ecosystem tailored specifically for childhood and AYA cancer data.

The Solution

Drawing from our experience building data commons for NCI’s Cancer Research Data Commons, ESI supported NCI CCDI in developing a series of scientific data portals and resources to help standardize childhood and AYA cancer data and make it accessible.

We supported the development, launch, and maintenance of the: 

  • CCDI Hub: This website serves as the entry point to find all the portals, tools, and resources developed by the initiative. With its user-friendly dashboard, researchers can easily explore and find where to access childhood cancer data.
  • Childhood Cancer Clinical Data Commons (C3DC): This scientific application centralizes and harmonizes data from childhood cancer clinical trials. With its supported tools, the C3DC also allows researchers to analyze data within the interface, identifying patterns across cases in synthetic cohorts and over time for longitudinal analysis.
  • Molecular Targets Platform (MTP): This scientific application helps researchers identify and prioritize therapeutic drugs. The application integrates several critical resources to tailor the information for childhood cancer research:
      • Open Targets platform (a publicly available platform to share available data on potential drug targets for disease).
      • FDA Pediatric Molecular Target Lists (two lists containing molecular targets associated with pediatric cancer growth and non-relevant drug targets).
      • Open Pediatric Cancer Project (a large, open source, multi-omic data set harmonized and analyzed by the Children’s Hospital of Philadelphia).
  • Childhood Cancer Data Catalog (CCDC): This informational website catalogs known childhood and AYA cancer research data sets, whether federally funded or not. With a robust search engine, researchers can quickly locate information about relevant data sets including the disease types, data formats, number of participants, and instructions for how to access or request access to the data.
  • CCDI Participant Index (CPI): This API service acts as a centralized index allowing researchers to connect data representing the same individual across different data sets, institutes, or systems. By cross-linking participants with a unified patient identifier, the CPI allows researchers to get a unified view of the participant’s data without compromising protected health or personally identifiable information.
  • CCDI cBioPortal Cancer Data Explorer (cBioPortal): This application provides researchers and clinicians with an intuitive platform to explore standardized genomic and clinical data. By eliminating the need for complex processing, it accelerates discovery, highlights patterns across genes and cancer types, and fosters collaboration. This accessibility drives faster, more impactful pediatric cancer research.
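
The cross-linking idea behind the CPI can be illustrated with a small sketch. This is not the CPI's actual API or data model. The class name, method names, source names, and the "IDX-" identifier format below are all hypothetical, chosen only to show how opaque local identifiers from different systems might be grouped under one index identifier without storing any personal details.

```python
from collections import defaultdict


class ParticipantIndex:
    """Hypothetical sketch: groups (source, local_id) records under one index id.

    Only opaque identifiers are stored; no health or personal data is held here.
    """

    def __init__(self):
        self._index_id = {}               # (source, local_id) -> index id
        self._members = defaultdict(set)  # index id -> set of (source, local_id)
        self._next = 1

    def link(self, *records):
        """Declare that the given (source, local_id) records are one participant."""
        # Collect index ids already assigned to any of these records.
        known = sorted({self._index_id[r] for r in records if r in self._index_id})
        if known:
            target = known[0]
        else:
            # Assumed identifier format, for illustration only.
            target = f"IDX-{self._next:06d}"
            self._next += 1
        # Merge any other existing groups into the target group.
        for other in known[1:]:
            for record in self._members.pop(other):
                self._index_id[record] = target
                self._members[target].add(record)
        for record in records:
            self._index_id[record] = target
            self._members[target].add(record)
        return target

    def lookup(self, source, local_id):
        """Return every record linked to the same participant (empty set if unknown)."""
        index_id = self._index_id.get((source, local_id))
        if index_id is None:
            return set()
        return set(self._members[index_id])


# Usage with made-up source names and identifiers.
index = ParticipantIndex()
index.link(("study_a", "P-17"), ("registry_b", "0042"))
index.link(("registry_b", "0042"), ("biobank_c", "S9"))
linked = index.lookup("study_a", "P-17")
```

In this sketch, `linked` holds all three records because the second call shares a record with the first. A real service would also need access control, auditing, and a vetted matching process, none of which is shown here.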

Throughout the development process, we used an approach that is:

  • Cloud-based: By building many of the resources on the AWS cloud, we are able to support the CCDI infrastructure as it scales. Furthermore, because the resources are on a similar architecture, we’ve created opportunities for future CCDI tools and resources to better integrate.
  • Secure: Our development and DevOps processes leverage standard microservices that build in the necessary security measures for data protection and privacy right from the beginning.
  • Customizable: We prioritized listening, understanding, and collaborating with the CCDI working group and its user community to identify the critical tools and enhancements that would help accelerate cancer research.
  • Reusable: Using Bento, our open-source modular platform, as a base for many of our resources, we were able to reuse components to speed up the development process while promoting consistent user experience across the tools.

The Results

Since the creation of CCDI’s resources, the public can reliably find and access:

  • 362 data sets cataloged in the CCDC.
  • 68,000 molecular targets captured by the MTP.
  • 1 million files of childhood cancer research data through the CCDI Hub and C3DC.

Despite the high volume of data processing, CCDI’s systems have had no major failures and have maintained 99.9% system uptime. With this rare cancer data more available, cancer researchers, citizen scientists, and data scientists can accelerate progress in the prevention, diagnosis, and treatment of childhood and AYA cancers.

Ready to Make an Impact?

Learn more about Bento
Just as with C3DC, ESI’s open-source Bento software helped quickly, securely, and reliably launch several of the scientific portals in NCI’s Cancer Research Data Commons.

Explore ESI’s custom scientific software services
ESI offers services and custom software solutions to answer your research questions.

Team Members

Yizhen Chen
Software Engineer Technical Lead
Hannah Stogsdill
Senior User Experience Engineer (Contractor)
Wei Yu
Software Engineer Technical Lead
