Cancer Data Science Pulse
Population Level Pilot: Population Information Integration, Analysis, and Modeling for Precision Surveillance
NCI continues to identify external data sources and link them with SEER data, extending the longitudinal record so that it can form patient trajectories and support modeling efforts. To inform the incorporation of those additional sources, NCI compiled an extensive breast cancer recurrence data dictionary that identifies recurrence-related data elements across multiple sources, including pathology, radiology, pharmacy, biomarkers, procedures, comorbidities, patient-generated information, and radiation oncology. The Population Level Pilot Team is also collaborating with clinical experts to construct research agendas that will be used to create disease-specific use cases for scalable predictive modeling and analytics across a variety of integrated datasets.
The NCI-DOE Population Level Pilot collaboration will enhance the SEER Program through automated data abstraction, linkage and visualization of data sources, and predictive modeling. Preliminary results indicate that deep learning methods can automate the extraction of selected registry data elements, which will enhance cancer surveillance efforts and reduce registry workload. Additionally, modeling longitudinal patient trajectories will allow researchers to review initial breast cancer treatments and examine novel patterns. This application of data-driven modeling seeks to answer key clinical oncology questions and to strengthen clinician decision-making tools, which can improve patient treatment selection and health outcomes.