National Cancer Institute Center for Biomedical Informatics and Information Technology
  • Data Sharing
    • Policy Guidance
    • Submitting Data
    • Accessing Data
    • Genomic Data Sharing
      • About the Genomic Data Sharing (GDS) Policy
      • Key Documents
      • Preparing Genomic Data
      • Extramural Grantees
      • Non-NCI Funded Investigators
      • Intramural Investigators
      • Accessing Genomic Data
      • Genomic Data Sharing Policy Contact Information
  • Collaborations
    • APOLLO Network
    • ARPA-H BDF Toolbox
    • Cancer Research Data Commons
    • Childhood Cancer Data Initiative
    • CIMAC-CIDC Network
    • Clinical Trials Reporting Program
    • Federated Learning
    • NCI-Department of Energy Collaboration
    • NCI-Molecular Analysis for Therapy Choice Trial Network
    • Real-World Data
    • U.S.-EU Artificial Intelligence Administrative Arrangement
  • Resources
    • NCI Data Catalog
    • Cancer Vocabulary
      • CDISC Terminology
      • FDA Terminology
      • NCPDP Terminology
      • Pediatric Terminology
    • Metadata for Cancer Research
    • Informatics Technology for Cancer Research (ITCR) Tools
  • Training
    • Learn About Cancer Data Science
      • Generating and Collecting Data
      • Cleaning Data
      • Exploring and Analyzing Data
      • Predictive Modeling
      • Visualizing Data
      • Sharing Data
    • Improve My Data Science Skills
      • Cancer Data Science Course
      • Training Guide Library
  • News & Events
    • Cancer Data Science Pulse Blog
    • News
    • Events
      • Data Science Seminar Series
    • Jobs and Fellowships
  • Funding
  • About
    • Contact CBIIT
    • Organization
    • CBIIT Director
    • NCI CIO
    • Staff Directory
    • Application Support
Illustration of the six stages of the cancer data science lifecycle, arranged in a cycle that reads clockwise from the top. The first stage is "data generation and collection": a researcher identifies and gathers the data needed to address a problem. The second stage is "data cleaning": the researcher fixes discrepancies and handles missing values in the data. The third stage is "data exploration and analysis": the researcher studies the data, then forms a hypothesis. The fourth stage is "predictive modeling": the researcher uses computational tools such as machine learning models to make predictions from the data. The fifth stage is "data visualization": the researcher communicates findings using interactive images, plots, and charts. The sixth and final stage is "data sharing": the researcher can accelerate discovery by making data available to others.

Follow Us on LinkedIn

Find training resources, opportunities to collaborate, advice from NCI data science experts, and ways to network and engage with the cancer data science community on our NCI Cancer Data Science page!

Join Our Community

Access our library of previously recorded Seminar Series webinars.

U.S. Department of Health and Human Services National Institutes of Health National Cancer Institute USA.gov
NIH … Turning Discovery Into Health®