Cancer Data Science Pulse

A Quick Start Guide to Cancer Data Science for Clinical Oncology

Whether you work in data science, are interested in developing computational solutions for clinical oncology, or are a clinical researcher, we’ve curated a list of data sets, tools, and learning resources to showcase how these disciplines can, and already do, work together to empower cancer research. Cancer data science can drive clinical oncology forward. For example, computer models can be developed to serve as a cancer patient’s digital twin, capturing real-time dynamics to build predictive models and, one day, guide treatment decisions. Also, national sequencing efforts could equip clinicians with the knowledge of how a patient’s genome affects their response to medications.

Explore Clinical and Biological Data Online

With so much data available, we’ve pulled together a list of links to data sets for common cancer types across some of our NCI Cancer Research Data Commons (CRDC) resources. Collectively, these resources offer access to more than one million files of experimental and clinical data from landmark NCI studies and other grantee projects. Clinical attributes are as follows:

  • Demographics
  • Diagnoses
  • Treatments
  • Environmental Exposure

You can explore high-level trends in these data sets directly from the online portals or analyze the data with the library of robust computational tools provided through NCI’s Cloud Resources.

Cancer Site   Genomic Data Commons          Proteomic Data Commons          Imaging Data Commons
              (Genomic and clinical data)   (Proteomic and clinical data)   (Medical imaging, digital pathology, and clinical annotations)
Lung          1,251 Cases                   342 Cases                       4,728 Cases
Breast        1,251 Cases                   247 Cases                       12,587 Cases
Colorectal    639 Cases                     194 Cases                       1,662 Cases
Kidney        792 Cases                     119 Cases                       1,373 Cases
Pancreas      311 Cases                     144 Cases                       481 Cases
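As a starting point for exploring these numbers offline, here is a minimal Python sketch that tabulates the case counts listed above. It only uses the figures from this post; the names `RESOURCES`, `CASES`, `totals_by_resource`, and `largest_resource` are illustrative assumptions and are not part of any CRDC tool or API.

```python
# Case counts copied from the listing above, stored per cancer site in the
# order (Genomic, Proteomic, Imaging). All names here are illustrative only.
RESOURCES = ("Genomic Data Commons", "Proteomic Data Commons", "Imaging Data Commons")

CASES = {
    "Lung": (1251, 342, 4728),
    "Breast": (1251, 247, 12587),
    "Colorectal": (639, 194, 1662),
    "Kidney": (792, 119, 1373),
    "Pancreas": (311, 144, 481),
}


def totals_by_resource(cases):
    """Sum the listed case counts for each resource across all cancer sites."""
    return {
        name: sum(counts[i] for counts in cases.values())
        for i, name in enumerate(RESOURCES)
    }


def largest_resource(cases, site):
    """Return the resource with the most listed cases for one cancer site."""
    counts = cases[site]
    return RESOURCES[counts.index(max(counts))]


if __name__ == "__main__":
    for name, total in totals_by_resource(CASES).items():
        print(f"{name}: {total:,} cases")
    print("Largest for Pancreas:", largest_resource(CASES, "Pancreas"))
```

These totals cover only the five cancer sites listed here, and the same patient may appear in more than one resource, so they should not be read as counts of distinct patients.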

Having trouble understanding what a particular term means? Our semantics resources and services provide definitions of common data elements and standard data ontologies like CDISC.

Find Bioinformatics Tools for Predictive Oncology

In addition to what the CRDC offers, there are many analytical tools and pipelines to help you mine and extract meaningful insights from clinical data. We’ve added a few selections from our partners focused on supporting predictive and precision oncology analysis, but you can find more tools through the “Resources for Researchers” search engine.

  • Accelerating Therapeutics for Opportunities in Medicine (ATOM) Consortium: This public-private consortium has developed the ATOM Modeling PipeLine (AMPL), open-source, free-to-use software for building and sharing models that advance in silico drug discovery.
  • NCI-Department of Energy (DOE) Collaboration: One of the pilot studies from this cross-agency collaboration between NCI and DOE developed predictive artificial intelligence (AI) and machine learning models of drug responses in pre-clinical models of cancer to improve and expedite the selection and development of new targeted therapies. The collaboration’s tools are available online through its capabilities catalog.
  • Informatics Technology for Cancer Research (ITCR): This trans-NCI program supports investigator-initiated and research-driven informatics tool development. The ITCR portfolio shares 15 resources for clinical research, including tools for integrating and analyzing electronic medical records, databases cataloging clinically actionable information for personalized cancer therapy, and education forums. Links to the tools and their tutorials are available online through the ITCR tools catalog. For more information about the opportunities through its new training program, read our recent blog post.

Learn About Cancer Data Science in Precision Oncology

If you’re new to the world of cancer data science and its application to clinical research, check out these introductory blog posts covering some of the basics and examples of innovative work made possible through the intersection of these disciplines. For regular updates on NCI’s cancer data science efforts, training events, and blog posts, subscribe to our weekly RSS feed.


Top 5 Data Technologies

Find out what these top technology buzzwords mean and how they are being applied now in the field of cancer research.



An Introduction to Cloud Computing for Cancer Research

Get a bird’s-eye view of cloud computing and its application to cancer research, including tips for managing costs, access, and training to help advance precision medicine.



Semantics Primer

Learn how to apply critical semantics concepts to make your research more findable, accessible, interoperable, and reusable.



Wrangling Data for Microbiome Research—Focus on QIIME 2

Read about a key NCI-supported bioinformatics tool called QIIME 2, which is helping us better understand the microbiome and its impact on disease.



Using Bioinformatics to Solve the Neoantigen Puzzle

Dr. Malachi Griffith, an associate professor of oncology and genetics, shares how his tinkering with computers, bioinformatics, and genomics is helping him understand the complexities of this promising research area. If successful, neoantigen-based cancer therapies could prove to be the pinnacle of personalized medicine.



Blending Weather Forecasting with Team Science Leads to Advances in Cancer Immunotherapy

Dr. Elana J. Fertig, an associate professor of oncology, biomedical engineering, and applied mathematics/statistics, describes how she is using AI, blended with spatial and single-cell technologies, to better understand how cancer will respond to treatment. Predicting the changes that occur in the tumor during treatment may someday enable us to select therapies in advance, essentially stopping the disease in its tracks before it reaches the next stage in its evolution.


Are we missing resources you would want to see? We may already have them. Leave a comment and we’ll follow up with additional information.



Thank you for letting me know that computer models are seen to guide treatment decisions one day. My friend wants her cancer treatment to be effective. I think it's best for her to seek guidance from an oncology specialist.
We’re glad you found the information helpful. Data science contributes greatly to the field of cancer research, and if you’d like to learn more, please check out our Cancer Data Science Pulse blog, where we frequently explore this topic further.
I read your blog. I found it very informative. I am a big fan of your blogs. I feel the blog aligns perfectly with our services. We provide data science courses with real-work experience, which are ideal for those who wish to make a career transition or start a fresh career path in data science, along with a 100% job assurance commitment. Visit our website, Data Science Course in Pune. These courses are wonderful for professionals.
Thank you for your interest in our blogs, we’re glad you’re a fan. Data scientists are integral to the future of cancer research and precision medicine!
"Cancer Data Science is playing an increasingly vital role in the field of Clinical Oncology, providing new insights and opportunities for precision medicine. With the use of advanced analytical methods and large amounts of data, we can now better understand the underlying biology of cancer, identify novel therapeutic targets and predict patient outcomes, ultimately leading to improved patient care and outcomes. We are excited about the potential for Cancer Data Science to revolutionize the way we diagnose and treat cancer."

We’re excited about how cancer data science is advancing the ways we treat and diagnose cancer too! We look forward to continuing to provide our readers with helpful resources and current news about the field of data science for cancer research. Be sure to check out our other blogs too!
Thanks for sharing the details of this website
We’re glad it’s been helpful for you!