Cancer Data Science Pulse
A Quick Start Guide to Cancer Data Science for Clinical Oncology
Whether you work in data science, are interested in developing computational solutions for clinical oncology, or are a clinical researcher, we’ve curated a list of data sets, tools, and learning resources to showcase how these disciplines can work, and already are working, together to empower cancer research. Cancer data science can drive clinical oncology forward. For example, computer models can be developed to serve as a cancer patient’s digital twin, capturing real-time dynamics to create predictive models and, one day, guide treatment decisions. Also, national sequencing efforts could equip clinicians with the knowledge of how a patient’s genome impacts their response to medications.
Explore Clinical and Biological Data Online
With so much data available, we’ve pulled together a list of links to data sets for common cancer types across some of our NCI Cancer Research Data Commons (CRDC) resources. Collectively, these resources offer access to more than one million files of experimental and clinical data from landmark NCI studies and other grantee projects. Clinical attributes are as follows:
- Environmental Exposure
You can explore high-level trends in these data sets directly from the online portals or analyze the data with the library of robust computational tools provided through NCI’s Cloud Resources.
|Cancer Site||Genomic Data Commons (Genomic and clinical data)||Proteomic Data Commons (Proteomic and clinical data)||Imaging Data Commons (Medical imaging, digital pathology, and clinical annotations)|
|Lung||1,251 Cases||342 Cases||4,728 Cases|
|Breast||1,251 Cases||247 Cases||12,587 Cases|
|Colorectal||639 Cases||194 Cases||1,662 Cases|
|Kidney||792 Cases||119 Cases||1,373 Cases|
|Pancreas||311 Cases||144 Cases||481 Cases|
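Case counts like those in the table can also be checked programmatically. The following is a minimal sketch, assuming the public Genomic Data Commons API at `https://api.gdc.cancer.gov/cases` and its `filters`, `size`, and `format` query parameters. The helper names `build_case_count_url` and `fetch_case_count` are hypothetical, and the site label `"Kidney"` is an assumption: the portal may group or name primary sites differently from the table, so the number returned will not necessarily match the figures above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public GDC API endpoint for case records.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_case_count_url(primary_site: str) -> str:
    """Return a GDC API URL that filters cases by primary site.

    size=0 requests no case records, only the pagination summary,
    which carries the total number of matching cases.
    """
    filters = {
        "op": "in",
        "content": {"field": "primary_site", "value": [primary_site]},
    }
    params = {"filters": json.dumps(filters), "size": "0", "format": "json"}
    return f"{GDC_CASES_ENDPOINT}?{urlencode(params)}"


def fetch_case_count(primary_site: str, timeout: float = 30.0) -> int:
    """Query the API (network access required) and return the case total."""
    with urlopen(build_case_count_url(primary_site), timeout=timeout) as response:
        payload = json.load(response)
    return payload["data"]["pagination"]["total"]


if __name__ == "__main__":
    # The site label is an assumption; confirm exact values in the portal.
    print(fetch_case_count("Kidney"))
```

Building the URL separately from the network call keeps the query easy to inspect before anything is sent, and lets the same filter be pasted into a browser to compare against the portal.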
Find Bioinformatics Tools for Predictive Oncology
In addition to what the CRDC offers, there are many analytical tools and pipelines to help you mine and extract meaningful insights from clinical data. We’ve added a few selections from our partners focused on supporting predictive and precision oncology analysis, but you can find more tools through the “Resources for Researchers” search engine.
- Accelerating Therapeutics for Opportunities in Medicine (ATOM) Consortium: This public-private consortium has developed the ATOM Modeling PipeLine (AMPL), open-source, free-to-use software for building and sharing models that advance in silico drug discovery.
- Joint Design of Advanced Computing Solutions for Cancer (JDACS4C): One of the pilot studies from this cross-agency collaboration between NCI and the Department of Energy developed predictive artificial intelligence (AI) and machine learning models of drug responses in pre-clinical models of cancer to improve and expedite the selection and development of new targeted therapies. The collaboration’s tools are available online through its capabilities catalog.
- Informatics Technology for Cancer Research (ITCR): This trans-NCI program supports investigator-initiated and research-driven informatics tool development. The ITCR portfolio shares 15 resources for clinical research, including tools for integrating and analyzing electronic medical records, databases cataloging clinically actionable information for personalized cancer therapy, and education forums. Links to the tools and their tutorials are available online through the program’s tools catalog. For more information about the opportunities through its new training program, read our recent blog post.
Learn About Cancer Data Science in Precision Oncology
If you’re new to the world of cancer data science and its application to clinical research, check out these introductory blog posts covering some of the basics and examples of innovative work made possible through the intersection of these disciplines. For regular updates on NCI’s cancer data science efforts, training events, and blog, subscribe to our weekly RSS.
Top 5 Data Technologies
Find out what these top technology buzzwords mean and how they are being applied now in the field of cancer research.
An Introduction to Cloud Computing for Cancer Research
Get a bird’s-eye view of cloud computing and its application to cancer research, including tips for managing costs, access, and training to help advance precision medicine and cancer research.
Learn how to apply critical semantics concepts to make your research more findable, accessible, interoperable, and reusable.
Wrangling Data for Microbiome Research—Focus on QIIME 2
Read about a key NCI-supported bioinformatics tool called QIIME 2, which is helping us better understand the microbiome and its impact on disease.
Using Bioinformatics to Solve the Neoantigen Puzzle
Dr. Malachi Griffith, an associate professor of oncology and genetics, shares how his tinkering with computers, bioinformatics, and genomics is helping him understand the complexities of this promising research area. If successful, neoantigen-based cancer therapies could prove to be the pinnacle of personalized medicine.
Blending Weather Forecasting with Team Science Leads to Advances in Cancer Immunotherapy
Dr. Elana J. Fertig, an associate professor of oncology, biomedical engineering, and applied mathematics/statistics, describes how she is using AI, blended with spatial and single-cell technologies, to better understand how cancer will respond to treatment. Predicting the changes that occur in the tumor during treatment may someday enable us to select therapies in advance, essentially stopping the disease in its tracks before it reaches the next stage in its evolution.
Are we missing resources you would want to see? We may already have them. Leave a comment and we’ll follow up with additional information.