Cancer Data Science Pulse

The Cancer Data Science Pulse blog provides insights on trends, policies, initiatives, and innovation in the data science and cancer research communities from professionals dedicated to building a national cancer data ecosystem that enables new discoveries and reduces the burden of cancer.

Did you ever wonder what goes into making data ready for analysis by researchers around the world? In this video blog, meet "Datum," a single speck of genomic data chronicling how NCI supports cancer research by bringing data to life.

Federated learning (FL) might well be the next paradigm shift in democratizing access to vast amounts of data from many sources for use in cancer research. FL offers a decentralized but collective approach to using data to better understand cancer.

We’re continuing our blog series on data visualizations with a look at circular heatmaps and biplots. This blog features images and tips from Drs. Arashdeep Singh and Sridhar Hannenhalli of NCI’s Center for Cancer Research.

Are you researching genomic abnormalities? Bioinformatician Deena Bleich gives an overview of the online tool, “Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer,” and how it can help you analyze genomic data.

In this blog, we’re spotlighting how researchers can leverage FireCloud, one of NCI’s Cloud Resources, for accessing data, running analyses, and collaborating with others in the cancer research community.

Read the blogs that topped our charts in 2022 and see if your favorite made #1!

Interested in a data science fellowship? Whether you’re considering data science as a career, you’re a new graduate, or you’ve been working in academia or industry for many years, NCI has fellowship opportunities to help you grow your career.

Learn more about new streamlined access to broad-use data sets within the database of Genotypes and Phenotypes (dbGaP).

In honor of National Lung Cancer Awareness Month, we’re highlighting the “data deets” (details) for the National Lung Screening Trial, a large-scale effort that collected imaging data for more than 53,000 heavy smokers. In this blog, we’ll cover the research behind the data, specific metrics about the data set, how to access it, and some of the exciting data science projects using the data.

Ever wonder what it’s like to work on a data ecosystem? Meet software engineer Ming Ying and website specialists Hannah Stogsdill and Ambar Rana as they describe what it’s like to design, develop, implement, and maintain NCI’s Integrated Canine Data Commons.