Cancer Data Science Pulse

Data Sets

In this latest Data Science Seminar, Jim Lacey, Ph.D., M.P.H., shares the lessons he learned in transitioning a large cancer epidemiology cohort study to the cloud, including the importance of focusing on people and processes as well as technology. Project managers, principal investigators, co-investigators, data managers, data analysts—really anyone who is part of a team that wants to use the cloud or cloud-based resources for their studies—should attend.

The diversity, complexity, and distribution of data sets present an ongoing challenge to cancer researchers looking to perform advanced analyses. Here we describe the Cancer Genomics Cloud, powered by Seven Bridges, an NCI Cloud Resource that’s helping to bring together data and computational power to further advance cancer research and discovery.

On Wednesday, September 22, 2021, Yanjun Qi, Ph.D., from the University of Virginia, will present “AttentiveChrome: Deep Learning for Predicting Gene Expression from Histone Modifications,” to kick off the Fall Data Science Seminar Series. This blog offers insight into Dr. Qi’s research and why this topic is important to her.

To the NCI Cancer Research Data Commons, cloud computing means three words: NCI Cloud Resources. These are real-world examples of making data accessible and available to all cancer researchers. Kicking off a four-part blog series, the NCI Cloud Resources share their origin story and the problems that cloud computing could solve in cancer research.

On October 20th, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.