Cancer Data Science Pulse

Data Commons

To the NCI Cancer Research Data Commons, cloud computing means three words: NCI Cloud Resources. These are real-world examples of making data accessible and available to all cancer researchers. Kicking off the first of a four-part blog series, the NCI Cloud Resources share their origin story and the problems that cloud computing could solve in cancer research.

On May 24, CBIIT welcomed Dr. Jill Barnholtz-Sloan as the new associate director for Informatics and Data Science. In this latest Q&A blog, Dr. Barnholtz-Sloan shares a little about herself, including what brought her to CBIIT, what keeps her centered, and what makes her most proud.

Did you ever wonder what goes into making data ready for analysis by researchers around the world? Introducing “Datum.” This single speck of data was conceptualized to show how NCI’s Center for Biomedical Informatics and Information Technology supports cancer research by bringing data to life.

“Count Me In” (CMI) is a unique project that gives patients an opportunity to share their cancer-related data directly with scientists. According to Corrie Painter, associate director of CMI, patient-shared data is a largely untapped but vital part of data science. Here she describes the project and what it could mean for future research efforts.

On October 20th, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss one key component of racial inequality, the issue of health disparities, as it relates to Big Data. As noted by Dr. Kerlavage, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.

Naturally occurring cancers in dogs share similarities with cancer that occurs in humans. The Integrated Canine Data Commons (ICDC), a cloud-based repository of canine cancer data, includes a variety of molecular, clinical, pharmacological, and medical imaging information from pet dogs. Such comparative oncology findings offer researchers greater insight into how best to diagnose, treat, and prevent cancer—in both people and pets.

This new blog installment shines a spotlight on the staff who are working to turn data and IT resources into solutions for data-driven cancer research. This spotlight features Sherri de Coronado, program manager in the CBIIT Cancer Informatics Branch.

NCI initiatives are accumulating a wealth of data from the fields of genomics, proteomics, single-cell, radiology, molecular imaging, clinical findings, and more. The newly awarded Cancer Data Aggregator (CDA) is currently being designed and developed to allow crosstalk among these very diverse data sets, facilitating interoperability not only within the Cancer Research Data Commons but throughout the larger data ecosystem.

Pooling data from numerous sources strengthens the power of the information, but only if it can be meaningfully connected. Dr. Melissa Haendel, Director of the Translational and Integrative Sciences Laboratory, Oregon State University (OSU), and Principal Investigator for the NCI Center for Cancer Data Harmonization, and Julie McMurry, Associate Director of the Translational and Integrative Sciences Laboratory, OSU, describe the basics of harmonization and how it can help in wrangling massive amounts of data to make them more valuable to research.