Cancer Data Science Pulse

Data Standards

On May 24, CBIIT welcomed Dr. Jill Barnholtz-Sloan as the new associate director for Informatics and Data Science. In this latest Q&A blog, Dr. Barnholtz-Sloan shares a little about herself, including what brought her to CBIIT, what keeps her centered, and what makes her most proud.

Did you ever wonder what goes into making data ready for analysis by researchers around the world? Introducing “Datum.” This single speck of data was conceptualized to show how NCI’s Center for Biomedical Informatics and Information Technology supports cancer research by bringing data to life.

CBIIT’s May 19 Data Science Seminar Series speaker, Dr. Kristen Naegle, took the speed of computational biology, blended it with basic science know-how, and developed an algorithm that is proving to be remarkably effective in predicting kinase activity. Understanding kinases in oncology may help identify people who are more likely to respond (or not respond) to certain medications, further advancing precision medicine.

Dr. Charles Wang offers a sneak peek at his upcoming Data Science Seminar presentation, scheduled for April 7. His recent study provides guidance for choosing an appropriate single-cell RNA sequencing (scRNA-seq) platform and software tool for a given study. Using these guidelines, scientists can select the workflow that will yield the most meaningful results.

On October 20th, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.

NCI offers a broad range of fellowships, enabling us to tap some of the brightest minds in science and technology to further advance scientific discovery. Here, Joseph Flores-Toro, Ph.D., a fellow in the Office of Data Sharing (ODS), describes the path he took in becoming a fellow for CBIIT.

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss one key component of racial inequality, the issue of health disparities, as it relates to Big Data. As noted by Dr. Kerlavage, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.

This new blog installment shines a spotlight on the staff who are working to turn data and IT resources into solutions for data-driven cancer research. This spotlight features Sherri de Coronado, program manager in the CBIIT Cancer Informatics Branch.

NCI initiatives are accumulating a wealth of data from the fields of genomics, proteomics, single-cell, radiology, molecular imaging, clinical findings, and more. The newly awarded Cancer Data Aggregator (CDA) is currently being designed and developed to allow scientists to work across these very diverse data sets, facilitating interoperability not only within the Cancer Research Data Commons but throughout the larger data ecosystem.

The quest to harmonize data has ushered in a new way of thinking about standardization. Now, rather than expecting everyone to adopt a particular model or standard, we’re seeking to leverage technology that can do some of this work for us. The DREAM Challenge was designed to make aggregating and mapping data to the correct lexicon of terms and metadata a nearly seamless step for researchers. Read more about the Challenge that’s currently underway and how we hope to address harmonization in the future.