Cancer Data Science Pulse

Data Standards

This blog offers a primer on semantics, a topic that has broad implications for the biomedical informatics and data science fields. Here, Gilberto Fragoso, Ph.D., describes the structures that serve as a foundation for data science semantics. Those systems help improve data interoperability, allowing researchers to query, retrieve, and combine very different data sets for more extensive analysis.

In this latest Data Science Seminar, Jim Lacey, Ph.D., M.P.H., shares the lessons he learned in transitioning a large cancer epidemiology cohort study to the cloud, including the importance of focusing on people and processes as well as technology. Project managers, principal investigators, co-investigators, data managers, data analysts—really anyone who is part of a team that wants to use the cloud or cloud-based resources for their studies—should attend.

On November 3, Dr. Duran will present the next Data Science Seminar, “Social Determinants of Health.” This blog offers insight into Dr. Duran’s work and why this topic is important to her.

On May 24, CBIIT welcomed Dr. Jill Barnholtz-Sloan as the new associate director for Informatics and Data Science. In this latest Q&A blog, Dr. Barnholtz-Sloan shares a little about herself, including what brought her to CBIIT, what keeps her centered, and what makes her most proud.

Did you ever wonder what goes into making data ready for analysis by researchers around the world? Introducing “Datum.” This single speck of data was conceptualized to show how NCI’s Center for Biomedical Informatics and Information Technology supports cancer research by bringing data to life.

CBIIT’s May 19 Data Science Seminar Series speaker, Dr. Kristen Naegle, took the speed of computational biology, blended it with basic science know-how, and developed an algorithm that is proving to be remarkably effective in predicting kinase activity. Understanding kinases in oncology may help identify people who are more likely to respond (or not respond) to certain medications, further advancing precision medicine.

Dr. Charles Wang offers a sneak peek at his upcoming Data Science Seminar presentation, scheduled for April 7. His recent study provides guidance for choosing an appropriate single-cell RNA sequencing (scRNA-seq) platform and software tool for a given study. Using these guidelines, scientists can select the workflow that will yield the most meaningful results.

On October 20, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.

NCI offers a broad range of fellowships, enabling us to tap some of the brightest minds in science and technology to further advance scientific discovery. Here, Joseph Flores-Toro, Ph.D., a fellow in the Office of Data Sharing (ODS), describes the path he took in becoming a fellow for CBIIT.

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss one key component of racial inequality, the issue of health disparities, as it relates to Big Data. As Dr. Kerlavage notes, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.