Cancer Data Science Pulse

Data Standards

Ever wonder what it’s like to work on a data ecosystem? Meet software engineer Ming Ying and website specialists Hannah Stogsdill and Ambar Rana as they describe what it’s like to design, develop, implement, and maintain NCI’s Integrated Canine Data Commons.

Watch our time capsule video to learn about the current status of the field and new technologies that are sure to be important as we embark on the next era of cancer data research.

Discover how NIH is working to make generalist repositories (GRs) part of the data sharing ecosystem. The goal is to minimize data sharing barriers while still taking advantage of GR convenience and usability.

This blog offers a primer on semantics, a topic that has broad implications for the biomedical informatics and data science fields. Here, Gilberto Fragoso, Ph.D., describes the structures that serve as a foundation for data science semantics. Those systems help improve data interoperability, allowing researchers to query, retrieve, and combine very different data sets for more extensive analysis.

In this latest Data Science Seminar, Jim Lacey, Ph.D., M.P.H., shares the lessons he learned in transitioning a large cancer epidemiology cohort study to the cloud, including the importance of focusing on people and processes as well as technology. Project managers, principal investigators, co-investigators, data managers, data analysts—really anyone who is part of a team that wants to use the cloud or cloud-based resources for their studies—should attend.

On November 3, Dr. Duran will present the next Data Science Seminar, “Social Determinants of Health.” This blog offers insight into Dr. Duran’s work and why this topic is important to her.

On May 24, CBIIT welcomed Dr. Jill Barnholtz-Sloan as the new associate director for Informatics and Data Science. In this latest Q&A blog, Dr. Barnholtz-Sloan shares a little about herself, including what brought her to CBIIT, what keeps her centered, and what makes her most proud.

CBIIT’s May 19 Data Science Seminar Series speaker, Dr. Kristen Naegle, took the speed of computational biology, blended it with basic science know-how, and developed an algorithm that is proving to be remarkably effective in predicting kinase activity. Understanding kinases in oncology may help identify people who are more likely to respond (or not respond) to certain medications, further advancing precision medicine.

Dr. Charles Wang offers a sneak peek at his upcoming Data Science Seminar presentation, scheduled for April 7. His recent study provides guidance for choosing an appropriate platform and software tool for an scRNA-seq study. Using these guidelines, scientists can select the workflow that will yield the most meaningful results.

On October 20, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.