Cancer Data Science Pulse

The Cancer Data Science Pulse blog provides insights on trends, policies, initiatives, and innovation in the data science and cancer research communities from professionals dedicated to building a national cancer data ecosystem that enables new discoveries and reduces the burden of cancer.

Our newest “Spotlight” features Jennifer Kwok, program manager in CBIIT’s Infrastructure and Information Technology Operations Branch. Much of her work centers on developing IT solutions for NCI staff and organizations to streamline and optimize their business functions and processes.

“Count Me In” (CMI) is a unique project that gives patients an opportunity to share their cancer-related data directly with scientists. According to Corrie Painter, associate director of CMI, this is a largely untapped but vital part of data science. Here she describes the project and what it could mean for future research efforts.

On October 20th, NCI launched the Imaging Data Commons (IDC), the latest data repository to be offered within the Cancer Research Data Commons (CRDC) infrastructure. Through the IDC, both researchers and clinicians will have access to a wide range of cancer-related images, including radiology and pathology imaging data, as well as their accompanying metadata.

NCI offers a broad range of fellowships, enabling us to tap some of the brightest minds in science and technology to further advance scientific discovery. Here, Joseph Flores-Toro, Ph.D., a fellow in the Office of Data Sharing (ODS), describes the path he took in becoming a fellow for CBIIT.

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss one key component of racial inequality, the issue of health disparities, as it relates to Big Data. As noted by Dr. Kerlavage, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.

Naturally occurring cancers in dogs share similarities with cancer that occurs in humans. The Integrated Canine Data Commons (ICDC), a cloud-based repository of canine cancer data, includes a variety of molecular, clinical, pharmacological, and medical imaging information from pet dogs. Such comparative oncology findings offer researchers greater insight into how best to diagnose, treat, and prevent cancer—in both people and pets.

The explosion of genetic information and direct access to large-scale genomic data not only opens up new areas for exploring today's most pressing research questions but also serves as a reminder of the importance of collaboration at every stage of a study. NCI’s Dr. Daoud Meerzaman describes a new "circular" way of collaborating that keeps everyone in the loop when devising new genomics studies.

This new blog installment shines a spotlight on the staff who are working to turn data and IT resources into solutions for data-driven cancer research. This spotlight features Sherri de Coronado, program manager in the CBIIT Cancer Informatics Branch.

The use of bioinformatics in cancer research helps organize vast amounts of data to allow researchers to identify information, trends, and correlations, ranging from very large populations of people all the way down to a single molecule. NCI's Dr. Daoud Meerzaman describes how data science is helping pinpoint the contributions of a single protein, bromodomain protein 4 (BRD4), and its effects on the development of tumors and progression of cancer.

NCI initiatives are accumulating a wealth of data from the fields of genomics, proteomics, single-cell analysis, radiology, molecular imaging, clinical findings, and more. The newly awarded Cancer Data Aggregator (CDA) is being designed and developed to let scientists work across these very diverse data sets, facilitating interoperability not only within the Cancer Research Data Commons but throughout the larger data ecosystem.