Cancer Data Science Pulse

The Cancer Data Science Pulse blog provides insights on trends, policies, initiatives, and innovation in the data science and cancer research communities from professionals dedicated to building a national cancer data ecosystem that enables new discoveries and reduces the burden of cancer.

This is the first of a series of posts that discuss the pilot collaborative program “Joint Design of Advanced Computing Solutions for Cancer (JDACS4C)” being pursued by the National Cancer Institute (NCI) and the Department of Energy (DOE).

In 2016, a Blue Ribbon Panel (BRP) was established, as part of the Beau Biden Cancer Moonshot, to make key recommendations that would support the Moonshot goals of accelerating progress in cancer research and breaking down barriers to developing new treatments.

In the past year, the use of Artificial Intelligence (AI) in radiology, also called "radiomics," has been getting a lot of attention, mainly because Deep Learning (DL) has progressed from sub-human performance to performance that is similar, or in some cases superior, to that of humans.

Now is the time for researchers across domains to ideate together, share data, and maximize the utility of those data. This is "the urgency of now" according to former Vice President Joe Biden, who delivered the keynote address to those in attendance at the September 2017 Human Proteome Organization (HUPO) Annual World Congress.

The data science community is awash with "FAIRness." In the past few years, there has been an emerging consensus that scientific data should be archived in open repositories, and that the data should be Findable, Accessible, Interoperable, and Reusable.

I recently joined NCI to help support strategic data sharing and informatics projects within the Center for Biomedical Informatics and Information Technology (CBIIT). Having worked on information management at another Institute for five years and on the trans-NIH Big Data to Knowledge (BD2K) initiative since its inception, I see this as an exciting opportunity to continue to help enhance data science across the biomedical community.

Biomedical research is evolving with an increasing emphasis on data science (e.g., data integration and storage, data privacy and security, and data analytics and representation), driven by the transformative technologies that have become the currency of genomics in precision medicine. Despite numerous "beachhead" successes, however, the gap between data and clinical utility continues to grow.

In recent years, genomics has been described as a big data science on par with the likes of Twitter, YouTube, and the scientific pursuit of understanding the universe.

Precision medicine has quickly moved to the forefront of clinical research and practice, and is particularly pertinent to cancer since cancer is a disease of the genome. The need to accelerate discovery in cancer research has been further propelled by the Beau Biden Cancer Moonshot, challenging the community to make a decade's worth of progress in five years.

Recent weeks have been momentous as the high-performance computing (HPC) community embraced the challenge of precision medicine. The theme of this year's leading international supercomputing conference, SC16, was "HPC Matters," and it was evident that HPC matters to precision medicine and that precision medicine matters to the HPC community.