Cancer Data Science Pulse
A Unique Opportunity, a Profound Responsibility
During my career, I’ve held several positions in the public, non-profit, and private sectors. I worked on the team that sequenced the human genome, studied protein structure-function relationships as a bench scientist, and managed large bioinformatics and professional service organizations. I’m very proud of the work I’ve accomplished in each.
But my new position as the director of NCI CBIIT is the one in which I believe I can have the most impact, and which carries with it the most opportunity and the most significant responsibility.
In a recent blog post about the Childhood Cancer Data Initiative, Doug Lowy refers to having “an opportunity and a responsibility to use this moment” to learn from every child with cancer. I couldn’t agree more, and as Doug and I have discussed, we need to extend that goal to learn from all cancer patients. What this means from a pragmatic perspective is that the data from all cancer patients, whether being treated in a Cancer Center, a community clinic, or as part of a clinical trial, must be available to researchers for analysis to generate new discoveries. This is no easy task, but NCI is in a great position to lead in this area.
We have made tremendous progress in cancer detection and treatment in recent decades – the overall death rate from cancer fell by 26% from 1991 to 2015. But too many people are still getting, and dying from, cancer. About 38% of Americans will get a cancer diagnosis in their lifetime. There are over 1.7 million newly diagnosed cases of cancer each year in the U.S., and over 600,000 people die annually of the disease. Annual expenditures for cancer care are well over $147 billion. These statistics don’t convey the human suffering of the patients struggling with chemotherapy and its side effects, the emotional toll on families supporting their loved ones, and the pain of losing someone to cancer. In addition, by 2026, there will be over 20 million cancer survivors in the U.S. How do we ensure they remain survivors and eliminate the chances their disease will recur?
As the largest organization in the world dedicated to cancer research, NCI has a responsibility to use every means at its disposal to accelerate progress in cancer treatment. NIH is the largest public funder of biomedical research in the world – investing $39 billion every year to prevent and cure disease and improve the lives of patients. Of that, $6 billion is NCI’s budget, with 70% going directly to fund research. As director of CBIIT, one of my responsibilities is ensuring that taxpayer dollars are having the biggest impact possible, and that we fulfill the CBIIT mission to accelerate cancer research by empowering scientists and envisioning their informatics, data science, and IT needs in the future.
In the past ten years, we have seen a convergence of technical and scientific innovations that set the stage for the kind of data sharing and analysis we need to accelerate progress in cancer. New research tools and advances in information technology have made it possible to conduct cancer research in ways not imaginable a decade ago, and data are being generated at an unprecedented rate.
NCI has been at the forefront of leveraging these discoveries. As cancer researchers’ focus turned to the molecular basis of cancer, the NCI Center for Cancer Genomics (CCG) created the Genomic Data Commons (GDC) to provide broad access to genomic data generated in NCI studies. We also began working with academic and commercial partners to develop NCI Cloud Resources, giving researchers access to these data and the ability to run their analyses in a cloud environment, rather than having to download the data. In the past two years, CBIIT also began the development of the Cancer Research Data Commons (CRDC) to provide access to many data types in addition to genomic data.
Cancer research is now at a turning point. Recognizing that broad data sharing, team science, and collaboration are necessities if we are to learn from every cancer patient, coalitions are forming across academia, government, and industry. There is a broad consensus that applying standards to data storage and collection must be a significant consideration when establishing a repository or setting up a Data Coordinating Center for a new trial. Groups such as the Global Alliance for Genomics & Health (GA4GH) and the Clinical Data Interchange Standards Consortium (CDISC) are working across academia and industry to generate common approaches to data collection. The National Cancer Advisory Board Data Science Working Group recognized these imperatives in several of its recommendations, including making data FAIR (Findable, Accessible, Interoperable, Reusable), integrating real-world data, and cross-training clinicians and informaticists.
As the new director of CBIIT, my vision is to expand on the informatics and IT work that leverages the opportunity before us, to make data sharing the norm, to encourage collaboration and team science, and to ensure cancer researchers can take full advantage of the advances in technology to expedite their research. A few of my specific priorities are:
- Extending the utility of the Cancer Research Data Commons: Two commons nodes, the Genomic Data Commons (GDC) and the Proteomic Data Commons (PDC), are available and several are in development, including the Integrated Canine Data Commons, Immuno-oncology Data Commons, Imaging Data Commons, and Clinical Trials Commons. Additionally, discussions are underway to develop Population Science and Clinical Data nodes. Providing interoperability and query capability among these repositories will be accomplished through the recently launched Center for Cancer Data Harmonization (CCDH) and the Cancer Data Aggregator (CDA), which will start development shortly. As each of these nodes and components is stood up, the ability to search and share diverse data will increase, providing the ability to analyze data and generate insights not previously possible. This kind of integrative cancer research is exactly what is needed to help us learn as much as we can from each cancer patient’s journey.
- Ensuring the success of the Childhood Cancer Data Initiative: This new initiative offers NCI a tremendous opportunity to advance research in pediatric cancer. Because childhood cancers are so rare, research and progress have lagged behind adult cancers. This initiative will specifically focus on enhancing data collection and federation, and ensuring data are accessible to the broadest possible community of pediatric oncology researchers. CBIIT co-coordinated a workshop in July that will help shape the program, and we will work collaboratively with our partners and the cancer research community to support whatever infrastructure and data science needs are identified. The potential effect of such a program on children with cancer and their families is immeasurable. Additionally, this initiative can serve as a model from which we can learn and extend to adult cancers.
- Providing critical IT support for the NCI: The NCI intramural research community, extramural program directors, and the administrators across NCI are collectively the engine that drives cancer research. Our priorities for NCI directly support improved efficiencies and data-driven decision-making, and leverage innovative technologies for critical day-to-day NCI functions. For example, we are working with partners across NCI to improve our business and grant management systems, which will reduce administrative burden and provide insights needed for strategic portfolio management. We are also working with intramural researchers on a strategy for sustainable research data management and tool deployment, including cloud solutions, to ensure our NCI colleagues, and ultimately extramural researchers, can take full advantage of NCI’s investment in intramural research and the data it generates. Of particular importance to all NCI programs are the advanced cybersecurity approaches we provide that ensure the integrity of NCI data and infrastructure and the privacy of study participants’ data.
It’s unusual to have the opportunity to make a positive impact on so many people, and I intend to take full advantage of it – I have a responsibility to do so. Ultimately, our objectives across organizations in the cancer community are aligned, whether we work for public institutions, academia, or private companies: find treatments, improve outcomes, and make the lives of cancer patients and their families better. I look forward to working in my new role to fulfill our responsibility to learn from every cancer patient and to apply that knowledge to reduce the burden of cancer.