Cancer Data Science Pulse
“Count Me In” Gives Patients a Voice in Scientific Discovery
On November 10, my colleague Dr. Nikhil Wagle and I had the opportunity to describe a remarkable patient-researcher collaboration, called “Count Me In” (CMI), as part of the “Partnering with the Public for Biomedical Research Seminar” series. CMI is a non-profit organization led by The Broad Institute of MIT and Harvard, the Emerson Collective, the Dana-Farber Cancer Institute, and the National Cancer Institute’s (NCI’s) Cancer Moonshot℠ initiative.
What makes the program unique is that it creates a new pipeline for clinical and genomic cancer data by partnering with patients to collect information.
This type of “citizen science” is a largely untapped but vital part of data science. It gives patients an opportunity to share their data directly with scientists. Those data include clinical and patient-reported information, as well as samples from tumors, saliva, and blood for genetic analysis.
Sharing such data is key to finding answers to how and why cancer develops, and how best to treat certain cancers. The data are fully de-identified and are not used to make clinical decisions for individual patients. However, by making the information available to the broader scientific community, patients have the opportunity to contribute to significant breakthroughs in treatment and, hopefully, reap the rewards of this unique partnership.
CMI has enrolled more than 8,000 patients from across the United States and Canada in projects that encompass metastatic breast cancer, angiosarcoma, metastatic prostate cancer, esophageal and stomach cancer, brain cancer, and osteosarcoma.
Accessing Data
The driving force behind CMI is data: clinical, genomic, molecular, and patient-reported information unique to the individual (e.g., family history).
The Metastatic Breast Cancer Project (MBCproject) was one of the first projects launched within this program. Processed data from this initial launch are routinely deposited into the public domain and can be found at cBioPortal. To date, data from 180 patients and 237 samples have been made available and can be accessed through this portal. MBCproject data include Whole Exome Sequencing and RNA sequencing from each tumor sample. And these data are growing.
The full data sets from this program (and other CMI cancer programs), including controlled-access raw sequencing files and other data potentially containing sensitive genotype information, are also deposited into the Genomic Data Commons (GDC). The GDC was the first repository established within NCI’s Cancer Research Data Commons infrastructure. Serving as the hub for genomic data, GDC gives researchers access to a large collection of uniformly processed data sets, with the added benefit of a cloud-based infrastructure and many best-in-class bioinformatic tools and workflows.
“Programs like CMI understand the power of data sharing and play a critical role in precision oncology,” says GDC Project Officer, Zhining Wang, Ph.D. “By making their data broadly accessible to the research community through the GDC, CMI is giving the people affected by these diseases a chance to harness their samples and data toward precision medicine.”
The Heart of the Project
The heart of CMI is the patients. Their altruistic contributions to this program are truly what sets CMI apart and what ultimately will make it a success. For the past 5 years, I’ve worked directly with patients across all of our projects, and I’m moved by every one of them. They are so committed to this project, and it gives them a purpose. They want to spare others from the suffering that they currently are enduring. It’s both heartbreaking and inspiring.
Patients can consent online to donate their stored tumor samples, saliva samples, medical records, and, most importantly, their individual stories. One person can make a difference. But having 100,000 lend their voices and their data gives us an even greater chance at success.
Resources
Genomic data from CMI projects can be found through the GDC Data Portal. For information on accessing protected data, see the NIH Database of Genotypes and Phenotypes (dbGaP). To request access to CMI data, be sure to cite the appropriate CMI study (study accessions as follows: phs001931, The Angiosarcoma Project; phs001709, The Metastatic Breast Cancer Project; phs001939, The Metastatic Prostate Cancer Project).
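For readers who script their data requests, the accessions above can be kept in a small lookup table. The following is a minimal sketch rather than an official CMI or GDC tool: the helper names `study_name` and `build_projects_query` are hypothetical, and the endpoint URL and the `dbgap_accession_number` field are assumptions about the public GDC API that should be checked against the current GDC documentation. The code only builds a query URL; it does not contact any server or request controlled-access data.

```python
import json
from urllib.parse import urlencode

# dbGaP study accessions exactly as listed in this post.
CMI_STUDIES = {
    "phs001931": "The Angiosarcoma Project",
    "phs001709": "The Metastatic Breast Cancer Project",
    "phs001939": "The Metastatic Prostate Cancer Project",
}

# Assumption: public GDC API endpoint for project metadata.
GDC_PROJECTS_ENDPOINT = "https://api.gdc.cancer.gov/projects"


def study_name(accession):
    """Return the CMI project name for a dbGaP accession (hypothetical helper)."""
    key = accession.strip().lower()
    if key not in CMI_STUDIES:
        raise KeyError(f"{accession!r} is not one of the accessions listed in the post")
    return CMI_STUDIES[key]


def build_projects_query(accessions):
    """Build a URL asking for projects that match the given accessions.

    Assumption: the filter field is named 'dbgap_accession_number'.
    """
    filters = {
        "op": "in",
        "content": {
            "field": "dbgap_accession_number",
            "value": sorted(accessions),
        },
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "project_id,name,dbgap_accession_number",
        "format": "json",
    }
    return GDC_PROJECTS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the query URL for all three studies named in the post.
    print(build_projects_query(CMI_STUDIES))
```

Keeping the accessions in one table makes it harder to cite the wrong study when requesting access through dbGaP.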
To learn more about CMI, see @MBC_project, @ASCaProject, and @Corrie_Painter on Twitter; and the Metastatic Breast Cancer Project and the Angiosarcoma Project Working Group on Facebook.