Cancer Data Science Pulse
Cancer Genomics Cloud Pilots DREAM Challenge - Leveraging the Wisdom of the Crowd
In recent years, Challenges have become a popular way to engage and motivate the research and innovation communities. Challenges are open competitions in which communities are presented with specific and often difficult problems to solve. Participants are given guidelines and test data, and compete to find the best solution. Open competition encourages innovative thinking, allows broad participation, lets funders set ambitious goals, and is a cost-effective way to encourage collaboration and generate novel solutions. As the volume and complexity of data continue to increase, it is critical to develop new methods that use data to address fundamental questions in the biological sciences and to improve human health.
DREAM is an acronym that stands for Dialogue for Reverse Engineering Assessments and Methods, but over the years it has evolved into the "dream" of collaboration, sharing data, and open science being the norm in science, not the exception. NCI is funding a new DREAM Challenge as part of the Cancer Genomics Cloud (CGC) Pilots program.
The ICGC-TCGA DREAM Somatic Mutation Calling - RNA DREAM Challenge aims to find the best algorithms for detecting abnormal RNA molecules in a cancer cell, and will be the first Challenge to make use of the CGC Pilots. Contestants submit their algorithms, not their results, for evaluation. All the algorithms developed during the Challenge will be run on the CGC platforms. The CGC Pilots will provide access to co-located data and shared tools, as well as some free compute credits that can be used by participants and for final Challenge scoring.
In this Challenge, participants will be given a set of RNA sequence data and will be asked to develop algorithms that perform two tasks. The first task is to measure the abundance of the different mRNA isoforms present in the sequencing data. Alternative splicing is a process by which a single gene can code for multiple isoforms of mRNA and, consequently, multiple proteins. Alternative splicing generates a tremendous amount of protein-level diversity in humans and affects cellular processes, tissue specificity, and developmental states. Evidence also exists for a link between misregulation of gene splicing and cancer. Some mRNA isoforms produced by a gene have little or no deleterious effect on the body; others lead to abnormal proteins that can cause cancer, and some tumor-specific isoforms have been identified.
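To make "measuring the amount of each isoform" concrete, here is a minimal Python sketch of one widely used abundance unit, transcripts per million (TPM). It is an illustration only, not the Challenge's scoring method: the function name, isoform labels, read counts, and lengths are all hypothetical, and it assumes each read has already been assigned to a single isoform, which is the genuinely hard part that Challenge algorithms must solve.

```python
def tpm(read_counts, lengths_bp):
    """Convert per-isoform read counts to transcripts per million (TPM).

    Assumes (for illustration) that every read is already assigned to
    exactly one isoform and that every isoform has a positive length.
    """
    # Reads per base: corrects for longer isoforms producing more reads.
    rates = {iso: read_counts[iso] / lengths_bp[iso] for iso in read_counts}
    total = sum(rates.values())
    if total == 0:
        return {iso: 0.0 for iso in rates}
    # Rescale so that all abundances sum to one million.
    return {iso: rate / total * 1_000_000 for iso, rate in rates.items()}


# Hypothetical example: two isoforms of the same gene.
counts = {"GENE1-iso1": 300, "GENE1-iso2": 100}
lengths = {"GENE1-iso1": 3000, "GENE1-iso2": 1000}
abundances = tpm(counts, lengths)
```

In this made-up example the first isoform has three times as many reads but is also three times as long, so the two isoforms end up with equal estimated abundance.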
The second task in the Challenge is to detect gene fusions in the RNA. Gene fusions occur when two or more genes are joined together during a process such as translocation or inversion. Fusions can create abnormal, hybrid proteins. For this part of the Challenge, participants will not know which gene fusions they are looking for; it will be up to them to develop the algorithms that find them. Fusions are known to be driver mutations in certain kinds of cancers, and detecting them can help with both diagnosis and prognosis for cancer patients.
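As a rough illustration of the idea behind fusion detection, the Python sketch below counts paired-end reads whose two ends map to different genes and reports gene pairs supported by several such reads. This is a toy example under strong assumptions, not a method used in the Challenge: the function name, the `min_support` threshold, and the input data are hypothetical, and real fusion callers must also handle alignment errors, breakpoint locations, and the orientation of the joined genes.

```python
from collections import Counter


def fusion_candidates(read_pairs, min_support=2):
    """Count read pairs whose two mates map to different genes.

    read_pairs: iterable of (gene_of_mate1, gene_of_mate2) tuples,
    assumed to come from an earlier alignment step (hypothetical input).
    """
    support = Counter()
    for gene_a, gene_b in read_pairs:
        if gene_a != gene_b:
            # Sort the names so A/B and B/A count as the same candidate;
            # this deliberately ignores which gene comes first in the fusion.
            support[tuple(sorted((gene_a, gene_b)))] += 1
    # Keep only gene pairs seen in at least min_support read pairs.
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical mapped read pairs.
pairs = [("BCR", "ABL1"), ("ABL1", "BCR"), ("BCR", "BCR"), ("TP53", "ABL1")]
candidates = fusion_candidates(pairs)
```

Here two read pairs link BCR and ABL1, so that pair is reported, while the single TP53/ABL1 pair falls below the support threshold and is treated as likely noise.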
NCI is working with the Ontario Institute for Cancer Research and the University of California, Santa Cruz to implement the DREAM Challenge. Sage Bionetworks is also providing support, along with the infrastructure to host a component of the Challenge on its Synapse platform.
To learn more about the genomic concepts in this post, visit:
- The Education Section of the National Human Genome Research Institute: https://www.genome.gov/education/
- NCI Dictionary of Genetic Terms: http://www.cancer.gov/publications/dictionaries/genetics-dictionary
- Scitable, A Collaborative Learning Space for Science: http://www.nature.com/scitable