Scalable, Collaborative, Reproducible, and Extensible Analysis of TCGA Data in the Cloud
The Seven Bridges Cancer Genomics Cloud (CGC) pilot is one of three pilot projects funded by the National Cancer Institute. The overarching goal of the project is to explore how co-localizing large genomics datasets, such as The Cancer Genome Atlas (TCGA), with dynamic compute infrastructure to analyze them can make learning from these data faster and ultimately enable precision medicine.
In this seminar we’ll highlight four guiding principles that have driven development of the Seven Bridges CGC:
Making data available isn’t enough to make it usable: We’ve built a dynamic query engine that allows fast search of more than 140 clinical and biospecimen properties, making it faster and easier to find interesting TCGA data. Importantly, data are immediately available for analysis at scale using both pre-defined and custom workflows.
The best science happens in teams: A fine-grained permissions model allows transparent collaboration in a secure and compliant manner.
Reproducibility shouldn’t be hard: Each analysis, including all parameters, files, and software versions, is fully logged and can be perfectly replicated days or months later.
The impact of TCGA is amplified by new data and tools: Researchers can readily bring their own data and their own tools to analyze alongside TCGA data. Native implementation of the Common Workflow Language (CWL) specification enables portability of tools and workflows to and from other CWL-compliant systems.
The seminar will include a demo of the system, and interested researchers can visit www.cancergenomicscloud.org to get involved.
Brandi Davis-Dusenbery is the Scientific Program Manager for the Seven Bridges CGC. She received her Ph.D. in Biochemistry from Tufts University and completed her postdoctoral studies at Harvard University. She is passionate about enabling the use of biomedical data to ultimately improve patient care.