Cancer Data Science Pulse

Data Sharing

Converting the many petabytes of cancer data available on the cloud from information to answers is a complex task. In this blog, Deena Bleich shares how the ISB Cancer Gateway in the Cloud (ISB-CGC), an NCI Cloud Resource, expedites that process by hosting large quantities of cancer data in easily accessible Google BigQuery tables.

This blog offers a primer on semantics, a topic that has broad implications for the biomedical informatics and data science fields. Here, Gilberto Fragoso, Ph.D., describes the structures that serve as a foundation for data science semantics. These structures help improve data interoperability, allowing researchers to query, retrieve, and combine very different data sets for more extensive analysis.

In this latest Data Science Seminar, Jim Lacey, Ph.D., M.P.H., shares the lessons he learned in transitioning a large cancer epidemiology cohort study to the cloud, including the importance of focusing on people and processes as well as technology. Project managers, principal investigators, co-investigators, data managers, data analysts—really anyone who is part of a team that wants to use the cloud or cloud-based resources for their studies—should attend.

The diversity, complexity, and distribution of data sets present an ongoing challenge to cancer researchers looking to perform advanced analyses. Here we describe the Cancer Genomics Cloud, powered by Seven Bridges, an NCI Cloud Resource that’s helping to bring together data and computational power to further advance cancer research and discovery.

To commemorate the National Cancer Act’s 50th anniversary, we’ve pulled together Five Data Science Technologies poised to make a difference in how cancer is diagnosed, treated, and prevented.

Technological advancements, such as machine learning and artificial intelligence, have made open data sharing more complex and put new pressure on existing laws that protect data privacy. This blog examines the privacy processes and policies that are helping address privacy concerns in today’s ever-changing “big data” landscape.

CBIIT Director Dr. Tony Kerlavage loves data. He also loves to solve puzzles. In this blog, Dr. Kerlavage describes how advances in data and technology are helping us find solutions to some of today’s most pressing cancer-related questions. According to Dr. Kerlavage, there’s no problem too hard to solve and no question that we shouldn’t bother asking.

We love how data science is changing the way we look at the world. In this latest blog, Drs. Kibbe and Almeida discuss why they love data and how scientific methods can help us better understand our natural world, our universe, and ourselves.

To the NCI Cancer Research Data Commons, cloud computing means three words: NCI Cloud Resources. These are real-world examples of making data accessible and available to all cancer researchers. Kicking off a four-part blog series, the NCI Cloud Resources share their origin story and describe the problems that cloud computing could solve in cancer research.

Data have been the driving force behind a number of important scientific discoveries. In this latest blog, Dr. Jerry Li describes how data helped power technological advances to unravel the human genome. What’s the next big advance? According to Dr. Li, the blending of data and artificial intelligence is the fastest moving area of research and has the potential to once again revolutionize scientific discovery.