Cancer Data Science Pulse

Machine Learning

On Wednesday, September 22, 2021, Yanjun Qi, Ph.D., from the University of Virginia, will present “AttentiveChrome: Deep Learning for Predicting Gene Expression from Histone Modifications,” in the kickoff of the Fall Data Science Seminar Series. This blog offers insight into Dr. Qi’s research and why this topic is important to her.

What do winter storms, airplanes, and cancer research have in common? In this blog, experts on meteorology, aerospace engineering, and radiation oncology explore what we can learn from these very different fields to further advance how we target and apply radiation to more effectively treat cancerous tumors.

Artificial Intelligence offers boundless possibilities, especially in the healthcare field. In a recent CBIIT Data Science Seminar, Dr. James Zou showed how Computer Vision (CV) is helping create a new data-driven “language of morphology” that allows researchers to be more precise in interpreting histological images. Just as computers help propel self-driving cars along busy roadways, CV offers a faster, less subjective method for assessing disease.

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss one key component of racial inequality, the issue of health disparities, as it relates to Big Data. As Dr. Kerlavage notes, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.

One of the most exciting developments of the past decade has been the success of methods broadly described as deep learning. While the roots of deep learning date back to early machine learning research of the 1950s, recent improvements in specialized computing hardware and the availability of labeled data have led to significant advances and have shattered performance benchmarks in tasks like image classification and language processing.

This blog post, the fifth and final in our series, discusses the principles underlying the collaborative project "Joint Design of Advanced Computing Solutions for Cancer (JDACS4C)."

NCI continues to identify and link external data sources with SEER data to enable the expansion of longitudinal data to form patient trajectories and to support modeling efforts. To inform the incorporation of those additional sources, NCI compiled an extensive breast cancer recurrence data dictionary to identify recurrence-related data elements across multiple sources, including pathology, radiology, pharmacy, biomarkers, procedures, comorbidities, patient-generated information, and radiation oncology.

This is the third in a series of posts that discuss the principles underlying the three-year collaborative program "Joint Design of Advanced Computing Solutions for Cancer (JDACS4C)."

This is the second in a series of posts that discuss the principles underlying the three-year collaborative program “Joint Design of Advanced Computing Solutions for Cancer (JDACS4C).”