Cancer Data Science Pulse
Blending Weather Forecasting with Team Science Leads to Advances in Cancer Immunotherapy
You’ll be discussing the topic, “Multi-omics Modeling for Predictive Cancer Immunotherapy,” in the upcoming webinar. Can you tell us what first interested you in this topic area?
I’m a mathematician by training. My early work was in using time-course data for weather prediction. That field has been highly effective in blending data and computational methods to arrive at consistently accurate predictions that truly impact people’s lives. For example, we can use models and high-throughput data to predict the risk of a winter storm or a hurricane’s path, enabling us to issue advance warnings that keep residents safe.
Toward the end of graduate school and at the start of my postdoctoral work, multi-omics analysis through microarrays was just coming into vogue. As a mathematician, you’re trained to look for new ways to apply what you know to other disciplines. I saw potential in using multi-dimensional microarray data to develop predictive models in cancer biology, similar to how satellite data are used for weather prediction. I soon realized, however, that high-throughput temporal profiling of cancer was a lot more complicated. Then single cell technology exploded. These data enable temporal profiling of biological systems in ways we never could before.
It was incredibly exciting to be in the field at the start of this technology boom and to apply what I’ve learned across disciplines to cancer biology.
Who should attend the webinar? What can they expect to learn from this hour with you?
I think this topic has a broad appeal. People in biological or clinical research, computational science, biostatistics, mathematics—really anyone who is interested in using technology and data science to advance cancer research.
I hope to show how powerful analyses of single cell data can be when they’re grounded in the biological mechanisms underlying a cancerous tumor and its environment, and in how those mechanisms change over time. We’re getting at the very heart of tumor biology, which, in turn, is helping us better understand the basic science of cancer, as well as how the body will respond to treatment. This is powerful, not only because it helps us develop more accurate predictive models, but also because understanding the mechanisms underlying that response will enable us to develop new, more effective therapies.
How would this technology help in identifying effective treatments? Is this similar to precision medicine?
I think of this field as predictive rather than precision medicine. In precision medicine, the idea is to match a medication to a patient’s tumor at the time the diagnosis is made. The problem with this approach is that it disregards the time element; that is, it ignores how cancer evolves. The tumor and its microenvironment are constantly changing over time.
Our artificial intelligence methods, combined with spatial and single cell technologies, enable us to map changes in cellular phenotypes in a tumor and its microenvironment as a tumor responds to therapy. Combining the properties learned from these data with mathematical models enables us to model how a cancer will respond to treatment. This provides a framework for future work that will enable us to use computational tools to predict the changes associated with therapeutic resistance in cancer so that we can select treatments in advance, essentially stopping the disease in its tracks before it reaches the next stage in its evolution.
Have you been applying this technology to one particular cancer?
We’ve been using a pan-cancer approach, examining the genomic and cellular alterations that occur across a wide variety of tumor types, and looking at how we can develop a computational tool that can be applied broadly across all of these. In my presentation, I’ll be discussing our findings in melanoma, breast, and liver cancer. However, what we found is that many of these basic biological underpinnings of therapeutic response and resistance span cancer types.
You’re using an approach that includes CoGAPS. Can you tell us a bit about that?
CoGAPS is a machine learning method we established to identify transcriptional signatures related to cell type and state. It allows us to distinguish the fundamental (i.e., low-dimensional) processes that summarize a biological system. My talk will look at how this can be applied to cancer immunotherapy. But the tool is also being applied more broadly to developmental biology and neuroscience. Tools like these are quickly becoming standard practice for single cell data analysis.
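To make the "low-dimensional processes" idea concrete: CoGAPS itself is a Bayesian matrix factorization method, but the same intuition can be sketched with plain multiplicative-update non-negative matrix factorization (NMF), which decomposes an expression matrix into a small number of additive "patterns." The toy matrix and pattern count below are invented for illustration; this is not the CoGAPS algorithm.

```python
# Sketch: NMF decomposes a genes-x-cells matrix V into W (gene weights per
# pattern) times H (pattern activity per cell). Synthetic data, not CoGAPS.
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def nmf(V, k, iters=500, eps=1e-9):
    random.seed(0)
    n, m = len(V), len(V[0])
    W = [[random.random() for _ in range(k)] for _ in range(n)]
    H = [[random.random() for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update for H: H *= (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Multiplicative update for W: W *= (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

# Toy "expression matrix": 4 genes x 6 cells, built from 2 hidden patterns
# (genes 1-2 active in cells 1-3, genes 3-4 active in cells 4-6).
V = [[2, 4, 6, 0, 0, 0],
     [1, 2, 3, 0, 0, 0],
     [0, 0, 0, 3, 6, 9],
     [0, 0, 0, 1, 2, 3]]
W, H = nmf(V, k=2)
WH = matmul(W, H)
err = sum((V[i][j] - WH[i][j]) ** 2 for i in range(4) for j in range(6))
```

With only two patterns, the factorization recovers the two gene programs and reconstructs the matrix with low error; in real single cell data the learned patterns play the role of the cell-type and cell-state signatures described above.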
Were there any surprises that you encountered in your work on this topic?
I started developing the CoGAPS tool in the early days of microarrays, so I’m surprised by how well it has held up across technologies, including modern single cell technologies. I never expected it to be that durable.
I’ve also been amazed at how the field of computational cancer biology has grown. I remember in the beginning, when I first started applying this tool to time-course data, my postdoc mentor said, “You’re never going to have the same amount of biological data as you do with weather.” It’s amazing to see how that’s changed, with so much biological data emerging in this field today.
Not everything has been positive, though. When I began working with biological data, I was surprised by how much people tended to “silo” their data. As a mathematician, I came from a field with 100% open data, so it was a new experience to encounter data sets that were “owned” by someone. I was surprised by some researchers’ hesitancy to share data, which is compounded by privacy concerns, making it that much harder to share information.
Have there been other challenges that you’ve had to overcome?
In terms of biology, and looking at tumor evolution over time, the biggest challenge is that we’re modeling what happens in humans as they respond to therapy. That’s the gold standard. The patient’s care must come first. It can be challenging to obtain the sample, conduct the processing, and get the data from the clinic to the research lab. It requires a highly coordinated team effort.
That team approach is key. It’s not a one-woman show. It includes clinical investigators who are taking care of patients and offering the most effective therapies; surgeons who are extracting the tissues; pathologists who are examining the tissues and getting us the best samples; and the technology and computational scientists who are generating and interpreting high-throughput data. It’s truly a team-science endeavor. It took time, but our team is a great example of what can be accomplished when you build trust and all work together. Unfortunately, the current system of scientific discovery doesn’t always reward that team effort. Funding, publication cycles, and tenure decisions all tend to favor the individual rather than the team. If we’re not careful, this can stand in the way of producing high-quality work.
You mention humans as the “gold standard,” but are other models being used to refine the tool?
Yes, preclinical animal model work is absolutely critical. We’re not yet in a place where we can apply a prediction made from data directly to a clinical trial, although I hope we get there one day. Instead, we’re using cross-species analyses to understand what is similar between mice and humans.
Using a transfer learning approach, we’re identifying underlying processes in biological data that relate to therapeutic response. For example, high-throughput data may identify an increase (or decline) in certain immune cells at a certain point in time in response to a therapy. We developed a new tool, ProjectR, that enables us to query everything we have learned about the response to therapy in one system (in this case, the mouse) and see how it relates to another (in this case, human treatment response).
We’re able to validate that our prediction is computationally preserved between the two systems (mouse and human). Once confirmed, the information can be used to predict how the patient will respond, and also aid in finding new biomarkers.
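The projection step at the heart of this transfer learning approach can be pictured with a small least-squares sketch: gene-level pattern weights learned in one system (here, a made-up "mouse" pattern matrix) are used to score a sample from another system. The weights, gene set, and sample values below are invented for illustration; this is the general idea, not the ProjectR implementation.

```python
# Sketch: score a new sample against previously learned patterns by
# least-squares projection, p = (W^T W)^-1 W^T v, written out for k = 2.
def project(W, v):
    """Project expression vector v onto the 2 pattern columns of W."""
    a = sum(w[0] * w[0] for w in W)          # entries of W^T W
    b = sum(w[0] * w[1] for w in W)
    c = sum(w[1] * w[1] for w in W)
    d0 = sum(w[0] * x for w, x in zip(W, v))  # entries of W^T v
    d1 = sum(w[1] * x for w, x in zip(W, v))
    det = a * c - b * b                       # 2x2 inverse by hand
    return [(c * d0 - b * d1) / det, (a * d1 - b * d0) / det]

# Hypothetical "mouse-learned" weights over 4 shared genes: pattern 0 is an
# immune-response signature, pattern 1 a tumor-cell signature.
W = [[1.0, 0.0],
     [0.8, 0.1],
     [0.0, 1.0],
     [0.1, 0.9]]

# A "human" sample whose expression resembles the immune signature.
human = [2.0, 1.6, 0.1, 0.3]
scores = project(W, human)  # high pattern-0 score, low pattern-1 score
```

The sample’s score on each pattern indicates how strongly that learned process is present, which is the sense in which a response signature learned in the mouse can be queried in human treatment data.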
Where do you see this technology headed in the next 5–10 years?
The field is skyrocketing right now. What I really hope is that we get to a point where we can perform 4-dimensional molecular profiling; that is, models that take into account both space and time. Just as a weather forecast uses satellite data to predict stormy weather, I’d like to apply this across the genomic landscape. I know it’s ambitious, but I’ve never seen technology moving so fast. I know we’ll get there.
What satisfies you the most about this work? What makes you the most proud?
Honestly, I’m most proud of my mentees. I love working with trainees and cultivating new ideas. Teaching and mentoring are some of my favorite things about my work.
I also enjoy solving puzzles, which is why I like math. My work gives me the opportunity to find new solutions to puzzles every day.
Given your strong propensity for teaching and mentoring, is there any advice you could offer based on what you’ve learned in the field?
When training in science, there’s a tendency to think within a single discipline. You may feel you’re locked into a particular area: if you’re a biologist, you’re locked into biology; if you’re a data scientist, you’re locked into data science.
There’s a lot more to be gained from transdisciplinary research. I’d encourage biologists to learn to code and mathematicians to learn to interpret biology. There’s so much more we can learn from this type of convergent science. Bringing people together across disciplines has the greatest potential for moving the field forward.