
Machine Learning Approach Helps Interpret Genome-Wide Dynamics

Are you using machine learning (ML) to gain insight into the genes underlying cancer but worried you might be misinterpreting some key information?

Researchers funded by NCI’s Division of Cancer Biology have developed a new ML application that uses a neural ordinary differential equation (ODE) “solver,” a type of ML that models how a system will behave over time. Their method, called PHOENIX (or “Prior-informed Hill-like ODEs to Enhance Neuralnet Integrals with eXplainability”), helps model how regulatory proteins (transcription factors) influence their target genes.
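
For readers unfamiliar with the terms, the short Python sketch below illustrates what "Hill-like" kinetics and an ODE solver mean in the simplest possible case: one gene driven by one transcription factor. It is not PHOENIX code. The function names, parameter values, and the forward-Euler integrator are all assumptions chosen for illustration, and nothing here is learned from data.

```python
import numpy as np

def hill(x, k=1.0, n=2.0):
    """Hill-type activation: rises from 0 toward 1 as x grows past k."""
    xn = np.power(x, n)
    return xn / (k ** n + xn)

def simulate_gene(tf_level, g0=0.0, alpha=2.0, delta=0.5, dt=0.01, steps=5000):
    """Forward-Euler integration of dg/dt = alpha * hill(tf_level) - delta * g.

    alpha (production rate) and delta (decay rate) are illustrative values.
    """
    g = g0
    trajectory = [g]
    for _ in range(steps):
        dg_dt = alpha * hill(tf_level) - delta * g
        g = g + dt * dg_dt
        trajectory.append(g)
    return np.array(trajectory)

# Expression of the target gene over time for a fixed transcription factor level.
traj = simulate_gene(tf_level=1.0)

# For this toy model the analytic steady state is alpha * hill(tf_level) / delta.
steady_state = 2.0 * hill(1.0) / 0.5
print(traj[-1], steady_state)
```

A neural ODE replaces the hand-written right-hand side with a function learned from data, while a solver plays the role of the integration loop.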

Read the full report, “Biologically Informed NeuralODEs for Genome-Wide Regulatory Dynamics,” in Genome Biology. PHOENIX is available via GitHub. You can find the source code for the results presented in the article at Zenodo.

Researchers apply neural ODE solvers in a variety of settings. Still, when faced with the complexity of modeling the tens of thousands of transcription factors and genes in the human genome, neural ODE solvers often fall short and fail to capture how gene activity changes over time.

According to the authors, PHOENIX takes into account both the complexity of the system and what’s known about how gene transcription works (for example, which of the transcription factors bind to the regulatory region of each gene). Using that information as a starting point, PHOENIX learns time-dependent functions to predict how each gene will behave in a particular disease state, as it develops and progresses.
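
As a rough illustration of using prior knowledge as a starting point, the Python sketch below restricts a small regulatory model to the transcription factor and gene pairs listed in a binding prior. It is a simplification under stated assumptions, not the authors' implementation: the 4-by-4 prior matrix is hypothetical, the weights are random rather than learned, and a hard mask is only one of several ways a prior could be incorporated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 4

# Hypothetical prior: prior[i, j] = 1 if regulator j is known to bind
# the regulatory region of gene i, and 0 otherwise.
prior = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Random stand-in for learned weights; the mask removes edges absent from the prior.
weights = rng.normal(size=(n_genes, n_genes)) * prior

def hill(x):
    """Saturating Hill-like response for non-negative expression values."""
    return x / (1.0 + x)

def derivative(x, weights, decay=0.5):
    """dx/dt = regulatory input from permitted edges minus linear decay."""
    return weights @ hill(x) - decay * x

def integrate(x0, weights, dt=0.01, steps=200):
    """Forward-Euler integration, clipping expression at zero."""
    x = np.asarray(x0, dtype=float)
    out = [x.copy()]
    for _ in range(steps):
        x = np.maximum(x + dt * derivative(x, weights), 0.0)
        out.append(x.copy())
    return np.stack(out)

trajectory = integrate(np.ones(n_genes), weights)
print(trajectory.shape)
```

Because every nonzero weight corresponds to an entry in the prior, each predicted change in a gene can be traced back to specific regulators, which is the kind of interpretability the article describes.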

PHOENIX’s scalability is another important advance. According to the first author, Dr. Intekhab Hossain, of Harvard T.H. Chan School of Public Health, solving the “scalability problem” was a key motivation in developing the method.

Dr. Hossain said, “When scientists pick a small subset of genes and transcription factors, they could miss vital or unexpected elements of the biological processes that drive cancer’s development over time. The full picture only began to emerge when we used PHOENIX to examine the entire repertoire of genes.”

Dr. John Quackenbush, senior and corresponding author, added, “With PHOENIX, we not only were able to identify genome-scale networks, but we also pinpointed subtle but biologically important regulatory changes that altered how cancer cells function. Such biological interpretability is essential if we want to use ML methods to make generalized predictions and to peer into the ways in which disease manifests itself. Such understanding is essential if we are to develop better treatments.”
