Federated Learning
About Federated Learning
Federated learning offers a decentralized yet collective approach to accessing, analyzing, and interpreting data. Rather than moving data from its original location, researchers like you can apply advanced analytical models—including artificial intelligence (AI) and machine learning (ML)—to gain deeper insight into cancer and its effects.
This approach enables hospitals and research centers to collaborate on AI models without exchanging sensitive patient data. By doing so, federated learning helps overcome challenges like small sample sizes and population biases, while mitigating privacy concerns and regulatory hurdles. Instead of centralizing data, participating groups send the algorithm to each institution where it learns locally; researchers share only model updates, not patient data.
In short, federated learning bridges the gap between data privacy and collaborative research, ultimately enabling the development of more accurate predictive models and more personalized treatment recommendations for patients with cancer.
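To make the "send the algorithm, share only model updates" idea concrete, here is a minimal sketch of one common way to combine updates, federated averaging. It is an illustration under stated assumptions, not a description of how the NCI network is implemented: the site names, the synthetic records, the one-feature linear model, and the function names `local_update`, `federated_average`, and `run_rounds` are all hypothetical.

```python
# Hypothetical sketch of federated averaging. Site names, data, and
# function names are illustrative assumptions, not the NCI network's code.

def local_update(weights, records, lr=0.05, epochs=20):
    """Train y = w*x + b on one site's private records; return weights and record count."""
    w, b = weights
    n = len(records)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in records:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    # Only the trained weights and a record count leave the site, never the records.
    return (w, b), n


def federated_average(updates):
    """Combine site updates, weighting each site by its number of records."""
    total = sum(n for _, n in updates)
    w = sum(wts[0] * n for wts, n in updates) / total
    b = sum(wts[1] * n for wts, n in updates) / total
    return (w, b)


def run_rounds(site_data, rounds=300):
    """Repeat: send the global weights out, train locally, average the results."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, records)
                   for records in site_data.values()]
        global_weights = federated_average(updates)
    return global_weights


# Synthetic records that follow y = 2x + 1; each list stays at its own "site".
site_data = {
    "site_a": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    "site_b": [(3.0, 7.0), (4.0, 9.0)],
    "site_c": [(0.5, 2.0), (1.5, 4.0), (2.5, 6.0), (3.5, 8.0)],
}
final_w, final_b = run_rounds(site_data)
print(f"w = {final_w:.2f}, b = {final_b:.2f}")
```

In this sketch the coordinating code never reads an individual record; it sees only each site's weights and record count. A real deployment would typically add protections that this example omits, such as secure transport and access controls.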
NCI’s Role
NCI has built a federated learning network among several cancer centers, and we are looking to expand it. By connecting cancer centers, federated learning will help answer research questions that individual institutions have historically lacked the sample sizes to address. Federated learning enables investigators to build on each other’s data in a secure way. NCI also aims to create a self-sustaining network, in which the value of the work produces the revenue needed to sustain it.
NCI’s Center for Biomedical Informatics and Information Technology (CBIIT) plays a key role in federated learning efforts. In addition to funding and engaging in the current initiative through its Informatics and Data Science Program—specifically via the Clinical and Translational Research Informatics Branch and Computational Genomics and Bioinformatics Branch—CBIIT is proactively preparing for future governance needs. As federated learning expands and more groups join, robust ethical and legal guidance becomes essential. In pursuit of a self-sustaining network, NCI aims to:
- create a comprehensive governance framework,
- actively participate in scientific model development, and
- jointly develop standardized data annotations and model card standards.
For example, CBIIT and the participating sites have been working on a member agreement and a transactional agreement to outline the consortium’s rules and procedures, and a bylaws document is in progress to describe the committee structure and operating guidelines.
Furthermore, CBIIT is developing model cards for federated learning network-tested models. These CBIIT model cards provide details about the ML models. For example, they capture critical information such as:
- the basics (i.e., name, version, purpose),
- data requirements and formats (i.e., what data the model will process),
- privacy and security protocols (i.e., how the network members will protect data privacy),
- technical specifications (i.e., hardware or software required to run the model), and
- performance metrics (i.e., how the network members assess the quality of the model predictions).
By providing standard model cards, the federated learning network system enables cross-institution collaboration and ensures that all participants can accurately assess their ability to test and train the model.
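As a rough illustration of how the model card sections listed above could be represented in a machine-readable form, here is a minimal sketch. It is not CBIIT's actual model card format: the `ModelCard` class, its field names, the `site_can_run` helper, and every value in the example card are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sketch of a model card record. Field names and values are
# illustrative assumptions, not CBIIT's model card standard.


@dataclass
class ModelCard:
    name: str
    version: str
    purpose: str
    data_requirements: list    # what data the model will process
    privacy_protocols: list    # how network members protect data privacy
    technical_specs: dict      # minimum resources, e.g. {"ram_gb": 64}
    performance_metrics: list  # how prediction quality is assessed

    def missing_sections(self):
        """Return the names of sections that were left empty."""
        return [key for key, value in vars(self).items() if not value]


def site_can_run(card, site_resources):
    """Check that a site meets every minimum listed in the card's technical specs."""
    return all(site_resources.get(key, 0) >= minimum
               for key, minimum in card.technical_specs.items())


card = ModelCard(
    name="example-risk-model",
    version="0.1.0",
    purpose="Illustrative placeholder, not a real network model",
    data_requirements=["imaging files", "genotype table"],
    privacy_protocols=["only model updates leave the site"],
    technical_specs={"gpu_memory_gb": 16, "ram_gb": 64},
    performance_metrics=["AUC"],
)

print(card.missing_sections())
print(site_can_run(card, {"gpu_memory_gb": 24, "ram_gb": 64}))
```

A structure like this would let a site compare its own resources against a card before agreeing to test or train a model, which is the kind of self-assessment the standard cards are meant to support.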
Connecting the Cancer Community
NCI recently funded a federated learning pilot that included itself, Oregon Health Sciences, and the Children’s Hospital of Philadelphia. During the pilot, NCI set up a network that tested and trained models to show that federated learning was a viable approach. The program has expanded into 2025 and now includes NCI and three other cancer centers: University of Hawaii, Moffitt Cancer Center, and Wake Forest Cancer Center. Together, these centers tested the connectivity of the federated learning network. They are establishing a federated learning network that will enable each center to run its models against the data of each of the other centers. Currently, they’re preparing to test the network with a simple test model.
These centers proposed several models. They range from predicting the likelihood of patients developing cachexia to predicting a patient’s response to chemotherapy. The first model that the network plans to test and train will focus on using mammography and single nucleotide polymorphisms to predict a person’s breast cancer risk.
Federated learning offers strong opportunities for collaboration on challenging problems in cancer research. If you’re an NCI or NIH intramural researcher studying a rare disease or working with a small subset of patients, and you need additional participants to test your models, the cancer centers participating in this federated learning network could be a valuable resource for you. You can contact us at NCIClinicalInformatics@mail.nih.gov for more information.
You can also contact us at the same email if you’re a researcher at a cancer center and you think the federated learning methodology might help your research!
Additional Information
Send general inquiries about federated learning at NCI to NCIClinicalInformatics@mail.nih.gov.