Important Reminders About the Use of AI and Protecting Human Genomic Data
With the increasing development and use of artificial intelligence (AI) in cancer research, it’s important to remain vigilant about potential risks of data disclosure when you develop or share AI tools and applications.
What You Need to Know
- You may distribute controlled-access data (including genomic or associated data) or their data derivatives only to an entity or individual that you identified in the data access request. This is required under the NIH Genomic Data Sharing (GDS) Policy and the subsequent Data Use Certification (DUC) Agreement.
- Training generative AI models on controlled-access human genomic data, or sharing or retaining models trained on such data, may risk disclosing that data, which would violate the non-transferability provision of the DUC.
- You may not share controlled-access data with public generative AI tools (e.g., third-party tools) via prompts or other user interfaces. This is stated in the GDS Policy and the Genomic Data User Code of Conduct. Such sharing would violate the non-transferability provision and, by extension, the DUC.
- If you’re a developer requesting access to controlled-access data for your work, you must comply with the non-transferability provision in the developer terms of access.
Want to learn more about how you can use controlled-access data responsibly? Follow the principles on the “Using Genomic Data Responsibly Under the NIH Genomic Data Sharing Policy” webpage and the “AI in Research: Policy Considerations and Guidance” webpage.
Until NIH provides further guidance, you may continue developing generative AI models with controlled-access data if you’ve received approval from NIH. However, as an approved user, you:
- must delete the models at project closeout, per the DUC.
- may renew expiring projects to continue using the models.