How to Write a Data Management and Sharing (DMS) Plan
If you’re a researcher seeking NIH funding, you’re likely aware that you need to prepare and submit a detailed Data Management and Sharing (DMS) Plan along with your funding application. In this article, we’ll highlight key components of this DMS Plan.
Please note that this isn’t an exhaustive description of what’s needed for NIH. For detailed instructions, we encourage you to refer to the application guide, as well as requirements outlined in your funding opportunity.
What is a DMS Plan?
The DMS Plan reflects NIH’s 2023 DMS Policy. With this policy, NIH is looking to significantly expand data sharing, ensuring that the research community has timely access to NIH-funded scientific data. (For general information on NIH policy expectations for sharing research data, visit the Scientific Data Sharing website.)
Do I Need a DMS Plan?
The policy pertains to you if:
- you’re applying for NIH funding (e.g., through a new or renewing grant, contract, or other NIH transaction), or
- you’re a scientist at NIH funded to generate or work with data.
What Qualifies as Scientific Data?
The 2023 DMS Policy defines scientific data as recorded, factual information that’s useful for replicating research findings. Those data may or may not be part of a scholarly publication.
Other Considerations
In addition to the 2023 DMS Policy, there may be other policies that apply to your work. For example, if you’re conducting large-scale genomic research, your DMS Plan should include elements described in the NIH Genomics Data Sharing (GDS) Policy (e.g., type of genomic assay, number of subjects and timeline for submission to an NIH-designated repository, etc.). NIH’s GDS Policy applies to all NIH-funded research (e.g., grants, contracts, intramural research, regardless of funding level) that generates or reuses genomic data from large-scale human or non-human research.
Why Do I Need a DMS Plan?
If you're seeking NIH funding, you’re required to submit a DMS Plan. When you submit a DMS Plan, you’re making certain you comply with current policy.
But you’re not simply checking off a box in your NIH application. By enabling others to leverage your data quickly and easily, you’re ensuring that everyone in the cancer community can reap the benefit of your work.
Moreover, if you take time now (before your research begins) to think through your data processes, you can facilitate funding decisions and avoid costly delays. You’ll have your data work done!
What Do I Need to Address in the Plan?
NIH identified six key elements to consider in your plan. Those core elements will help ensure that the data you collect, manage, and share have a meaningful impact on future research. For additional guidance, see the “Supplemental Information to the NIH Policy for Data Management and Sharing: Elements of a Data Management and Sharing Plan.”
What Format Works Best?
The format you use can vary. To keep it simple, you might want to consider using a template, such as a table, to capture important details about your data and your approach to managing and sharing those data. Below is an example of some key items to capture in your DMS Plan.
Data Types to be Generated/Collected | Data to be Shared |
Brief Description of Methodology (Including Timelines) | Software/ Codes for Accessing/ Manipulating Data |
Data Standards | Name(s) of Repositories | Estimated Data Sharing Timelines | Oversight of DMS |
---|---|---|---|---|---|---|---|
Genomics/Genetics |
Yes, all data sets |
200 human subjects before and after therapy | GATK – open source |
FASTQ, BAM, VCF (HG38) HTAN model |
SRA |
Submission: L2/L3 data to SRA by June 2025 Release: December 2025 |
Institutional officials will provide annual oversight |
Clinical Data (Routine Care and Clinical Research) |
Yes, all data sets |
200 human subjects before and after therapy | No special software needed to access and use the data | JSON, CaDSR |
dbGaP deidentified (Safe Harbor) |
Submission: June 2025 Release: December 2025 |
Institutional officials will provide annual oversight |
For additional ideas on formatting your DMS Plan, visit the Federal Demonstration Partnership. There, you’ll find pilot templates to help you create your DMS Plan.
What are the Six Core Elements that I Need to Address?
Element 1: Describe Your Data
Describe the types and amounts of data that you expect to generate, manage, preserve, and share. You’ll want to acknowledge any differences in the data you’ll be working with (i.e., data you generate/collect vs. existing data you manage and use in your research) and give a justification for any data you won’t be sharing. Be sure to include:
- the type, format, and amount (or file size) of data you expect to collect.
- For example, how many images (at what resolution and scale) will you collect for your research participants?
- the state of your data and an estimate of the amount of data you’ll share.
- Will it be raw or processed? If processed, what methods will you use and how will the data be prepared? Will you share all levels of data or just processed data?
- any limitations on your data sharing.
- In other words, will you share all or only part of your data? Be sure to include any justification (e.g., legal or ethical limitations) that might prevent you from sharing any of your NIH-funded data. Writing “I don’t think my data will be useful for the broader community” or “My data are too small” isn’t a sufficient rationale.
- the methods you’ll use for preserving your data.
- You also should address any ethical, legal, or technical factors that apply.
- the process you’ll use to assign metadata.
- For instance, What data standards and other documentation (e.g., study protocols and data collection instruments) will you be using to help others discover and reuse your data in future research projects?
Element 2: Describe the Tools, Software, or Code Needed to Access and Manipulate Your Data
Describe any specialized tools, software, and/or code that others will need to access or work with your shared scientific data. Be sure to include:
- the tools others will need for accessing or working with your data and how to access those tools.
- For example, are the tools open source, commercial, or available from the research team?
Element 3: Define the Standards You’ll Be Using With Your Data
Describe the metadata and standards you’ll use to improve your data’s operability. Be sure to include:
- the data standards you’ll apply to your data.
- List any data dictionaries, data identifiers, other documentation. If no data standard consensus exists, make a note of this and describe how you’ll compensate for the lack of consensus (i.e., how you’ll structure and describe your data according to best practices).
Element 4: Explain How You Will Preserve Your Data and When Those Data Will be Made Available (Including Four Sub-Elements)
Describe how and when you’ll be archiving your scientific data and metadata. Be sure to include:
- the names of the repository(ies) where you will preserve and share your data.
- a timeline for when the data will be available to other researchers.
- You should share your results as soon as possible (i.e., by the time you publish your first results or by the end of the performance period, whichever comes first.)
- an estimate for how long you will preserve and share your data.
- Keep in mind your repository’s policies, as well as any applicable expectations of NCI program and journal policy expectations. If you’re sharing some of your data for a longer (or shorter) period, you should flag these differences in timelines.
- the tools you’ll be using.
- For example, will you be using a persistent unique identifier or other standard indexing tool to make data findable and accessible for end users?
- When setting realistic timelines for preserving and sharing your data, be sure to consider tasks that could alter your data sharing timeline (e.g., repository policies, award record retention requirements, journal publication schedules). You’ll need to indicate if any of your data subsets have different timelines. If you’re completing a GDS and DMS Plan, be sure the timelines align and meet policy expectations.
- Make certain your data are FAIR (i.e., findable, accessible, interoperable, and reusable) and include any necessary digital object identifiers, accession numbers, and hyperlinks to data sets).
- You may deposit different types of data from the same participants into several repositories, as long as the secondary users are aware of where to find and access those data sets.
- Your repository(ies) may include Generalist Repositories, unless otherwise specified by the funding opportunity announcements or NCI-specific policy(ies).
- If you plan on depositing data into one of NCI’s Cancer Research Data Commons (CRDC) repositories, such as GDC, PDC, CDS, ICDC, be aware that you’ll need prior approval.
Element 5: Explain How Data Users Will Access and Reuse Your Data
Describe how others can access and reuse your scientific data for each data type. Find examples of justifiable reasons for limiting the sharing of data. Be sure to include:
- a thorough description of how others can access your data.
- Will the data be open access or available with permission? How will you manage that access (e.g., through prior registration with the repository).
- information on how you will protect privacy and confidentiality of human research participants.
- Will you use de-identification, Certificates of Confidentiality, other protective measures?
- a full justification for any limitations on your data sharing.
- Will factors such as state or Tribal law or lack of consent be a concern?
- Include any restrictions imposed by federal, Tribal, or state laws, regulations, or policies.
- Consider any existing or anticipated agreements (e.g., with third-party funders, partners, and/or HIPAA-covered entities). Those agreements may require additional protections on health information that could impact your DMS Plan.
Element 6: Describe the Oversight of Your Data Management and Sharing
Describe how you’ll govern your DMS Plan. Be sure to include:
- the names and titles of staff who will be monitoring your plan.
- the approach and schedule they’ll use for monitoring your plan.
DMS Plan Resources, Tools, and Initiatives
Now that you have a sense of what your plan should contain, use the following resources to find additional information for refining your DMS Plan.
Resources and Tools
- DMS Plan: Learn what NIH expects in a DMS Plan and get additional direction on developing a plan.
- Repositories: Visit our NCI Data Catalog for a list of data collections produced by major NCI initiatives and other widely used data sets. Also, visit the NIH Scientific Data Sharing website for a full list of NIH-supported repositories.
- Training Modules: Watch these training modules on how to enhance data reproducibility.
Tools to Help Standardize Data
- NCI Thesaurus
- NIH Common Data Element Repository
- The National Library of Medicine data glossary
- Health Information Technology and Health Data Standards at NLM
- minimal Common Oncology Data Elements (mCODE)
- The Mondo Disease Ontology
- Digital Imaging and Communications in Medicine (DICOM)
- Binary Alignment Map (BAM)
- Compressed Reference-oriented Alignment Map (CRAM)
- Browser Extensible Data (BED)
- Align to human reference genome 38
- DataCite Metadata Schema
- The Dublin Core™ Metadata Initiative
- Fairsharing.org
- Human Phenotype Ontology
- Observational Medical Outcomes Partnership (OMOP)
- HL7 FHIR® (Fast Healthcare Interoperability Resources: NOT-OD-19-122)
Blogs
- Your Guide to the 2023 NIH Data Management and Sharing Policy: See the differences between the most recent 2023 policy and the older 2003 version.
- Breaking Down Barriers to Sharing Cancer Data—The NIH Generalist Repository Ecosystem Initiative: Discover how NIH is working to make generalist repositories (GRs) part of the data sharing ecosystem. The goal is to minimize sharing barriers while still taking advantage of GR convenience and usability.
- Data Sharing Advocacy—How a Cancer Survivor Seeks to Enhance Data Sharing to Better the Patient Experience: Read this personal testimony from Mr. Steve Friedman—a cancer survivor and NCI employee—who has witnessed firsthand the power of data science and sharing tools.
Projects
- NCI’s Informatics Technology for Cancer Research (ITCR) Program has training courses available via the ITCR Training Network. The course, “Ethical Data Handling for Cancer Research” has tips on data privacy, security, sharing, and ethics.
Publications
- Data Sharing in the Context of Community-Engaged Research Partnerships. Social Science & Medicine, 2023. | See recommendations you can use to better engage a broader community participation in data sharing.
- Ten Simple Rules for Maximizing the Recommendations of the NIH Data Management and Sharing Plan. PLoS Computational Biology, 2022. | Get additional tips on the core elements of a DMS Plan.
- Implementing the FAIR Data Principles in Precision Oncology: Review of Supporting Initiatives. Briefings in Bioinformatics, 2020. | See this systematic review of initiatives that follow FAIR data principles, as well as best practices for supporting data interoperability and reusability.