Extramural Grantees Submitting Genomic Data
Extramural researchers funded by NCI should follow the step-by-step instructions below to submit genomic data to NIH and NCI repositories.
Submission Workflow
1. Prepare Data Sharing Documentation
Work with your Program Officer/Program Director to discuss the project, the Data Sharing Plan (DSP), and the data certification process. The DSP should be consistent with NIH data sharing policies and NIH’s Guidance for Investigators in Developing a Data Sharing Plan. NCI expects Program Officers to collect, review, and approve DSPs prior to funding.
2. Submit Data Sharing Plan
Your DSP should be described in the Resource section of the Funding Application.
3. Submit Institutional Certification
The Institutional Certification is the document institutions use to attest that the plans for submitting large-scale human genomic data to NIH meet the expectations of the Genomic Data Sharing (GDS) Policy. Submit your Institutional Certification with the Principal Investigator and Institutional Signing Officials’ signatures* to the Program Officer or Genomic Program Administrator with Just-in-Time funding material.
For a multi-site project (with samples collected at several institutions), either:
- submit a single-site Institutional Certification from each site contributing samples, or
- submit a multi-site Institutional Certification. NIH understands the submitting institution may not be the local institution or Institutional Review Board of record for all sites. If the submitting institution chooses to submit a multi-site Institutional Certification, the submitting institution agrees to assure NIH that, based on either its own review or assurance from other institutions, the expectations and conditions of the Institutional Certification(s) are met for the entire data set to be deposited.
If disease-specific data uses or data-use modifiers are required, please contact the NCI Office of Data Sharing for clarification.
The Institutional Certification assures that projects planning to submit genomic data to NIH will meet the expectations of the GDS Policy. The certification, provided by the submitting investigator and certified by the Institutional Signing Official (SO), must delineate any “data use limitations (DULs)” on the research use of the data, as agreed to in the informed consent documents signed by study participants and identified by the Institutional Review or Privacy Board reviewing the informed consent.
4. Generate and Clean Data
Clean the data according to accepted GDS practices to ensure your data can be widely shared. For a summary of the guidelines, see Preparing Genomic Data for Sharing.
5. Submit Basic Study Information
After cleaning your data set, submit the Basic Study Information form describing the data set to the:
- Program Officer/Program Director
- Genomic Program Administrator
- NCI Office of Data Sharing
Please contact your Program Officer and Genomic Program Administrator (GPA) for further information about this form.
6. Quality Check Data Submission
Perform appropriate quality assurance/quality control checks on the data and metadata.
7. Complete Full Data and Metadata Submission
Following the invitation sent to the PI assistant/submitter, upload data and metadata files to the Database of Genotypes and Phenotypes (dbGaP) Submission Portal. To obtain a phs accession number, which references your study in dbGaP, you must include the following:
- study configuration file
- subject/sample mapping file
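Before uploading, it can help to confirm locally that both required files are accounted for. The following is a minimal sketch of such a check, not part of any dbGaP tooling: the labels in `REQUIRED_FILES`, the function `missing_required_files`, and the placeholder path are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical labels for the two files named above.
# These are NOT official dbGaP file names or formats.
REQUIRED_FILES = ("study configuration file", "subject/sample mapping file")


def missing_required_files(package: Dict[str, str]) -> List[str]:
    """Return the required file kinds that have no path recorded in `package`.

    `package` maps a file kind (one of the labels above) to a local path.
    A missing key or an empty path counts as missing.
    """
    return [kind for kind in REQUIRED_FILES if not package.get(kind)]


if __name__ == "__main__":
    # Placeholder path for illustration only.
    package = {"study configuration file": "path/to/study_config"}
    for kind in missing_required_files(package):
        print(f"Missing before upload: {kind}")
```

An empty result from `missing_required_files` means both files are listed; it does not validate their contents, which remain subject to the repository's own checks.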
NIH Repositories Data Set Release Expectations
dbGaP/NCI-approved repositories aim to release the data six months from the date of submission of a full, cleaned, and quality-checked data set or at the time of first publication (whichever comes first). For additional questions about submitting data to NCI and NIH Repositories, please contact the NCI Office of Data Sharing.
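The "whichever comes first" rule above can be illustrated with a short calculation. This is a sketch under stated assumptions, not an official NIH or NCI calculation: it reads "six months" as six calendar months (clamping to the last day of shorter months), it assumes any publication date falls on or after the submission date, and the function names `add_months` and `expected_release_date` are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter, the day is clamped to its last day.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def expected_release_date(submitted: date,
                          first_publication: Optional[date] = None) -> date:
    """Earlier of (submission + six months) and the first publication date.

    Assumption: `submitted` is the date the full, cleaned, and
    quality-checked data set was submitted.
    """
    six_months_out = add_months(submitted, 6)
    if first_publication is None:
        return six_months_out
    return min(six_months_out, first_publication)


if __name__ == "__main__":
    print(expected_release_date(date(2024, 1, 15)))
    print(expected_release_date(date(2024, 1, 15), date(2024, 3, 1)))
```

The repositories "aim" to release on this schedule, so the computed date is a planning estimate rather than a guaranteed release date.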
NIH/NCI Genomic Data Repositories Help Resources
If you encounter difficulties submitting data to a particular repository, please contact these help resources directly.
Repository | Help Resources |
---|---|
Database of Genotypes and Phenotypes (dbGaP) | |
Genomic Data Commons (GDC) | |
Sequence Read Archive (SRA) | |
Other National Center for Biotechnology Information (NCBI) Data Repositories | |
For additional questions about data sharing, please contact the NCI Office of Data Sharing.