Overview
The National Institutes of Health (NIH) implemented a new data management and sharing policy (DMSP) on January 25, 2023. High-level aspects of the policy are below:
- Requires submission of a Data Management & Sharing (DMS) Plan with the grant application, and compliance with that Plan as it was approved by NIH program staff (plans are reviewed, not scored).
- Applies to new applications and renewals submitted on or after January 25, 2023 (does not apply to applications submitted before that date).
- Applies to all research funded in whole or in part by NIH that generates scientific data, including clinical data.
- Specific offices and institutes (e.g., National Institute of Mental Health) may have additional DMS Plan requirements.
- DMS Plans should be no more than two pages long.
- DMS Plans can be updated throughout the lifecycle of the grant.
Resources
- NIH Data Management & Sharing Policy Overview
- NIH DMSP FAQs
- Quick videos and workshops from UMN Research Data Services
  - Overview of the policy (5:08)
  - Where to share your data (3:29)
  - Sharing human participant data (3:47)
  - Data management and sharing policy workshop (54:10)
Get help
- UMN Research Data Services, a partnership between the Libraries and CLA's LATIS, offers education, training, and consultation on data management and data sharing across all disciplines. See contact information in the left pane.
- For DMS Plan review, please send a draft as a Google Doc to data@umn.edu, and we will review it within three business days. We may also ask to meet with you to discuss the initial review, and will need two to three days to schedule that meeting.
Find a repository
Steps for finding a data repository
- Check your Notice of Funding Opportunity or with your NIH Institute, Center, or Office (ICO) to determine whether either encourages the deposit of data into a specific data repository.
- Explore the list of NIH-Supported Data Sharing Resources to see whether a repository listed under your ICO, subject area, or model system is a good fit for your data.
- Share in a general data repository or DRUM.
UMN data repository memberships
- Inter-university Consortium for Political and Social Research (ICPSR) is a repository for social science research. It offers full curation, member-only, and restricted access to data. Learn more about depositing in ICPSR.
- Dryad is a repository for the sciences, but also takes general data. It offers workflows for peer review of data and open-access data sharing.
Data Repository for the University of Minnesota (DRUM)
- DRUM is a free public-access repository with no associated curation or deposit fees.
- DRUM meets many of the NIH and OSTP recommended features of data repositories.
- However, DRUM is not suitable for all kinds of data. See limitations below:
- Human Participant Data Policy: data should be non-sensitive, de-identified, and have clear participant consent for open-access sharing.
- Data Collection Policy: data should be owned by the depositor or depositors must have clear rights to share, and documentation should be sufficient to understand the data.
- DRUM is not a good fit for large datasets (individual files over 5 GB or submissions over 50 GB in total), which are difficult both to submit and to access through DRUM.
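The size limits above can be checked before starting a deposit. The following is a minimal sketch, not an official DRUM tool: the function name `check_drum_size` is hypothetical, and treating a gigabyte as 10^9 bytes is an assumption, since the guidance does not say whether decimal or binary units are meant.

```python
import os

# Limits quoted in the guidance above. Decimal gigabytes are an assumption.
MAX_FILE_BYTES = 5 * 10**9
MAX_TOTAL_BYTES = 50 * 10**9


def check_drum_size(root, max_file=MAX_FILE_BYTES, max_total=MAX_TOTAL_BYTES):
    """Return (total_bytes, oversized_files, fits) for the directory tree at root.

    oversized_files is a list of (path, size_in_bytes) for files above max_file.
    fits is True only if no file is oversized and the total is within max_total.
    """
    total = 0
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            total += size
            if size > max_file:
                oversized.append((path, size))
    fits = not oversized and total <= max_total
    return total, oversized, fits
```

Calling `check_drum_size("path/to/dataset")` on a local copy of the submission gives a quick indication of whether DRUM is likely to be practical or whether another repository should be considered.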
Download a detailed handout from Research Data Services: Data sharing repository selection tool
Repository comparison
Features | DRUM | ICPSR | OpenICPSR | Dryad | ||
Public-access sharing | yes | no | yes | yes | yes | yes |
Offers controlled/restricted access | no | yes | yes | yes | no | yes |
Who controls access requests | n/a | repository | repository | depositor | n/a | depositor |
Allows custom terms of use | no | no | no | yes | no | no |
Fee for data deposit | no | no for UMN | no | no | no for UMN | no |
Fee for data access | no | for non-members | no | no | no | no |
Allows blind peer review | no | no | no | no | yes | yes |
Generalist Repository Comparison Chart maintained by NIH
Data Repository Finder maintained by NNLM
Budget for data sharing
NEW: Starting October 5, 2023, NIH allows DMS costs to be budgeted in whatever budget category is appropriate to the actual cost (e.g., salaries, fringe benefits, other direct costs). This will expedite award setup. See NIH Application Instruction Updates - Data Management and Sharing (DMS) Costs for details.
NIH DMSP Budgeting Resource prepared by UMN Research Data Services and Sponsored Projects Administration.
Making Research Data Publicly Accessible: Estimates of Institutional & Researcher Expenses, from the Association of Research Libraries' Realities of Academic Data Sharing Initiative, reports a survey of NIH-, NSF-, and DOE-funded researchers on the cost of data management and sharing (DMS). It found:
- Overall, 6% of the total grant award was spent on DMS; the share varied with award size:
  - 15% for smaller awards (<$200,000)
  - 1% for larger awards (>$1 million)
- $30,000 was the average DMS cost incurred per research project
  - Staff time was the largest portion of this expense
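As a rough starting point for a budget, the survey percentages above can be applied to a planned award amount. This is only an illustrative sketch: the function name `estimate_dms_cost` is hypothetical, and using the overall 6% figure for awards between $200,000 and $1 million is an assumption, because the figures quoted here give no separate rate for that range. Actual costs depend on the project and should be itemized.

```python
def estimate_dms_cost(total_award):
    """Return (percent, estimated_cost) using the survey percentages quoted above."""
    if total_award < 200_000:
        percent = 15  # smaller awards
    elif total_award > 1_000_000:
        percent = 1   # larger awards
    else:
        percent = 6   # assumption: overall survey figure applied to mid-sized awards
    return percent, total_award * percent / 100


for award in (150_000, 500_000, 2_000_000):
    percent, cost = estimate_dms_cost(award)
    print(f"${award:,} award: about ${cost:,.0f} for DMS ({percent}%)")
```

The step changes at the thresholds are an artifact of the coarse survey brackets, so estimates for awards near $200,000 or $1 million deserve extra scrutiny.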
Forecasting Costs for Preserving, Archiving, and Promoting Access to Biomedical Data from National Academies of Sciences, Engineering, and Medicine.
COGR Review of the Final NIH Policy for Data Management and Sharing: Budgeting and Costing from Council on Governmental Relations' NIH Data Management and Sharing Readiness Guide.
Other popular resources include National Data Archive’s cost estimation tool and Harvard’s tip sheet.
Human participant considerations
De-identification
When collecting data from and with human participants and communities, Element 5.C. of the DMS Plan requires that you describe how participants will be protected, including de-identification of the data. Be explicit about the process you will use to address direct and indirect identifiers.
- Direct identifiers should be completely removed from data. This includes the 18 identifiers described in the HIPAA Safe Harbor Method and any other information that directly ties to an individual.
- Indirect identifiers require close examination: look for variables that, when combined with other variables, datasets, or publicly available information, could re-identify participants.
- The process of data curation usually involves some level of inspection for direct and indirect identifiers. Not all data repositories have data curators, and if they do, curation may not be a free service.
- Note that a de-identified dataset is not anonymized. When crafting DMS Plans, IRB applications, and participant agreements, language such as "de-identified" or "confidential" is preferred over "anonymous" or "anonymized".
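The two kinds of identifiers above call for different handling, which a short sketch can illustrate. Everything here is an assumption for illustration: the column names, the helper names `drop_direct_identifiers` and `risky_combinations`, and the threshold `k=5`. It is not a substitute for the HIPAA Safe Harbor list or for review by a data curator.

```python
from collections import Counter

# Hypothetical column names; adapt to the dataset and to the HIPAA Safe Harbor list.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "ssn", "street_address"}


def drop_direct_identifiers(rows, direct=DIRECT_IDENTIFIERS):
    """Return copies of the rows with direct-identifier columns removed."""
    return [{k: v for k, v in row.items() if k not in direct} for row in rows]


def risky_combinations(rows, quasi_identifiers, k=5):
    """Return value combinations of the indirect identifiers shared by fewer than k rows.

    Small groups are candidates for re-identification and may need to be
    generalized (e.g., wider age bands) or suppressed. k=5 is an assumed threshold.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


rows = [
    {"name": "A", "email": "a@example.org", "age_group": "30-39", "zip3": "554", "score": 12},
    {"name": "B", "email": "b@example.org", "age_group": "30-39", "zip3": "554", "score": 15},
    {"name": "C", "email": "c@example.org", "age_group": "80+", "zip3": "553", "score": 9},
]
clean = drop_direct_identifiers(rows)
# With k=2, only the single participant in the ("80+", "553") group is flagged.
print(risky_combinations(clean, ["age_group", "zip3"], k=2))
```

A check like this only surfaces combinations within one dataset; it cannot account for linkage with other datasets or public information, which still requires human judgment.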
Resources
- Guidance regarding methods for de-identification of protected health information in accordance with the HIPAA Privacy Rule
- Johns Hopkins' 5 steps for removing identifiers from datasets
- Consortium of European Social Science Data Archives’ anonymization of quantitative & qualitative data
- Finnish Social Science Data Archive anonymisation template
- Human participant data essentials primer by the Data Curation Network
- Qualitative data primer by the Data Curation Network
Informed consent language
When collecting data from and with human participants and communities, Element 5.A. of the DMS Plan requires that you describe how informed consent will be obtained for data sharing, and if there will be any access restrictions to the data related to consent. Be explicit in the language you will use in the consent form given that the Certificate of Confidentiality (issued to all NIH awardees) requires explicit consent for data sharing. Include:
- How the data will be processed before it is shared, including de-identification methods
- What data will be shared (and what data will not be shared)
- Where the data will be shared (name the specific repository)
- How the data will be accessed (publicly available or restricted to specific requesters)
- Who will grant access (the repository or the PI)
Patent and intellectual property considerations
Sharing data in a repository may qualify as a public disclosure under patent law. This can jeopardize UMN’s ability to obtain patent protection worldwide. NIH addresses the potential need to withhold data for the purpose of securing patents in frequently asked questions. Primary points are below:
- “...evaluating an invention for patent protection or filing a patent application may justify a need to delay disclosure of research findings, as well as any scientific data underlying them.”
- “A delay of up to 60 days beyond DMS Policy data sharing timelines is generally viewed as a reasonable period for these purposes.”
- “Scientific data that are not the subject of a patent application…should be shared within expected timelines.”
If you expect a potentially patentable invention to result from your project, we recommend building additional time into your DMS Plan. Consider including language such as “In the event that patentable intellectual property results from this project, the University of Minnesota may require additional time to protect the intellectual property before sharing the relevant data, in accordance with the Bayh-Dole Act.” If your originally submitted DMS Plan did not state that data would be withheld for a period of time for patent planning, you may need to revise the Plan. Contact your NIH Program Officer to amend it.
Additionally, as soon as you realize that your research may result in a patent, contact the UMN Technology Commercialization (Tech Comm) office. Tech Comm needs 60 days to process patent applications.
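When planning around the 60-day delay quoted from the NIH FAQ above, it can help to compute the resulting date explicitly. The sketch below is illustrative only: `latest_sharing_date` is a hypothetical helper, the example date is made up, and the 60-day figure is described by NIH as "generally viewed as" reasonable rather than as a guaranteed allowance.

```python
from datetime import date, timedelta

# Delay that NIH describes as generally reasonable for patent-related purposes.
PATENT_DELAY_DAYS = 60


def latest_sharing_date(planned_sharing_date, delay_days=PATENT_DELAY_DAYS):
    """Return the planned sharing date pushed back by the patent-related delay."""
    return planned_sharing_date + timedelta(days=delay_days)


# Hypothetical example: data originally due to be shared on 15 January 2025.
print(latest_sharing_date(date(2025, 1, 15)))  # 2025-03-16
```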
Contract considerations
DMS plans may be impacted by contracts with external groups, such as data use agreements (DUA). A data use agreement may impact data ownership and whether or not the data can be shared with others (overriding the federal mandate for data sharing). Sponsored Projects Administration is the office that coordinates these contracts. Below are data use agreement templates for consideration:
Templates and examples
NIH DMS Plan template in DMPTool
- Log in to DMPTool using your UMN credentials to see DRUM-specific language and submit your DMS Plan to Research Data Services for review.
NIH institutes, centers, and offices may provide more specific guidance
- Look at the website of your specific funder to check for additional requirements. Below are a few examples.
Data Management Plan Database (browse and/or submit your own plan) by McMaster University
UMN examples (forthcoming)
DRUM boilerplate language
- Review DRUM policies to ensure your data can be shared in the institutional repository.
- This language may be copy/pasted into Element 4.A. of the DMS Plan and/or you can split the text below into Elements 4.A., 4.B., 4.C., and 5.B.
"The data will be shared via the Data Repository for the University of Minnesota (DRUM), an open access, publicly-accessible, institutional repository. DRUM has been certified since 2017 by CoreTrustSeal, an international community-based organization that recognizes sustainable and trustworthy repositories. Curators review submissions and work with data authors to comply with data sharing requirements in ways that make data findable, accessible, interoperable, and reusable (FAIR) - including, but not limited to, file transformation and metadata augmentation (Dublin Core is the metadata standard). DRUM commits to 10 years of long-term preservation using services such as file migration (limited format types), off-site backup, bit-level checksums, and Digital Object Identifiers (DOI) for archival citations. The DOI exposes data to online discovery tools like Google Scholar and Web of Science Data Citation Index."
- DRUM will accept data that meet its human participant data and collection policies. To request a specific letter of support for your grant, please send a request and any required template language to data@umn.edu.
Open Science Framework boilerplate language
Note: Be sure to use the template without hyperlinks for your DMS Plan, as links are not allowed.
General DMP Checklist
- Note: this is a general data management plan checklist useful for NSF, DOE, and other national agencies. It is not specific to NIH's DMS Plan requirements.
DMS plan reporting requirements
The National Institutes of Health (NIH) announced that grantees subject to its Data Management and Sharing Policy will now have to answer the following questions when submitting a progress report on awards:
- Whether data has been generated or shared to date
- Which repositories any data were shared in, and under what unique digital identifiers
- If data have not been generated and/or shared per the award’s DMS Plan, why, and what corrective actions have been or will be taken to comply with the plan
- If significant changes to the DMS Plan are anticipated in the coming year, recipients will be asked to explain them and provide a revised DMS Plan for approval
The new reporting requirements apply to progress reports submitted on or after October 1, 2024.
Frequently asked questions
Click on a topic below to explore specific sections of Research Data Services' FAQ resource.
UMN data partners
Below is a list of our UMN data partners with whom we work closely in order to streamline data services on campus. You may find useful grant information on their websites.
- GEMS Platform. GEMS is a secure web-based informatics service for storing, cleaning, exploring, sharing, and analyzing agro-food data.
- Institutional Review Board. The IRB provides evidence of IRB approval at submission of grant award (check notice to see if this is required), institutional certification for genomic data sharing, and templates and instructions for writing about data sharing in IRB documents.
- Liberal Arts Technology and Innovation Services. LATIS Research is a team of methodological and technical experts who can provide support, technology, and infrastructure for research. Learn more about the ways they can support your grant and the services they provide.
- Masonic Cancer Center research development team. MCC’s research development team partners with faculty, trainees, and administrators to develop and submit competitive grant proposals by ensuring applicants meet sponsor guidelines and by improving the content, organization and visual appeal of proposal packages.
- Minnesota Supercomputing Institute. MSI and UMN Research Computing can guide grant seekers towards high-performance computing options for primary research storage of large amounts of data (typically up to 20TB) that has been processed on our systems, and can consult with researchers on long-term archiving options and plans.
- Research Cyberinfrastructure Champions. RCC is a cross-departmental network that helps to improve alignment and collaboration between research cyberinfrastructure service providers and the user community. Connect with your RCC champion for help navigating storage, compute, and other technology needs for your grant applications.
- Sponsored Projects Administration. SPA assists researchers with material transfer agreements (MTAs), data use agreements (DUAs), and research collaboration agreements (RCAs) for both sponsored and unfunded projects. Reach out to them at awards@umn.edu.
- Technology Commercialization. Tech Comm facilitates the protection and transfer of UMN innovation by helping to license research, patent innovative ideas, establish startups, and facilitate commercialization. Connect with Tech Comm to discuss how data sharing may impact intellectual property and commercialization.
- Technology Help. OIT can help you find the right technology to store, manage, and work with your data and files for smooth implementation of your DMS Plan.