Publishing

Guide to best practices for publishing and resources for publishing different types of research outputs.

Data Management

Data management is the process of validating, organizing, protecting, maintaining, and processing scientific data to ensure the accessibility, reliability, and quality of the data for its users.

Proper data management helps maintain scientific rigor and research integrity. Keeping good track of data and associated documentation lets researchers and collaborators use data consistently and accurately. Carefully storing and documenting data also allows more people to use the data in the future, potentially leading to more discoveries beyond the initial research.

- Information from the NIH website

If you are looking for repositories, please check the "Where should I publish? Datasets" section.

If you are looking for NIH-specific information, please scroll down to our NIH section.

FAIR Guidelines

We encourage researchers to manage their data while keeping in mind FAIR principles. FAIR stands for:

  • Findability
  • Accessibility
  • Interoperability
  • Reuse

Examples of how to make data FAIR include, but are not limited to: having DOIs, using Creative Commons or other clear and accessible data usage licenses, and using standardized language and organization. Following these principles helps ensure that data will not disappear or be forgotten. You can learn more about how to apply FAIR at the GO FAIR website.
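As a rough illustration of the examples above, the sketch below describes a dataset with a persistent identifier, a license, and standardized keywords, then checks that none of those fields are empty. The field names, the `missing_fields` helper, and every value (including the DOI) are placeholders made up for this example; they are not taken from the GO FAIR website or from any particular metadata standard.

```python
import json

# Hypothetical dataset description. Every value is a placeholder.
record = {
    "title": "Example survey dataset",
    "identifier": "https://doi.org/10.1234/example",  # placeholder DOI, not a real one
    "license": "CC-BY-4.0",
    "keywords": ["survey", "example"],
    "format": "text/csv",
}

# Fields this sketch assumes a reusable dataset description should carry.
REQUIRED_FIELDS = ("title", "identifier", "license", "keywords", "format")


def missing_fields(rec):
    """Return the required fields that are absent or empty in rec."""
    return [field for field in REQUIRED_FIELDS if not rec.get(field)]


print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
```

A repository will usually define its own required metadata, so a check like this is best treated as a personal reminder list rather than a test of FAIR compliance.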

NIH Guidelines

As of January 25, 2023, the NIH expects applicants to submit a plan for how they will manage and share their data, and allows applicants to include certain costs associated with data management and sharing in their budget. This expectation applies to all NIH-supported research regardless of funding level, including extramural grants, extramural contracts, intramural research projects, and other funding agreements. For this purpose, scientific data means any data needed to validate and replicate research findings. It does not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects such as laboratory specimens.

Your data plan should include the following:

  1. Data type - Summarize the types and estimated amount of scientific data expected to be generated in the project. Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions. Briefly list the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.
  2. Related tools, software, and/or code - State whether specialized tools, software, and/or code are needed to access or manipulate shared scientific data, and if so, provide the name(s) of the needed tool(s) and software and specify how they can be accessed.
  3. Standards - State what common data standards, if any, will be applied to the scientific data and associated metadata to enable interoperability of datasets and resources, and provide the name(s) of the data standards that will be applied and describe how these data standards will be applied to the scientific data generated by the research proposed in this project. If applicable, indicate that no consensus standards exist.
  4. Data preservation, access, and timelines - Provide the name of the repository(ies) where scientific data and metadata arising from the project will be archived. Describe how the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools. Describe when the scientific data will be made available to other users (i.e., no later than time of an associated publication or end of the performance period, whichever comes first) and for how long data will be available.
  5. Access, distribution, or reuse considerations - NIH expects that in drafting Plans, researchers maximize the appropriate sharing of scientific data. Describe and justify any applicable factors or data use limitations affecting subsequent access, distribution, or reuse of scientific data related to informed consent, privacy and confidentiality protections, and any other considerations that may limit the extent of data sharing. State whether access to the scientific data will be controlled (i.e., made available by a data repository only after approval). If generating scientific data derived from humans, describe how the privacy, rights, and confidentiality of human research participants will be protected (e.g., through de-identification, Certificates of Confidentiality, and other protective measures).
  6. Oversight of data management and sharing - Describe how compliance with this Plan will be monitored and managed, frequency of oversight, and by whom at your institution (e.g., titles, roles).
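One way to keep track of the six elements while drafting is a simple checklist. The sketch below is only an illustration: the element names come from the list above, but the `unaddressed_elements` helper and the draft text are hypothetical, and the NIH does not require or provide any such script.

```python
# The six plan elements, named as in the list above.
DMS_PLAN_ELEMENTS = [
    "Data type",
    "Related tools, software, and/or code",
    "Standards",
    "Data preservation, access, and timelines",
    "Access, distribution, or reuse considerations",
    "Oversight of data management and sharing",
]


def unaddressed_elements(draft):
    """Return the elements whose text is missing or blank in the draft."""
    return [name for name in DMS_PLAN_ELEMENTS if not draft.get(name, "").strip()]


# Hypothetical partial draft; the wording is placeholder text.
draft_plan = {
    "Data type": "Roughly 2 GB of de-identified survey responses in CSV format.",
    "Standards": "No consensus standards exist for this data type.",
}

for name in unaddressed_elements(draft_plan):
    print("Still to write:", name)
```

This only confirms that each element has some text. Whether that text is adequate is still a judgment for the researcher and the reviewers.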

For further information, you can check the NIH website or the OSF Data Management checklist.

The NIH has an official example of a data management plan, and the OSF Data Management working group has a searchable database of example data management plans. The NIH has also provided a list of its preferred repositories.

PLOS Computational Biology has also published a paper titled "Ten simple rules for maximizing the recommendations of the NIH data management and sharing plan."

For more information, please contact CRIO or email the NIH directly at sharing@nih.gov.