PolyU HK
PolyU Library

Research Data Management: Major Components in DMP

Good Practices in Research Data Management

What Constitutes a DMP?

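Generally speaking, a DMP consists of eight components: (1) Administrative Information, (2) Data Collection, (3) Documentation & Metadata, (4) Ethics & Legal Compliance, (5) Storage, Backup & Security, (6) Selection & Preservation, (7) Data Sharing, and (8) Responsibilities & Resources.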

Different funders and publishers may have different DMP requirements. If your funder or publisher requires a DMP, make sure you know their requirements and comply with them. The DMP tools also provide step-by-step instructions for building a DMP that meets the requirements of major US and European funders.

1. Administrative Information

Lists out important administrative details of the research project and existing policies related to data management.

Project name

 Project ID or reference number provided by the funder and/or institution.

 Funding body/bodies, if any.

 Principal Investigator(s) - Name, ORCID, email and contact number.

 Date of First Version of the DMP & Last Updated Date

Project description

Helps others understand your project and purpose for data creation or collection.

  • What is the nature of your research project?
  • What research questions are you addressing?
  • For what purposes are the data being collected or created? 

 Related Policies

Lists out any other relevant requirements on data management, sharing and security, e.g. funder requirements or existing data policies.

You should include a brief summary, or at least acknowledge the requirement and provide a reference to the policy, e.g. a hyperlink, or include the full version in an annex to this plan.

  • Example of Related Policies:
  •  
  • "The research team will comply with the following ordinances and policies related to data management, sharing and security issues.

Personal Data (Privacy) Ordinance (Cap. 486)
https://www.elegislation.gov.hk/hk/cap486@2014-12-05T00:00:00

Copyright Ordinance (Cap. 528)
https://www.elegislation.gov.hk/hk/cap528@2016-05-27T00:00:00

Policies related to data collection, retention and access in Section 6.1.3.1 of Research Handbook prepared by Research Office of the University

Code of Ethics for Research Involving Human Subjects in Section 6.4 of Research Handbook prepared by Research Office of the University

Policy on Ownership of Intellectual Property (PIP) issued by Innovation and Technology Development Office of the University"

2. Data Collection

Describes what kind of data will be generated from your project and how the data will be collected or created.

 Data Description

Articulates and justifies your choice of data format. You should list the type of file you intend to use, along with a reference to the software you will or might use to create it and a one-sentence reason why you intend to use that file format. Consider the implications in terms of storage, backup and access. If existing datasets can be reanalyzed, you are encouraged to mention them and briefly describe how you intend to use them.

  • What will be the anticipated data type, format & volume?
  • Do your chosen formats and software allow sharing and long-term access to the data?
    (e.g. non-proprietary software and software based on open standard)
  • Are there any existing data that you can reuse?
  • Example of Data Description:
  •  
  • "The data to be generated are mainly (1) survey data, (2) audio recordings and (3) interview transcripts. Survey data will be stored in SPSS format for analysis. Audio recordings will be created in MP3 format. Interview transcripts will be stored in MS Word format and then imported into NVivo for analysis. After the analysis stage, all data will be converted into .wav, .csv, .txt and .pdf formats for sharing and long-term access. The overall data size is not expected to exceed 5 GB.
  •  
  • We have examined the datasets from similar studies produced by Smith and Nieminen (2013). Some of their data might be suitable for comparing the research findings of this project."

 Data collection or creation 

Explains how you can ensure the data collected are valid, reliable and accurate.

  • How will you collect or acquire data?
  • How will you organize your data?
  • Will you use any file naming schemes?
  • How will you handle versioning?
  • What quality assurance processes will you adopt, e.g. standardized and consistent procedures to collect, process, transcribe, check, validate and verify data via standard protocols, templates or input forms?
  • Examples of Data Collection:
  •  
  • "All critical questions in the online survey will be set as required questions to avoid missing responses. Interviews will be audio-recorded to ensure accuracy in the transcription process. Transcripts will be reviewed by the relevant interviewee to avoid misinterpretation.
  •  
  • Files will be stored under a meaningful folder structure. They will be named consistently, using the project ID, the creation date in YYYYMMDD format, a short but adequate description, and a version number."
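The naming scheme in the example above can be applied consistently with a small helper. The following is only a sketch: the function name `build_file_name`, the underscore separators, the two-digit version suffix and the sample project ID `P0001` are illustrative assumptions, not a PolyU or funder requirement.

```python
import re
from datetime import date


def build_file_name(project_id, created, description, version, extension):
    """Return a name such as P0001_20240315_interview-transcript_v02.txt.

    The layout (ID_date_description_version) is an assumed convention.
    """
    # Lower-case the description and turn every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    if version < 1:
        raise ValueError("version must be 1 or greater")
    # %Y%m%d gives the YYYYMMDD date format mentioned in the example.
    return f"{project_id}_{created:%Y%m%d}_{slug}_v{version:02d}.{extension.lstrip('.')}"


# Hypothetical project ID and date, for illustration only.
print(build_file_name("P0001", date(2024, 3, 15), "Interview transcript", 2, ".txt"))
```

Putting the date in YYYYMMDD form directly after the project ID means that an ordinary alphabetical sort in a file browser also sorts the files chronologically.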

3. Documentation & Metadata

Specifies how your data can be understood by researchers, including yourself, in future.

 Documentation and metadata that will accompany the data

Identifies and uses existing community description standards wherever possible. Generally research datasets are not understandable without additional explanatory documentation.

  • What information is needed for the data to be read and interpreted in the future?
  • How will you capture or create the documentation and metadata?
  • What tools do you need for documentation?
  • Will you use supporting documentation such as protocols, data dictionaries, etc.?
  • What metadata standards will you use if any, and why?
  • Example of Documentation and Metadata:
  •  
  • "A README.txt file will briefly describe the research and the methodology used, and explain how all project files are organized, named and stored.
  •  
  • Variables in SPSS will be independently understandable through meaningful variable names and entries in the built-in codebook. There will also be a separate document for the questionnaires and data analysis procedures.
  •  
  • Dublin Core, a widely adopted metadata standard, will be used for this project. All metadata will be created as soon as data are collected or created."
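To make the Dublin Core example concrete, here is a minimal sketch of a metadata record built with Python's standard library. The namespace URI is the standard one for the Dublin Core element set; the `<metadata>` wrapper element, the helper name `dublin_core_record` and every field value are placeholders chosen for illustration. A repository may expect a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <metadata> element with one dc:* child per (name, value) pair.

    The <metadata> wrapper is an assumption made for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# All values below are placeholders, not real project details.
record = dublin_core_record([
    ("title", "Survey and interview data (placeholder title)"),
    ("creator", "Principal Investigator (placeholder)"),
    ("date", "2024-03-15"),
    ("format", "text/csv"),
    ("description", "Anonymized survey responses; see README.txt."),
])
print(ET.tostring(record, encoding="unicode"))
```

Creating the record with a script at the moment a file is created is one way to meet the commitment that metadata are captured as soon as data are collected.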

4. Ethics & Legal Compliance

Deals with the ethics and legal issues arising from the data generated from the research project.

 Ethical Issues

Demonstrates that you are aware of the following issues and have planned accordingly.

  • Have you gained consent for data preservation and sharing?
  • How will you protect the identity of participants if required, e.g. via anonymization?
  • How will sensitive data be handled to ensure it is stored and transferred securely?
  • Example of Ethical Issues:
  •  
  • "There will be a declaration at the beginning of the online survey that the data collected may be shared with other researchers after anonymization. Participants are free to withdraw if they do not agree with this. 
  •  
  • Proper informed consent will be obtained from the participants before each interview. All data with personally identifiable information will be anonymized before the analysis stage.
  •  
  • A non-sharable version of research data with personally identifiable information will be managed by the principal investigator and kept in PolyU Home Drive, a secure online storage provided by the University, for 5 years."
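One common first step towards the anonymization described above is to drop direct identifiers and replace each participant's identifier with a stable code. The sketch below does this with a keyed hash (HMAC-SHA256). Note that this is pseudonymization rather than full anonymization: whoever holds the key can regenerate the codes, and indirect identifiers (e.g. rare job titles) may still need separate treatment. The function names, field names and sample row are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Map an identifier to a stable code that cannot be reversed without the key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    # Twelve hex characters keep the code short; this length is an arbitrary choice.
    return "P-" + digest[:12]


def strip_identifiers(rows, id_field, drop_fields, secret_key):
    """Return copies of rows without direct identifiers, keyed by a participant code."""
    cleaned = []
    for row in rows:
        new_row = {
            key: value
            for key, value in row.items()
            if key not in drop_fields and key != id_field
        }
        new_row["participant"] = pseudonymize(row[id_field], secret_key)
        cleaned.append(new_row)
    return cleaned


# Hypothetical survey row; the key would be stored with the non-sharable data.
sample = [{"name": "Participant A", "email": "a@example.org", "q1": "4"}]
print(strip_identifiers(sample, "email", {"name"}, b"replace-with-a-long-random-key"))
```

Because the same identifier always yields the same code under a given key, survey responses and interview transcripts from one participant can still be linked in the shareable dataset.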

 Copyright and Intellectual Property Right Issues

Specifies the copyright and intellectual property rights issues that apply to your data.

  • Who owns the data?
  • How will the data be licensed for reuse?
  • Are there any restrictions on the reuse of third-party data? If yes, how will you negotiate a new license with the original supplier?
  • Is any delay to data sharing expected, e.g. pending a patent application or an embargo imposed by a journal publisher?
  • Example of Copyright and Intellectual Property Right Issues:
  •  
  • "The Hong Kong Polytechnic University will hold the intellectual property rights for the research data generated in this project, but the data will be shared via a third party open data repository - figshare."

5. Storage, Backup & Security

Explains how you will maintain and secure the valuable data collected from the research project.

 Storage and Backup

Provisions for storage should be included for systematic backups of the data.

  • Do you have sufficient storage or will you need to include charges for additional services?
  • How will the data be backed up?
  • Are your digital and non-digital data, and any copies, held in multiple safe and secure locations?
  • Who will be responsible for backup and recovery?
  • How will the data be recovered in the event of an incident?
  • Example of Storage and Backup:
  •  
  • "Data will be stored in the shared network drive which is managed by my department with password protection.
  •  
  • A second copy of data will be stored in PolyU Home Drive, a reliable and secure file storage service provided by the University. All files will be automatically backed up on a daily basis and retained for 7 days to protect against the accidental change or loss of the data.
  •  
  • A third copy of the data will be stored on a password-encrypted external hard drive after key stages. The hard drive will be kept in a locked cabinet in the office of the principal investigator.
  •  
  • A trained member of the research team, Ms. Li, will check the backups regularly to ensure they are usable in case of need."
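Checking that backups are usable, as in the example above, can be partly automated by comparing checksums of the working copy and the backup copy. The sketch below is one possible approach using SHA-256; the function names and the idea of comparing two directory trees are assumptions for illustration, not a description of how PolyU Home Drive or any departmental drive works.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths of source files that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file() or sha256_of(source_file) != sha256_of(backup_file):
            problems.append(relative.as_posix())
    return problems
```

An empty list from `verify_backup` means every file in the working copy has an identical counterpart in the backup. It does not detect extra files that exist only in the backup, and it does not replace an occasional test restore.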

 Security

Describes the security measures to be taken to protect the data.

  • What are the risks to data security and how will these be managed?
  • How will you control access to keep the data secure?
  • How will you ensure that collaborators can access your data securely?
  • How will you train others on security protocols?
  • If data are collected with mobile devices, how will you ensure they are safely transferred into your main secured systems?
  • Example of Security:
  •  
  • "Only members of the research team can access the research data saved in the shared network drive. No data will be taken off-site, except for the audio recordings of the interviews. These will be recorded on a password-protected iPad provided by the department. The audio files will then be transferred from the iPad to PolyU Home Drive immediately after the interview. All information stored on the iPad will be cleared on a daily basis by department staff."

6. Selection & Preservation

Lists out how you will select the data for long-term preservation and where you will preserve them.

 Selection

Decides which data to keep and for how long.

  • What data must be retained or destroyed for contractual, legal, or regulatory purposes?
  • How will you decide what data to keep?
  • How long will the data be retained and preserved?
  • Example of Selection:
  •  
  • "All anonymized data, metadata, and relevant documentation will be stored in figshare, an open data repository.
  •  
  • Data with sensitive personal information will be kept in PolyU Home Drive for 5 years. After that it will be destroyed through secure disposal service provided by the University."

 Preservation

Presents how data that have long-term value will be preserved beyond the research project.

  • Where will you save the data for long-term access purpose, e.g. in any repository or archive?
  • Will you use any third party repositories to preserve your data?
  • What costs (if any) will your selected data repository or archive charge?
  • Have you budgeted the time and effort for preparing the data for preservation and sharing?
  • Example of Preservation:
  •  
  • "At the completion of the project, the principal investigator will deposit the data and all documentation files in figshare to ensure long-term preservation and access. All data will be converted into open file formats, e.g. .csv, .txt, .pdf and .wav. Both the original and converted files will be deposited."

7. Data Sharing

Explains how you will share your data and specifies if there are foreseeable restrictions on sharing.

 Sharing

Identifies the mechanism you will use to share your data.

  • Who is the audience for your data?
  • How will potential users find your data?
  • With whom will you share the data, and under what conditions?
  • Will you share data via a repository, handle requests directly or use another mechanism?
  • When will you make the data available?
  • Will you pursue getting a persistent identifier for your data?
  • Example of Sharing:
  •  
  • "The research data can be reused by researchers working in [your research topic] area to generate new hypotheses to explain current findings. In addition, we also expect the data to be used by practitioners and policymakers.
  •  
  • The data will be deposited in figshare at the completion of the project. All items deposited in figshare will be assigned a DataCite DOI as a persistent identifier. We will indicate in the publications generated from this project that interested researchers can access the data via figshare or via the DOI provided."

 Restrictions

States whether there is any restriction or embargo period for political, commercial or patent reasons.

  • Are there any requirements for the data sharing system you chose?
  • What action will you take to overcome or minimize restrictions?
  • For how long do you need exclusive use of the data and why?  
  • Will a data sharing agreement (or equivalent) be required?
  • Example of Restrictions:
  •  
  • "Unless the journal publisher imposes an embargo, the data will be made available for reuse as soon as the manuscript created from this project has been accepted by the publisher."

8. Responsibilities & Resources

Describes different roles and responsibilities in the data management activities, as well as the financial resources required.

 Roles and Responsibilities

Considers who will be responsible for ensuring smooth implementation of the DMP.

  • Who is responsible for ensuring the DMP is enforced, reviewed and revised?
  • Who will be responsible for each data management activity?
  • How will the responsibilities be split across partner sites in collaborative research projects?
  • Will the data ownership and responsibilities for research data management be part of any consortium agreement or contract agreed between partners?
  • Example of Roles and Responsibilities:
  •  
  • "The principal investigator will be responsible for ensuring the DMP is enforced, reviewed and revised. He will also take responsibility for the collection, management, backup and sharing of the data.
  •  
  • Should the principal investigator leave PolyU or be unable to carry out these responsibilities for whatever reason, Dr. Chan, co-author of this project, will assume leadership of and responsibility for research data management in this project."

 Budget

Justifies resources required for ongoing data management tasks.

  • Is additional specialist expertise (or training for existing staff) required?
  • Does the project require hardware or software which is additional or exceptional to existing institutional provision?
  • Will charges be applied by data repositories?
  • Are there any other costs that should be included, e.g. for data acquisition or for personnel time spent on data preparation and documentation?
  • Example of Budget:
  •  
  • "There will be no charges for the selected open data repository. Costs for long-term data management will be borne by the University. Staff time for data acquisition, documentation and archiving has already been allocated in the project budget."

References:

Corti, L. (2014). Managing and sharing research data: A guide to good practice. London: SAGE.

DataONE, Data Management Modules. https://www.dataone.org/education-modules.

DCC. (2013). Checklist for a Data Management Plan. v.4.0. Edinburgh: Digital Curation Centre. http://www.dcc.ac.uk/resources/data-management-plans/checklist. 

EDINA and Data Library, University of Edinburgh, Research Data MANTRA [online course], http://datalib.edina.ac.uk/mantra.