PolyU Library

Research Data Management

Sharing good practices for Research Data Management


Data Documentation & Metadata

Data documentation consists of human-readable files and records that explain the content, structure, and meaning of data, while metadata are standardized, machine-readable fields that make the data discoverable and reusable.

Both provide information about the data, keep your data understandable, and make future analysis and reuse possible, which increases the value of your research data.

It is always easier to create data documentation at the beginning of your research project and update it throughout the research process. Good data documentation usually explains:

  • Project-level:
    • research background and design, e.g. investigators, funders, research aims, hypotheses, etc.
    • data collection method.
    • structure of data files.
    • procedure for data cleaning and other quality assurance measures adopted.
    • version of the dataset and modifications made.
    • source of secondary data used, if any.
    • reuse license.
    • related publications and other research outputs.
  • Variable-level:
    • definition of the parameters.
    • units of measurement.
    • format for date, time and other parameters.
    • code values, e.g. 1=female; 2=male, etc.
    • code for missing values.
    • corresponding question number.
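
To make the variable-level items above concrete, here is a minimal sketch of a data dictionary used to decode a raw data file. The variable names, the sample rows, and the missing-value code `-9` are hypothetical; only the `1=female; 2=male` coding comes from the list above.

```python
import csv
import io

# Hypothetical data dictionary for a two-variable survey file.
# Each entry records the definition, question number, unit or code
# values, and the code used for missing values (assumed to be "-9").
DATA_DICTIONARY = {
    "gender": {
        "definition": "Self-reported gender of the respondent",
        "question": "Q1",
        "codes": {"1": "female", "2": "male"},
        "missing": "-9",
    },
    "height": {
        "definition": "Standing height of the respondent",
        "question": "Q2",
        "unit": "cm",
        "missing": "-9",
    },
}

# Hypothetical raw data, kept inline so the example is self-contained.
SAMPLE_CSV = "gender,height\n1,162.5\n2,-9\n-9,175.0\n"


def decode_row(row, dictionary):
    """Translate one raw row into labelled values using the dictionary."""
    decoded = {}
    for name, raw in row.items():
        spec = dictionary[name]
        if raw == spec["missing"]:
            decoded[name] = None  # missing-value code becomes None
        elif "codes" in spec:
            decoded[name] = spec["codes"][raw]  # coded value becomes its label
        else:
            decoded[name] = float(raw)  # numeric measurement in the stated unit
    return decoded


rows = [decode_row(r, DATA_DICTIONARY) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in rows:
    print(r)
```

Keeping the dictionary next to the data means a future user can interpret a value such as `2` or `-9` without having to ask the original investigators.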

Depending on the nature of the research and the data collection method, data documentation can take different forms, such as a codebook, data dictionary, laboratory notebook, README file, or diary. They all share the same goal: to ensure your research data can be understood by current and future researchers who want to reuse the data, including yourself!


Metadata means data about data. It describes datasets in a structured, standardized way, so that different computer systems can interpret the contents automatically, which facilitates interoperability. Below are the common types of metadata:

  • Descriptive Metadata: enables discovery, indexing, and retrieval.
  • Technical Metadata: describes how a dataset was produced and structured.
  • Administrative Metadata: describes user rights and management of the dataset.
    • Reuse Rights & License
    • Access information like restrictions and embargo period
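
As an illustration of how the three types above can sit side by side in one machine-readable record, the following sketch groups them in a single JSON document. All field names and values are hypothetical placeholders, not a prescribed schema.

```python
import json

# Hypothetical dataset record; every value is a placeholder.
metadata = {
    "descriptive": {  # supports discovery, indexing, and retrieval
        "title": "Example survey dataset",
        "creator": "Example Researcher",
        "keywords": ["survey", "example"],
    },
    "technical": {  # how the dataset was produced and structured
        "file_format": "CSV",
        "collection_method": "online questionnaire",
    },
    "administrative": {  # user rights and management of the dataset
        "license": "CC BY 4.0",
        "access": "embargoed",
        "embargo_end": "2030-01-01",
    },
}


def is_discoverable(record, term):
    """Case-insensitive search over the descriptive fields only."""
    d = record["descriptive"]
    haystack = [d["title"], d["creator"], *d["keywords"]]
    return any(term.lower() in s.lower() for s in haystack)


# Serialize to JSON and read it back, as another system would.
serialized = json.dumps(metadata, indent=2, sort_keys=True)
restored = json.loads(serialized)
print(serialized)
```

Because the record survives the round trip unchanged, a second system can read the same fields without any human interpretation, which is the interoperability benefit described above.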

Metadata Standards

Metadata can be recorded in a variety of formats like text documents, HTML, or XML. An example of a widely used metadata standard for generic research data is Dublin Core (DC). You can also make use of the following tools to identify the common metadata standards used in your subject areas:
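
Returning to the Dublin Core example mentioned above, a minimal sketch of a Dublin Core record written as XML with Python's standard library might look like this. The element names (`title`, `creator`, `date`, `format`, `rights`) and the namespace URI belong to the Dublin Core element set; the wrapper element `metadata`, the helper `build_dc_record`, and all values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML record with one dc:* child element per field."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Placeholder values for a hypothetical dataset.
record = build_dc_record({
    "title": "Example survey dataset",
    "creator": "Example Researcher",
    "date": "2024-01-01",
    "format": "text/csv",
    "rights": "CC BY 4.0",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A repository or harvester that understands Dublin Core can read such a record without knowing anything else about the project, although real repositories usually define their own wrapper format and required fields.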
