Kornhauser Health Sciences Library

Health Sciences Research Data Management: Data Capture & Documentation

Data Capture

How you record your data should be clear from the outset of your research project and consistent through its conclusion. It is imperative to develop and maintain data capture procedures that clearly define the steps to be taken and outline the roles and responsibilities of all members of the research team. Though not required, a standard operating procedure manual will help ensure consistency over the life of your project. At a minimum, standard operating procedures should cover experimental setup, when to create documentation, where data will be stored, and how files should be named.
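As one illustration of a file naming rule that a standard operating procedure might specify, the Python sketch below builds file names from a collection date, project name, short description, and version number. The pattern itself (date_project_description_vNN.ext) and the function name build_filename are assumptions made for this example, not a convention prescribed by this guide; adapt them to whatever your team agrees on.

```python
import re
from datetime import date


def build_filename(project, description, collected_on, version, extension):
    """Build a file name following one hypothetical team convention:
    YYYY-MM-DD_project_description_vNN.ext
    """
    # Reduce free text to lowercase words joined by hyphens so that names
    # contain no spaces or special characters.
    slugs = [
        re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-").lower()
        for part in (project, description)
    ]
    if not all(slugs):
        raise ValueError("project and description must contain letters or digits")
    # An ISO 8601 date at the front makes files sort chronologically.
    return (
        f"{collected_on.isoformat()}_{slugs[0]}_{slugs[1]}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


print(build_filename("Sleep Study", "Actigraphy raw", date(2024, 3, 5), 1, ".csv"))
# prints 2024-03-05_sleep-study_actigraphy-raw_v01.csv
```

Generating names with a shared helper, rather than typing them by hand, is one way to keep naming consistent across team members for the life of the project.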

Metadata

Metadata describes the who, what, when, where, why, and how of a dataset's creation. Consistent recording of metadata is one of the best ways to ensure that your data is discoverable and usable by you and your team now and by other researchers in the future.

Types of Metadata

The National Information Standards Organization (NISO) publishes a comprehensive PDF guide to metadata.

It lists and discusses four types of metadata:

Descriptive: for finding or understanding a resource

Administrative (Technical, Preservation, & Rights): for decoding and rendering files; long-term management of files; and intellectual property rights attached to content

Structural: relationships of parts of resources to one another

Markup languages: integrate metadata and flags for other structural or semantic features within content

Documentation

The Research Data Alliance hosts a community-maintained list of Disciplinary Metadata Standards. Although metadata standards specify what information should be collected, documentation needs vary by project and discipline. At a minimum, the metadata collected should include:

  • Title
  • Creator (names and addresses of data creators)
  • Identifier
  • Funder
  • Intellectual property or licensing rights
  • Access information
  • Language(s)
  • Dates
  • Project description
  • How the data was generated (methodology)
  • Data structure
  • Variable names and other data-level documentation (can be included in a README.txt file)
  • Data citation
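The minimum list above can also be checked programmatically before a dataset is shared. The Python sketch below is one possible approach: it verifies that a metadata record fills in every item and then renders the record as plain text suitable for saving as README.txt. The field keys (for example "rights" and "data_structure") and the function names are illustrative assumptions, not part of any metadata standard.

```python
# Keys mirror the minimum metadata list above; the exact spellings are
# illustrative choices for this sketch, not drawn from a formal standard.
REQUIRED_FIELDS = [
    "title", "creator", "identifier", "funder", "rights", "access",
    "languages", "dates", "description", "methodology", "data_structure",
    "variables", "citation",
]


def missing_fields(record):
    """Return the required fields that are absent or blank in the record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def render_readme(record):
    """Format a complete metadata record as text for a README.txt file."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    lines = [
        f"{field.replace('_', ' ').title()}: {record[field]}"
        for field in REQUIRED_FIELDS
    ]
    return "\n".join(lines) + "\n"
```

A team could call render_readme on its record and write the returned text to README.txt alongside the data files; a ValueError naming the blank fields signals that documentation is still incomplete.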