
Health Sciences Research Data Management: Preservation

Storage ≠ Preservation

Properly naming your files and storing them on your hard drive, a cloud drive, and a USB drive does NOT mean that you have preserved that data for future use and accessibility.

Hardware storage devices can become obsolete as technology evolves. Remember CDs, floppy disks, and VHS tapes? USB drives are great right now, but what will replace them? Software can also pose a problem, especially if your data is stored in a proprietary format that can only be opened in a particular lab or setting.

Saving your data in current, open file formats allows it to be accessed over time and across platforms.
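As a minimal sketch of what saving to an open format can look like, the Python example below writes a small table to CSV, a plain-text format that almost any tool can read, using only the standard library. The column names, sample values, and the `to_csv_text` and `save_csv` helpers are made up for illustration.

```python
import csv
import io

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"participant_id": "P001", "visit_date": "2023-01-15", "systolic_bp": 118},
    {"participant_id": "P002", "visit_date": "2023-01-16", "systolic_bp": 131},
]


def to_csv_text(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(rows, path):
    """Write the CSV text to disk as UTF-8 so any text editor can open it."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(to_csv_text(rows))


csv_text = to_csv_text(records)
print(csv_text)
```

Calling `save_csv(records, "visits.csv")` would write the same text to a file. The file name here is only an example.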

Preservation Metadata

Preservation metadata is a specific type of metadata that describes a digital item in terms of both context and structure. Its goal is to maintain a digital object's viability and ensure continued access by providing contextual information as well as details on usage and rights. Preservation metadata is an essential part of the data lifecycle: it helps document a digital object's authenticity while maintaining usability across formats.

One of the main ways preservation metadata differs from other kinds of metadata is that it is stored externally to the resource and records technical details on format, structure, and use, along with the history of all actions performed on the resource. These could include changes and decisions regarding digitization, migration to other formats, custody history, rights and responsibilities information, and the physical condition of the resource.

Preservation metadata is access-centered and should accomplish four goals:

  1. Include details about files and instructions for use
  2. Document all updates or actions that have been performed on an object
  3. Show provenance and demonstrate current and future custody
  4. List details on the individual(s) who are responsible for the preservation of the object and changes made to it

 

Preservation metadata often includes:

  1. Provenance:  Who has had custody or ownership of the digital object?
  2. Authenticity:  Is the digital object what it purports to be?
  3. Preservation activity:  What has been done to preserve the digital object?
  4. Technical environment:  What is needed to render, interact with, and use the digital object?
  5. Rights management:  What intellectual property rights must be observed?
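To make these categories concrete, here is a hedged sketch of a preservation metadata record built as a plain Python dictionary and serialized to JSON. It is not a PREMIS-conformant record: the key names, the `build_record` and `log_activity` helpers, and the sample values are all hypothetical. Using a SHA-256 checksum to support authenticity checks is an assumption of this example, not something the list above prescribes.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def build_record(filename, data, custodian, file_format, license_text):
    """Assemble a simple record grouped by the five categories listed above."""
    today = datetime.now(timezone.utc).date().isoformat()
    return {
        "file": filename,
        # Provenance: who has had custody of the object
        "provenance": [{"custodian": custodian, "since": today}],
        # Authenticity: a checksum lets later users detect changes
        "authenticity": {"algorithm": "SHA-256", "checksum": sha256_of(data)},
        # Preservation activity: filled in over time by log_activity()
        "preservation_activity": [],
        # Technical environment: what is needed to render the object
        "technical_environment": {"format": file_format},
        # Rights management: what rights must be observed
        "rights": {"license": license_text},
    }


def log_activity(record, action, agent):
    """Append one preservation action (e.g. a format migration) to the record."""
    record["preservation_activity"].append(
        {
            "action": action,
            "agent": agent,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )


# Hypothetical example values
record = build_record(
    "visits.csv", b"participant_id,visit_date\n", "Example Lab", "text/csv", "CC BY 4.0"
)
log_activity(record, "migrated from proprietary format to CSV", "Data manager")
print(json.dumps(record, indent=2))
```

Keeping the record in a separate JSON file next to the data is one simple way to store metadata externally to the object it describes.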

Preservation Considerations

When you are collecting your data, you may need to use hardware or software that is proprietary or specific to your project. Even so, you should plan ahead for how you will transform that data into accessible formats for long-term storage and preservation.

Some things to consider include:

  • What do you need to keep?
  • What are the journal and/or funder requirements?
  • Who will be responsible for the data throughout and at the end of the project?
  • Do you have sufficient documentation so that future researchers can access and use your data?
  • Are the file formats you've chosen open and sustainable? If not, how can you make them so?
  • If you are depositing into a repository, what is the shelf-life of the hardware and when will data need to be transferred?

PREMIS

The PREMIS (PREservation Metadata: Implementation Strategies) Standard is the international standard for metadata to support the preservation of digital objects and ensure their long-term usability.

The PREMIS Data Dictionary defines semantic units, each of which is mapped to an entity within a simple data model (semantic unit = property of an entity). The data model is organized around the following entities associated with the digital preservation process. As of version 3, Environment is described as a type of Object rather than as a separate entity, which is why the entities are sometimes counted as four rather than five:

Object (or Digital Object): a discrete unit of information subject to digital preservation. Version 3 introduces the notion that this can be an environment used as part of the preservation process.

Environment: Technology (software or hardware) supporting a Digital Object in some way (e.g. rendering or execution). Environments can be described as Intellectual Entities and captured and preserved in the preservation repository as Representations, Files and/or Bitstreams.

Event: an action that involves or affects at least one Object or Agent associated with or known by the preservation repository.

Agent: person, organization, or software program/system associated with Events in the life of an Object, or with Rights attached to an Object. It can also be related to an environment Object that acts as an Agent.

Rights Statement: assertion of one or more Rights or permissions pertaining to an Object and/or Agent.
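The relationships among these entities can be sketched with a few Python dataclasses. This is a simplified illustration only, not the PREMIS schema: the class and field names are hypothetical and do not correspond to PREMIS semantic unit names. Following the version 3 approach described above, an environment is represented here as an ordinary object whose category is set to "environment".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    agent_id: str
    name: str
    agent_type: str  # e.g. "person", "organization", "software"


@dataclass
class RightsStatement:
    rights_id: str
    description: str  # the right or permission being asserted


@dataclass
class Event:
    event_id: str
    event_type: str  # e.g. "ingestion", "migration"
    date: str
    agents: List[Agent] = field(default_factory=list)


@dataclass
class DigitalObject:
    object_id: str
    category: str  # e.g. "file", or "environment" for software/hardware
    events: List[Event] = field(default_factory=list)
    rights: List[RightsStatement] = field(default_factory=list)
    environments: List["DigitalObject"] = field(default_factory=list)


def event_history(obj: DigitalObject) -> List[str]:
    """Return the object's event types in chronological order by date string."""
    return [e.event_type for e in sorted(obj.events, key=lambda e: e.date)]


# Hypothetical example: a CSV file, the software that renders it, and two events
curator = Agent("agent-1", "Data manager", "person")
viewer = DigitalObject("obj-2", "environment")
dataset = DigitalObject(
    "obj-1",
    "file",
    events=[
        Event("ev-2", "migration", "2024-03-01", [curator]),
        Event("ev-1", "ingestion", "2023-06-15", [curator]),
    ],
    rights=[RightsStatement("r-1", "Reuse permitted with attribution")],
    environments=[viewer],
)
print(event_history(dataset))
```

The dates are ISO 8601 strings, so sorting them as text also sorts them chronologically.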

Preservation metadata is relatively new and still evolving. The complete PREMIS Data Dictionary is available from the Library of Congress.

(LOC PREMIS Data Dictionary, 2015)
