What’s happened to the VERS standard?

By Andrew Waugh

Adherence to the influential Victorian Electronic Records Strategy (VERS) standard is an important yardstick for assessing Electronic Document and Records Management Systems (EDRMS) beyond the borders of the southern state. The Public Record Office Victoria (PROV) has just finished its first major revision of the VERS standard since 2002, acknowledging government’s need for increased flexibility in where digital records are kept and how they are represented.

The Public Record Office Victoria (PROV) has just finished revising the standard that underpins the Victorian Electronic Records Strategy (VERS). The new standard:

  • allows far greater flexibility in representing digital records,
  • is more efficient when including large binary objects (such as videos and databases),
  • provides more format options to reduce the costs of format conversion, and
  • is much shorter and easier to understand.

From a user’s or implementer’s point of view, these are significant improvements: they lower costs, increase usability and provide more flexibility.

We have released the new standard now so that vendors and users can become familiar with it. However, it is very important to note that PROV will not accept records in the new VERS standard until the new digital archive is commissioned (around the middle of 2018). Agencies can use the new long term preservation formats immediately as these have been added to the old VERS standard. To protect agency and vendor investment in the existing VERS standard, PROV has committed to continue accepting records in the old VERS format.

The Public Record Office Victoria (PROV) developed the Victorian Electronic Records Strategy (VERS) in 1998 to manage the preservation of digital records in government and in the archives. As part of this work a ‘Standard for the Management of Electronic Records’ (PROS 99/007), also known as the VERS standard, was developed. The standard was revised in 2002 as part of the process of building PROV’s digital archive.

Why renew the VERS standard?

In digital terms, 1998, and even 2002, is a long time ago. Digital preservation knowledge and technology have moved on a long way in that time. But more importantly, experience with implementing VERS and transferring records in the VERS format showed that we could improve the standard, and we could improve it a lot.

The problems we identified were:

  • The standard was too tightly linked to EDRM systems of the 1980s and 1990s. Records were structured into files organised in a classification scheme. The record-keeping functionality required was modelled on EDRM systems. The metadata was focussed on record-keeping metadata. But digital records are held in many systems within government, many of which bear little resemblance to a traditional EDRMS. Even modern EDRMS are more flexible than the simple models supported by VERS.
  • Inefficient storage of large binary records (such as digitised paper records, databases, videos and geospatial files).
  • Migration of record content to an accepted long term preservation format was often expensive. The expense was particularly hard to justify for formats that were unlikely to cause preservation problems.
  • The size and complexity of the standard made it very difficult for people to understand and implement.
  • Poor use of metadata. Much of the metadata provided by the standard was never used. It was also difficult to add non-record metadata (e.g. to support geographic information system data and digitisation).

In renewing the standard we addressed each of these problems.

The original VERS preservation vision, however, has not changed. Digital preservation in VERS continues to be based on:

  • Migration of record content to a long term preservation format.
  • Capture of appropriate metadata that describes the record and allows access and reuse.
  • Encapsulation of the record content and metadata into a single object to allow efficient management of the record over time.
  • Securing the record by means of digital signatures to detect corruption of the content or metadata.

Flexibility in representing records

A key change in the renewed standard is the increased flexibility in representing records. This has two aspects:

  • organising the information in a record; and
  • metadata associated with the record.

The renewed standard removes the restriction that a record must consist of a simple fixed tree of file, record and document. Instead, records can be represented by an arbitrary tree, which can be as deep or as shallow as necessary. This allows great flexibility. For example, a single VERS encapsulated object could contain a single document, an unstructured series of documents, a single record, a file with all the contained records, a branch of a classification scheme with all the files and records, a complete record system, or a SIARD capture together with screen shots and database documentation.

This revised standard allows a records manager to use any convenient elements in the structure.
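The idea of an arbitrary tree can be illustrated with a small sketch. This is not the VERS data model itself: the `Node` class, its field names and the example labels are all hypothetical, chosen only to show that the same structure can describe a single document as easily as a branch of a classification scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: a label, optional content files, optional children."""
    label: str
    content_files: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def depth(node: Node) -> int:
    """Number of levels in the tree rooted at this node."""
    return 1 + max((depth(child) for child in node.children), default=0)


def count_content(node: Node) -> int:
    """Total number of content files anywhere in the tree."""
    return len(node.content_files) + sum(count_content(c) for c in node.children)


# Shallow case: one object holding a single document.
single_document = Node("Annual report", content_files=["report.pdf"])

# Deeper case: a classification branch holding a file that holds two records.
classification_branch = Node("Finance", children=[
    Node("Budget file", children=[
        Node("Record 1", content_files=["memo.docx"]),
        Node("Record 2", content_files=["figures.xlsx", "cover.pdf"]),
    ]),
])

print(depth(single_document), count_content(single_document))
print(depth(classification_branch), count_content(classification_branch))
```

Nothing in the sketch fixes the number of levels, which is the point: the shape of the tree follows the records, not the other way around.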

The fixed set of metadata used in the original standard has also been discarded. Instead, multiple packages of metadata can be associated with objects in the tree. Each package represents a collection of metadata, typically the metadata from one metadata standard. Examples of packages could be record metadata, geographic information system metadata, and digitisation metadata. It is up to the records manager to decide what packages of metadata make sense to include. The only restriction in the standard is that the metadata must be expressed as XML, and, ideally, as RDF (the W3C standard for representing metadata).

To ensure that every record has at least some standardised information associated with it, the new standard has one metadata requirement. Each VERS record must contain one of two standard metadata packages: AGLS or AS/NZS 5478.

AGLS is a simple and widely used Australian metadata standard that is normally used to describe web resources. We have added an element to AGLS that describes disposal of the record. AGLS should be used where only a simple description of the record is available.

AS/NZS 5478 is a new Australian/New Zealand standard for record-keeping metadata. It is based on, and almost identical to, the existing NAA and Archives New Zealand record-keeping metadata standards. AS/NZS 5478 should be used when an extended record description is available, such as when the description is sourced from an EDRMS.
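The two metadata rules described above (packages must be XML, and one of the two standard packages must be present) lend themselves to a simple check. The following is a minimal sketch only. The `MetadataPackage` class, the `standard` labels and the XML element names are hypothetical, and the sketch assumes that "one of two" means at least one; the standard itself should be consulted for the actual rules and for how packages are identified.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# The two standard packages named in the article.
REQUIRED_STANDARDS = {"AGLS", "AS/NZS 5478"}


@dataclass
class MetadataPackage:
    standard: str   # hypothetical label naming the metadata standard used
    xml_text: str   # the package content, expected to be XML


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML (well-formedness only, no schema check)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def check_packages(packages: list[MetadataPackage]) -> list[str]:
    """Return a list of problems; an empty list means both rules are met."""
    problems = []
    for package in packages:
        if not is_well_formed(package.xml_text):
            problems.append(f"{package.standard}: not well-formed XML")
    # Assumption: at least one of the two standard packages is required.
    if not any(p.standard in REQUIRED_STANDARDS for p in packages):
        problems.append("no AGLS or AS/NZS 5478 package present")
    return problems


good = [
    MetadataPackage("AGLS", "<record><title>Minutes</title></record>"),
    MetadataPackage("digitisation", "<scan resolution='300'/>"),
]
bad = [MetadataPackage("digitisation", "<scan resolution='300'>")]  # unclosed tag

print(check_packages(good))
print(check_packages(bad))
```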

A major goal of the renewal was to improve the handling of large binary objects (e.g. video and databases). In the previous standard, all binary record content had to be encoded using Base64 before being included within the XML document representing the record. This increased the size of the record content by a third – making already large objects even larger.
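The one-third figure follows directly from how Base64 works: every 3 bytes of binary input become 4 characters of output. A short sketch using Python's standard `base64` module shows the effect on a placeholder 3 MB object (the zero-filled content is an illustrative stand-in, not a real record).

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length of standard padded Base64 output for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


raw = bytes(3_000_000)            # 3 MB of placeholder binary content
encoded = base64.b64encode(raw)

# The measured size matches the 4-for-3 formula.
assert len(encoded) == base64_size(len(raw))

growth = len(encoded) / len(raw) - 1
print(f"{len(raw):,} bytes -> {len(encoded):,} bytes (+{growth:.1%})")
# prints: 3,000,000 bytes -> 4,000,000 bytes (+33.3%)
```

For a multi-gigabyte video or database capture, the same ratio translates into gigabytes of extra storage and transfer for every record.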

To avoid this, we abandoned the use of a single XML document to encapsulate the record. Instead, the encapsulated object is a ZIP file containing the record content and XML files representing the record structure, metadata and digital signatures.

ZIP was chosen as the encapsulation format as it will handle extremely large file sizes, supports compression, and is extremely widely deployed. The specific encapsulation mechanism used is not a critical decision as it is very easy to re-encapsulate the records if ever that becomes necessary.
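The general shape of this approach can be sketched with Python's standard `zipfile` module. This is an illustration under stated assumptions rather than the VEO layout: the file names `structure.xml` and `metadata.xml`, the `content/` folder and the `build_package` function are hypothetical, and digital signatures are left out. The point is that binary content is stored as-is, with no Base64 step.

```python
import io
import zipfile


def build_package(content_files: dict[str, bytes],
                  structure_xml: str,
                  metadata_xml: str) -> bytes:
    """Bundle raw content and XML descriptions into one ZIP (hypothetical layout)."""
    buffer = io.BytesIO()
    # allowZip64 permits archives and members larger than 4 GiB.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        zf.writestr("structure.xml", structure_xml)
        zf.writestr("metadata.xml", metadata_xml)
        for name, data in content_files.items():
            # Content bytes go in unchanged; no text encoding is needed.
            zf.writestr(f"content/{name}", data)
    return buffer.getvalue()


video = bytes(range(256)) * 4     # placeholder standing in for binary content
package = build_package({"interview.mp4": video}, "<record/>", "<metadata/>")

with zipfile.ZipFile(io.BytesIO(package)) as zf:
    names = zf.namelist()
    restored = zf.read("content/interview.mp4")

print(names)
print(restored == video)
```

Because the content comes back byte-for-byte identical, moving the same files into a different container later would be a mechanical exercise, which is consistent with the observation that the choice of encapsulation mechanism is not critical.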

Long term preservation formats

A key plank of the VERS strategy is that record content will only be accepted in an approved long term preservation format. The formats are chosen because they are not expected to become obsolete for a very long time. The number of formats accepted is limited to reduce the long term cost of maintaining access to the formats. Record content that is not in one of these approved formats must be migrated.

The list of approved long term preservation formats has been extended in the renewed standard. Some of the new formats deal with types of information that had not previously been covered, such as web archives.

However, some of the new formats were added to reduce the cost to agencies in preparing records. Migrating to a long term preservation format is challenging. Properly done, migration is expensive as it requires obtaining migration software, pretesting for conversion accuracy, running the migration, and finally post-migration auditing of the converted files. It is important to minimise migration, particularly migration from formats that are unlikely to have long term preservation issues.

Ubiquitous formats that dominate their market segment are unlikely to cause preservation issues for the foreseeable future. Examples of such formats are the core Microsoft Office formats (Word, Excel, and PowerPoint), and MP3 sound files. Products in these market segments must support these ubiquitous formats to be viable – the body of legacy content is simply too great for anyone to use a product that does not support them. It is consequently extremely unlikely that these formats will become obsolete within the foreseeable future.

And even if the formats do become obsolete, the body of legacy content means that format converters will be a viable product. For these reasons, PROV has added several such ubiquitous formats to the acceptable long term preservation format list, including the core Microsoft Office formats Word, Excel and PowerPoint. Agencies and vendors using the previous version of the VERS standard can use this expanded set of long term formats as we have revised PROS 99/007 Specification 4 (Long Term Preservation Formats).

Simplification

We worked hard to reduce the length and complexity of the previous VERS standard. This was challenging because we also wanted to make the standard more powerful and flexible.

Significant simplifications were made by:

  • cutting out features and metadata that were rarely used,
  • referring to external specifications and standards (such as ZIP and the two metadata standards), and
  • describing the process of constructing a VERS Encapsulated Object (VEO), rather than its specification.

The size of the standard, specifications, and supporting advices has been reduced from 401 pages to 64.

Victorian agencies can use the renewed standard for internal archival purposes immediately. However, PROV will not be capable of accepting records constructed under the new standard until 2018 when our digital archive is redeveloped. The renewed standard was released early to assist vendors and agencies to understand it. Agencies can use the additional long term preservation formats immediately as we have updated the previous version of the standard to include them.

Agencies should continue to use the previous VERS standard for transfers to PROV until the new digital archive is commissioned.

Agencies and vendors have already made significant investment in VERS. For this reason, PROV will continue to accept records from systems that have been certified as compliant with the previous version of VERS.

The renewed VERS standard has been released and is available from the PROV website (http://prov.vic.gov.au/government/vers/implementing-vers/standard-2). The web page also contains tools that will create and validate the new VEOs. These tools are written in Java, and can be used by vendors and agencies under the CC-BY license.

Andrew Waugh is Senior Manager Standards & Policy at Public Record Office Victoria (PROV).

The challenge for format obsolescence