Well preserved

By Laurie Varendorff

Jan 01, 2005: In 2003, the National Archives of Australia (NAA) established the Digital Recordkeeping Initiative. Laurie Varendorff, of the ARMA (Association of Records Managers and Administrators), talks to Dr Andrew Wilson, project manager, Managing Digital Records for Access (MADIRA), at the NAA, to find out what progress the NAA has achieved through the initiative in addressing and resolving the issues around the preservation of digital records.

The Australian Digital Recordkeeping Initiative was established as part of an ongoing programme that began in the early 1990s with the release of the then revolutionary records management standard, AS4390. That standard was the culmination of effort by numerous parties, including the NAA, the State Records Authority of New South Wales (SRANSW), many other government agencies, private industry, professional associations and Standards Australia.

Prior to the 1990s there had been increasing unease in the archive profession and its governing bodies, located in State and National archive institutions and professional associations nationally and internationally, about the apparent inability to capture and preserve born-electronic information in an electronic archive environment. In the early 1990s this realisation became a cry for help and a call for action from historians and those responsible for preserving the history of nations and of the lesser entities that make up the history of a nation.

With the introduction of very primitive personal computers such as the Apple 1 back in 1976 and, probably more significantly, the introduction of the IBM Personal Computer in 1981, a vacuum opened up in the capture and preservation of government, corporate and personal data, information and records created electronically. The information was retained within these devices and died with them as they were recycled or sent to the tip.

Dr Andrew Wilson, project manager, Managing Digital Records for Access (MADIRA), at the NAA (also known as the Archives), advises that the long-term preservation of electronic information is, of course, only one component of a comprehensive approach to managing digital records, but nevertheless it is one of the central issues faced by institutions responsible for preserving access to digital objects over time.

"It is important to remember that preservation fits within a broader framework of recordkeeping. So, the Archives' activities in the area of digital records preservation need to be seen in the wider context of a developing approach to managing digital records.

"As in any developing policy, the Archives' position has changed over time. In the mid-1990s the Archives adopted a controversial policy of distributed custody for digital records. This policy reflected the Archives' view at the time that the best way to preserve electronic records of permanent archival value was to ensure that Australian Government agencies implemented best practice in electronic recordkeeping. The Archives' role was to enable the adoption of best practice recordkeeping through the setting of standards and provision of high quality advice. Throughout the mid to late 1990s the Archives continued to be closely and actively involved in developments in electronic recordkeeping in the Australian and international recordkeeping communities."

The initial work carried out by the Archives on AS4390 carried through into its input to the international records management standard, ISO 15489, released in September 2001.

In 1998, the Archives embarked on an ambitious research and development program which culminated in the release, in March 2000, of the e-permanence framework. This release also signalled a change in the Archives' distributed custody policy, brought about by advances in the Archives' understanding of electronic records management issues and the increased availability of new technology. The new custody policy released in 2000 gave an in-principle undertaking by the Archives to accept custody of all electronic records that are appraised as having archival value. Dr Wilson recalls how the change in custody policy initiated the development of a digital preservation program in the Archives.

"The first stage of the program, which commenced in mid-2001, was the development of a conceptual understanding of electronic records, drawing on the Archives' previous insights into the importance of information as evidence which is significant over time, not the form of the object. This work led to the 'essential performance' model which is fully described in the Archives' green paper: 'An Approach to the Preservation of Digital Records (2002).'" (http://www.naa.gov.au/recordkeeping/er/digital_preservation/summary.html)

The green paper outlines the approach which has been adopted by the Archives: records are accepted into NAA custody in a variety of formats and then converted ("normalised") into appropriate long-term formats which can be maintained and made accessible over time. Following the publication of the green paper, the Archives started work on developing tools to realise its vision and to enable it to implement the approach. The first of these, and probably the most significant, is Xena, an open-source software application for normalising and viewing digital records. The Xena source code is available from the open source software site, SourceForge, at: http://sourceforge.net/projects/xena/.

"The digital preservation project is now in the process of being turned into an operational area within the Archives and the digital preservation process should be fully operational by the middle of 2005," reveals Dr Wilson. "Because of the very practical nature of the work being undertaken by NAA, many other archival institutions, both in Australia and overseas, have looked to the NAA products when developing their own approaches to digital recordkeeping and preservation. In recognition of this, in 2003 the Archives established the Australian Digital Recordkeeping Initiative, a coalition of the Australian and New Zealand archival institutions which aims to consolidate and further develop a common approach to digital recordkeeping, including preservation."

The Archives has not been alone in its quest to provide a platform for the capture and preservation of electronic information. Other parties are all attempting to provide a viable solution for the preservation of electronic data, information and records: within Australia, the Public Record Office Victoria (PROV) with its Victorian Electronic Records Strategy (VERS); and internationally, the US National Archives & Records Administration (NARA), with its interaction with the San Diego Supercomputer Centre's National Partnership for Advanced Computational Infrastructure, and the UK National Archives.

At the Association of Records Managers and Administrators (ARMA) 2004 Conference a question was put to Reagan Moore, Ph.D. of the San Diego Supercomputer Centre's National Partnership for Advanced Computational Infrastructure, who gave a presentation at the event. Moore was asked to identify who he thought was correct in their approach on this matter - the National Archives & Records Administration (US) (NARA) and its proposed solution, the National Archives of Australia (NAA) and its solution, the Public Record Office Victoria (PROV) Victorian Electronic Records Strategy (VERS) and its solution, or some other solution altogether? The diplomatic answer provided by Reagan Moore was that only time will tell.

"I think Reagan's answer to the question at ARMA was not only diplomatic but essentially the only possible one," stresses Dr Wilson. "NARA does not yet have anything in the way of an explicit approach - that is what their two tenderers are in the process of developing. But there is nothing concrete that anyone can look at and say 'this is the NARA approach.' So I cannot make comments on the viability of their approach or otherwise, but I would note, in passing, that the scale of their digital preservation problem is so much larger than ours that what works for us might not work in the US context, and vice versa. Nor is there any necessary reason to think that our different approaches (if indeed they do turn out to be different) should work in other jurisdictional contexts. So I want to make quite clear that I do not believe that it is appropriate for NAA to judge other approaches, but I'm perfectly happy to say why we adopted our own approach rather than other alternatives."

The NAA and VERS approaches are very similar in that they both focus on normalising records into XML-based archival formats, but while wishing to emphasise the similarity of the approaches, Dr Wilson is equally keen to emphasise that in his view, the Archives' approach is preferable to the VERS approach for a number of reasons.

"We think it's important to retain various features of digital records that are not possible to keep with the VERS formats (PDF, TIFF, and plaintext). Our 'essential performance' model allows us this flexibility and means that we do not specify a limited range of acceptable data formats for transferred records, as the VERS approach does. Also, the federal government environment is different from the Victorian government environment, which means we could not mandate particular acceptable formats as the PROV is able to, even if we wanted to adopt such a mechanism for limiting transfer formats. The VERS approach is still in the early days of implementation (only two departments are currently VERS compliant), and our approach will not become operational until next year, so it is still a matter of seeing whether time proves that our approaches are viable."

Another approach that Dr Wilson has examined is the one adopted by the UK National Archives, whose current approach is to migrate records from the proprietary formats in which they are created into new versions of the formats as the previous ones become obsolete. But as he explains, it is highly unlikely that the Archives will follow a similar path.

"This is certainly an attractive solution in some regards but I do not believe that it would be a viable approach for the NAA in the long term. When we were developing our own conceptual approach we did examine other approaches, including migration. We came to the conclusion that migration was not an approach that we wanted to use for various reasons:

• Some attributes of the original record are lost during the conversion process, resulting in a different performance, while our aim is to maintain as closely as possible the original performance;
• Since the formats are proprietary ones, we do not have any control over what is or isn't lost in the conversion, unlike the case in the NAA approach;
• Migration requires significant resource commitments to a cyclical process of converting records from obsolete formats.

(NB: This is set out in fuller detail in the green paper referred to previously in this article.)

"This does not mean that the migration approach won't work in the UK context, or that the NAA does not approve of this approach. All I can say is that, for a number of reasons that seemed convincing to us, the Archives does not want to follow this particular approach.

"In any case, I want to insist that none of us claim that we have 'the solution'. We only say that we have what we think is a viable approach to the problem of preserving digital records for the long term. It might prove to be the case that we (archival institutions dealing with the problem of long term preservation of digital objects) need to use all the approaches for different parts of our collections. It might eventually turn out that no one approach is enough to preserve all records for the long term."

Only time will tell

Dr Wilson advises that the solutions being researched and implemented at this time are designed to address electronic records created today and into the future, rather than attempting to capture and preserve all of the electronic records created since the 1970s.

"I don't believe that the approaches currently under development are being designed to address the issues of legacy electronic/digital records. These are quite a different matter from the digital records being created today.

"Custodial institutions will need to develop different strategies for dealing with such records. In my view, the gap between today's digital preservation capabilities and the ability to preserve legacy formats will always exist. There are a couple of reasons for this: the impermanence of the media used to carry and store electronic records, and the wide range of operating systems and software applications that were used.

"The first step that needs to happen in order to try to fill the gap is to transfer data from the legacy media onto modern media, such as CDs and DVDs. Not that either of these is an archival storage medium, but at least such a transfer will allow modern computers to access the legacy data. The second necessary step is to interpret the data and then to maintain it in an appropriate archival data format. A significant issue with legacy data is the disappearance of the hardware that is able to mount and read the legacy media - an urgent reason to attempt media conversion as soon as possible."

The NAA has been running a project to recover the legacy electronic records in its custody. The first step has involved working with a private sector data conversion company to transfer data from the legacy media to high-quality CDs. The second step is to attempt to read and interpret the data formats which have been recovered through the media conversion processes.

"The Archives has been pleasantly surprised by the very high success rates it has had with the first stage of the project - data recovery is in the order of 92 percent," notes Dr Wilson.

"We are about to begin the second stage of the project, attempting to read and normalise the recovered data, so we can't yet comment on the overall outcomes of the project. We are hopeful, however, that a high proportion of the data will be intelligible and that we will eventually be able to normalise it."

The Archives' experience may not, however, be typical.

"The quantity of legacy records in the custody of the Archives is small, and the number of legacy media types we needed to convert was also small. The majority of the legacy media were floppy disks of various sizes and these did not present much of a problem to transfer to new media. We do not have the problem facing a number of archival institutions and other collecting bodies, of having extremely large numbers (in the hundreds of thousands, if not millions) of magnetic tapes that need to be converted to modern media. In some cases the sheer quantity of the legacy media involved may mean that the conversion task is not physically possible within the remaining life of the magnetic material."

The rather gloomy prediction that the legacy data created between the 1970s and the present day may be lost to future generations, unless it was printed to paper and preserved, may in fact prove true in some jurisdictions.
