How will you read your data 100 years from now?
Taking the light approach to long-term storage and retrieval of information.
PLANNING to make information available in 100 years' time could mean investigating the most advanced storage technologies currently available, or looking at an alternative that will be virtually technology independent. The National Archives of Australia recently faced this dilemma after it was decided for the first time to make Census data available to the public after a century. Of course, it required the consent of the people completing the 2001 Census forms and it seems that Australians are a fairly unashamed bunch, as over 52% of respondents ticked "yes" to that option, accounting for some 10 million Australians.
So by the year 2102, the Australians alive at that time will be provided with an unprecedented ability to learn about their ancestors. We can only imagine what search and retrieval systems will be in use in 100 years, but the original information must be stored in a format that is stable and relatively accessible, while offering the benefits of modern storage technologies, such as compactness.
The answer was microfilm, a technology that has been in use for over 100 years. "We were looking for a material that is technology-independent and will last the time," said Dr Stephen Ellis, the director of preservations at the National Archives of Australia (NAA).
"The best part of microfilm is all you need to read it is a light source and a magnifying glass. So you can be assured of these being around," said Chu Teoh, a consultant who helped with the project.
It would seem that a project of this scale is likely to be popular among the citizens of the time. A similar project was recently completed in the United Kingdom, except the British Public Records Office undertook a massive backfile conversion project, whereby they converted Census forms from 1901 into a format that could be searched on the Web. The designers of the UK Census Web site built a system that would be capable of handling 1.2 million users per day - it crashed three days later due to the sheer weight of enquiries.
So it would seem that the Australian Government's "gift" to the people of Australia - as part of the Centenary of Australia celebrations - may be well-received in the future.
With electronic documents proliferating and online data storage technologies evolving at an incredible pace to keep up with the demands of modern computer systems, it is fair to ask just how relevant microfilm is to modern information management.
The answer depends on how long the organisation wants to retain the information. Don Beggs, the managing director of DataComIT, the Australian company which converted the 2001 Census forms to microfilm for the Australian Bureau of Statistics and NAA, pointed out that NASA had failed to save the earliest space flight data on a format that would last. Less than 50 years later, that data is now lost.
Microfilm was ultimately chosen "because of the requirement to keep the archives for 100 years without access for the public or the government," said Dr Ellis.
"There have been horrific stories of paper turning to mush," said Paul Lowe, the executive director of the data processing centre at the Australian Bureau of Statistics (ABS) Census Project. The census archive will form 23 million images on 1422 roles of microfilm charting the shape of Australia in 2001.
"They believe that microfilm lasts. They will be checking it periodically," Mr Lowe said, adding that the archives were unsure as to what technology would be around in 99 years' time, while microfilm requires nothing more than a light source and magnification to access the data.
"Microfilm has been fairly clearly established as the most stable record keeping material we have," Dr Ellis said. "There are a lot of examples in North America and Europe of keeping microfilm copies of forms. In the UK they have held their forms on microfilm for the last 100 years."
Two copies of the microfilm archive have been produced for the National Archives by ABS project partner DataComIT, which won the contract on an open tender. The ABS had consulted with Kodak on committing the forms to microfilm.
"Each roll has the responses of 7000 people, and we produced 2844 rolls," Mr Lowe said. They are preserved as duplicate sets of 1422 rolls.
Garry Bain, the DataComIT operations manager for the project, explained that each roll contains 15,000 pages of the census forms. They had planned for 10,000, but he said the ABS was efficient in its scanning and creation of the TIFF files, so more would fit onto each roll.
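The roll and image counts quoted in this article can be cross-checked with some simple arithmetic. The sketch below is only an illustration: the constant names are invented here, the inputs are the rounded figures as quoted, and the article does not say how pages relate to images, so small mismatches between the quoted numbers are to be expected.

```python
# Rounded figures as quoted in the article; the variable names are
# hypothetical and exist only for this illustration.
TOTAL_IMAGES = 23_000_000        # images in one archive set
ROLLS_PER_SET = 1422             # rolls in one archive set
ROLLS_PRODUCED = 2844            # total rolls produced
PEOPLE_PER_ROLL = 7000           # respondents per roll, as quoted
CONSENTING_PEOPLE = 10_000_000   # "some 10 million Australians"

# How many complete sets do the produced rolls make up?
duplicate_sets = ROLLS_PRODUCED / ROLLS_PER_SET

# Average number of images on each roll of one set.
images_per_roll = TOTAL_IMAGES / ROLLS_PER_SET

# Rolls needed if every roll holds the quoted number of respondents.
rolls_implied_by_people = CONSENTING_PEOPLE / PEOPLE_PER_ROLL

print(f"duplicate sets:          {duplicate_sets:.0f}")
print(f"images per roll:         {images_per_roll:.0f}")
print(f"rolls implied by people: {rolls_implied_by_people:.0f}")
```

On these inputs the produced rolls make exactly two sets, each roll averages about 16,174 images (in line with the roughly 15,000 pages quoted), and 10 million respondents at 7000 per roll implies about 1429 rolls, close to the 1422 reported.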
"All the TIFF files are sorted into a list file. The archive writer then photographs each file onto the film," Mr Bain said of the Kodak system which is archiving the national knowledge.
"Part of the new technology we used meant that we imaged the forms and we processed off these images [the archive] as it is more efficient. That meant we could transfer this to the PC and onto the digital archive writer," Mr Lowe said. The images for the archive were added to the digital archive PC on a cradle, as the computers were not networked for security reasons, Mr Bain explained.
During the eight-month conversion process, the National Archives had an independent laboratory company from Sydney test the microfilms every day.
"The National Archives have very high levels of standards and insisted on methylene-blue testing every day," Mr Teoh said.
Creating the national archive of census forms was not a simple process of taking the imaged forms and capturing them on microfilm. Dr Ellis explained that as each form had the possibility of being completed by up to six different people, just one or two may have opted for the forms to be retained for the archive. Thus the ABS had to cut the documents into sections on their Kodak and IBM based system and prepare them for the National Archives. Dr Ellis said that in other countries it is 'all or nothing' and the entire form gets archived, not just the individual respondents. He believes this was a first for any such census project and was a success.
Two different locations will house the two copies of the archive in temperature controlled vaults. These locations are not for disclosure to the public. While the microfilms are in storage they will be checked annually.
"The major threat tends to be the growth of mould from humidity, but there are climates ideal for storage within Australia," Dr Ellis said.
Because the records are not stored on a computer, Dr Ellis believes there is a perception among the Australian public that the security of the archives is stronger. Traditional microfilm readers will be retained by the archives to annually check the microfilms and Dr Ellis is confident that there will be technology to convert film to whatever standards we choose in 2102, just as there are scanners to convert microfilm to today's media.
The National Archives was dissuaded from adopting an electronic format because of the leaps and bounds in technology.
"If you keep it in electronic form, every few years the technology has to be refreshed," said Mr Teoh, adding, "The cost of doing that would be phenomenal." The ABS is currently in the process of destroying all the paper documents from the 2001 census, as well as the electronic records and back up tapes, which was all part of the guarantee that the government offered the public.
Putting the census forms onto microfilm is a sign of the times for the format. Mr Beggs said companies looking for a microfilm conversion tend to be looking to have highly important records preserved for long periods. He said there is not a huge amount of film work on the market at present, and little if any for active use. "Maintenance manuals for aircraft, manuals for critical systems and long term records of insurance documents are popular," Mr Beggs said.
"It's the secure backup disaster recovery," he said. Banks and industries with long term technical or mechanical assets, government and financial institutions are regular users of the the microfilm format.
If the documents have a critical long-term value, then microfilm's inherent longevity and reliability make it the most attractive long-term approach for preserving information with a guarantee that it can still be accessed in the next century.