NZ Library Fights Digital Dark Age With Web Crawler

September 26th, 2006: New Zealand’s National Library Te Puna Matauranga o Aotearoa is doing more than its share to stave off a digital dark age by helping to develop an open source web crawler that saves vast tracts of cyberspace for posterity.

The web harvesting system, otherwise known as the web curator tool, was created in cooperation with The British Library and TelstraClear’s Sytec to enable inexperienced users to collect large volumes of information from the internet and save it for future reference.

The web crawler takes snapshots of websites, saving documents, images and files in a digital archive as it makes its way through specified sections of the internet. The National Library says that the crawler is not only auditable, but also can identify content for archiving and manage it.
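
The general idea of a scoped snapshot crawl can be illustrated with a minimal Python sketch. This is not the web curator tool's code, and every name in it (`crawl`, `LinkExtractor`, `scope_prefix`, `fetch`) is hypothetical. It assumes the "specified section" is defined by a URL prefix and that the caller supplies a `fetch` function returning a content type and the raw bytes for a URL, or `None` if the URL is unavailable.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects href and src attribute values (pages, images, files)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(seed, scope_prefix, fetch, max_pages=100):
    """Breadth-first snapshot of URLs under scope_prefix reachable from seed.

    fetch(url) is a caller-supplied function (an assumption of this sketch)
    returning (content_type, body_bytes), or None if the URL is unavailable.
    Returns a dict mapping each archived URL to its raw bytes.
    """
    archive = {}
    queue = deque([seed])
    seen = {seed}
    while queue and len(archive) < max_pages:
        url = queue.popleft()
        result = fetch(url)
        if result is None:
            continue
        content_type, body = result
        archive[url] = body  # the "snapshot" of this resource
        if not content_type.startswith("text/html"):
            continue  # images and other files are saved but not parsed
        parser = LinkExtractor()
        parser.feed(body.decode("utf-8", errors="replace"))
        for link in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, link))
            if absolute.startswith(scope_prefix) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return archive
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. A production harvester would also need to respect robots.txt, rate limits and the auditing and management features the Library describes, none of which are shown here.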

The crawler is part of a larger global effort led by the International Internet Preservation Consortium to save our digital heritage before large swathes of it are lost as technology breaks down and becomes obsolete.

The National Library, however, also has more down-to-earth legal reasons for participating in the tool’s development. A parliamentary Act in 2003 requires the Library to collect, preserve and make accessible digital collections alongside traditional paper collections. It is a commonsense requirement, and one that indicates the direction in which laws surrounding digital information are heading.

As increasing amounts of our data, our history and our lives are stored and transmitted digitally, it is vital that methods and technologies exist to gather this information and store it for legal, social and historical purposes.

NZ ICT publication m-net.net.nz says that The National Library of New Zealand and The British Library are currently integrating the web curator tool into their digital preservation programmes, with a view to releasing it as an open source product before the end of the year.
