Council fire shines light on DR

A fiery disaster at Liverpool City Council has put the Australian local government sector on notice about the importance of business continuity and getting disaster recovery strategies right.

After a raging fire tore through the Liverpool City Council chambers in western Sydney, council records and information management professionals arrived at work on Monday to find they were back to Ground Zero. With the Council headquarters completely destroyed, they were asking themselves whether full recovery would be possible and how long it would take.

With the switchboard silent and the Council Web site off-line, outside observers wondered how effective the Council's plans had been in dealing with such an unlikely disaster, and other councils across the state and country took an immediate reality check on their own preparedness.

It was impossible to contact Council IT professionals this week to learn how they were dealing with the disaster, with phone and email services unobtainable.

Liverpool City Council information technology manager Barry Dinham gave an eerie preview of the challenges of planning for such a situation at an industry forum in 2008.

Dinham explained the challenges of developing a critical response plan to ensure business continuity.

"It’s a tough sell because everything obviously is money related so in a council environment money is pegged at how much you can have. Getting more funds for business continuity is an issue so what we try and do is do it in a simplistic way. As long as we have the documentation and test it and for those times when we have a power outage that we report back to senior management and try and get the funds to stop it from occurring again."

Dinham admitted that the Business Continuity plan was “a very low percentage of the IT budget, probably less than one percent”.

"My advice would be to make sure you have up-to-date documentation and that it's tested - even if it’s only a selective test - and that it has ownership from your senior management."

Mainstream media reports on the fire estimate the final damage bill could be in excess of A$20 million.

Labor mayor Wendy Waller reassured the Sydney Morning Herald this week that "the council operated an off-site computer back-up system where all important records were stored."

"The important council documents and records have a back-up off site," she added.

Inspector Brad Harrison of the NSW Fire Brigades said "the fire spread rapidly because of the amount of paperwork in the open-plan office building."

Ensuring continuity in the event of a natural disaster is a considerable challenge, as one IT manager at a major urban council observed this week.

The IT Manager highlighted that while many documents arrive electronically or are registered as soon as they arrive, a council must by necessity work with large A0-size plans for development applications, as these are too unwieldy to view on screen.

This particular metropolitan Sydney Council has an off-site tape backup storage facility and performs a full backup of terabytes of data nightly. It has installed an HP Data Protector-based disk backup and recovery facility on-site, featuring full hardware redundancy, as part of its new virtualised VMware SAN environment, installed in 2009.

The IT Manager is looking to establish a fully replicated off-site facility connected by fibre link to ensure speedy recovery of mission critical systems within 48 hours in case of an incident such as the Liverpool fire. Recovery from tape would take somewhat longer.

Some may be surprised that the Liverpool Council Web site was also taken down when the council chambers went up in smoke, but this IT Manager pointed out that Council Web sites these days are so tightly integrated with business systems that it’s necessary to host them internally, and that a separate external hosting arrangement is inflexible and unusual.

This Council looks at two separate objectives when it analyses business continuity for mission critical systems: its Recovery Point Objective (how much can be recovered) and its Recovery Time Objective (how soon).

"A council must look at every system it has, and with most councils today this involves very many," said the IT Manager.

"We must look at our rates and property subsystem, our financial systems, and ask how long we can last before restoring them in the event of a disaster like the Liverpool fire."
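The two objectives can be checked system by system. The following is a minimal sketch only, not the Council's method: the class, the function and all figures are hypothetical, and it assumes the worst-case data loss equals the interval between backups.

```python
from dataclasses import dataclass


@dataclass
class SystemTargets:
    """Hypothetical recovery targets for one business system."""
    name: str
    rpo_hours: float  # Recovery Point Objective: maximum tolerable data loss
    rto_hours: float  # Recovery Time Objective: maximum tolerable downtime


def meets_targets(system: SystemTargets,
                  backup_interval_hours: float,
                  estimated_restore_hours: float) -> bool:
    """Return True if the backup regime satisfies both objectives.

    Assumption: in the worst case, everything written since the last
    backup is lost, so data loss equals the backup interval.
    """
    worst_case_loss_hours = backup_interval_hours
    return (worst_case_loss_hours <= system.rpo_hours
            and estimated_restore_hours <= system.rto_hours)


# Illustrative figures only: a nightly backup (24 hours) and a 48-hour restore.
rates = SystemTargets("rates and property", rpo_hours=24, rto_hours=48)
finance = SystemTargets("financial systems", rpo_hours=4, rto_hours=48)

for system in (rates, finance):
    ok = meets_targets(system, backup_interval_hours=24,
                       estimated_restore_hours=48)
    print(f"{system.name}: {'meets targets' if ok else 'falls short'}")
```

With these made-up numbers a nightly backup would satisfy a 24-hour recovery point but not a 4-hour one, which is the kind of gap a replicated off-site facility is meant to close.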

Paperwork is being reduced considerably in council operations with the rise of email as a standard communications platform, and even faxing these days is mostly handled electronically. Where ratepayers bring in paper documents, these are usually keyed into business systems at the counter, reducing the impact of accidental destruction.

Andy Carnahan, Manager Information Services at NSW's Wingecarribee Shire Council, is well aware of the challenges of disaster recovery.

“It’s all well and good to pack the parachute, but the real challenge comes when you pull the cord in mid-air,” notes Carnahan.

“That’s when you find whether you’ve done the job right.”

Wingecarribee Shire has taken a pragmatic approach to DR.

“Two and a half years ago we built a modest disaster recovery centre and spent time testing our ability to recover our critical systems. My view was to get our hands dirty and actually build the centre, so as to make the mistakes and have the expertise in-house. We have recently virtualised our data centre and have a high-speed microwave link to our DR site, two important components for at least a lukewarm centre. We really like VMs as the fabric of our disaster recovery parachute!” said Carnahan.

“At present our key business systems could be running (i.e. the servers would be available) as at COB the previous day within one hour. However, our connections to the world would take longer and we would be chasing up PCs, switches, printers and other bits. Also the DR system would not have the horsepower to support large numbers, but so long as we have the VM guests in good shape, I am sure we could get experts in to ramp up performance.

“Of course, it is easy to say this; we have to set up a testing schedule to prove to ourselves and our Executive it is working and always available. I am a bit of a sceptic when it comes to assurances. Nothing like logging into the DR server and seeing your environment is there.”