Disk pulling ahead in backup stakes

By David Braue

Last year, Gartner analyst Bob Passmore, addressing attendees at the group's Planet Storage conference, declared the death of tape backup. "Tape will be unsuitable as a restore media" by 2008, he said, predicting that by then most data restores would come from disk rather than from tape, which has been standard practice for decades. Will he be proven correct? David Braue reports.

The debate about the future of disk versus tape has raged for years, with tape usually holding its own on cost: prices for tape storage are still measured in cents per gigabyte, while disk still costs many times as much. However, the introduction of low-speed, low-priced disk arrays built around conventional desktop ATA and Serial ATA interfaces has quickly brought down the cost of disk: during 2003, IDC notes, the overall average cost of disk dropped 31.8 percent, from $US39.79 per GB in 2002 to just $US27.13 last year.

Declining costs for disk-based storage have rapidly changed the dynamics of the disk-versus-tape argument, as customers increasingly introduce low-level disks as an intermediate storage level between server and tape. This strategy has completely changed thinking about the best way to manage data backup and archiving.

Super-sizing tape

ATA disk or no ATA disk, tape has had to do some serious growing up in recent years, thanks to the proliferation of storage area networks (SANs), which have encouraged organisations to centralise their data stores in order to reduce management overheads and eliminate potential version conflicts between data.

To support multi-terabyte storage arrays, manufacturers have had to continually push the envelope in terms of tape capacity and throughput. SAN-attached tape libraries employ multiple drives to stripe data from SAN storage at speeds fast enough to meet backup window requirements. Each successive generation of tape doubles capacity and runs a fraction faster than previous models, aiming to keep up with disk capacities that continue to grow at dizzying speed.

Today's tape technologies are led by Sony's SAIT-2, which holds 500GB of uncompressed data per tape (1.3TB compressed) and transfers data at 30MB/sec. That's several times the capacity and speed of a few years ago, when Quantum-developed SuperDLT ruled the roost.

But how far can tape go? Sony's fourth-generation SAIT-4 technology, likely to appear in the vicinity of 2010, is projected to store up to 4TB of uncompressed data, or 10.4TB of compressed data, per cartridge. Quantum projects its next-generation SDLT 1200 and 2400 drives will provide 640GB and 1.2TB native storage, respectively, with throughput surpassing 50MB/sec in the SDLT 1200 next year and 100MB/sec by the time the SDLT 2400 appears around 2007.
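Some quick arithmetic puts these roadmap figures in perspective. The sketch below, using the native capacities and transfer rates quoted above (and assuming decimal units, 1 GB = 1000 MB, with uninterrupted streaming), estimates how long each drive needs to fill a single cartridge:

```python
# Rough arithmetic: hours of continuous streaming needed to write one
# full cartridge at the drive's native (uncompressed) rate. Figures are
# the capacities and speeds quoted in the article; decimal units assumed.

def fill_time_hours(capacity_gb, rate_mb_s):
    """Time in hours to write capacity_gb at rate_mb_s, assuming 1 GB = 1000 MB."""
    return capacity_gb * 1000 / rate_mb_s / 3600

# SAIT-2 today: 500 GB native at 30 MB/sec
print(round(fill_time_hours(500, 30), 1))    # ~4.6 hours

# Projected SDLT 1200: 640 GB native at 50 MB/sec
print(round(fill_time_hours(640, 50), 1))    # ~3.6 hours

# Projected SDLT 2400: 1.2 TB native at 100 MB/sec
print(round(fill_time_hours(1200, 100), 1))  # ~3.3 hours
```

In other words, even as capacities quadruple, a full cartridge still takes several hours to write, which is why backup windows remain the binding constraint.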

That may sound like a lot, but is it enough to keep up with insatiable corporate demand for disk-based storage? Not if analyst figures are correct. According to IDC, last year, Asia-Pacific customers alone snapped up over 79 PB of new disk storage, an increase of 38.1 percent over 2002.

Globally, IDC reported that the total volume of storage shipped increased 39.5 percent year-on-year during the first quarter of 2004, reaching 247 petabytes shipped in that quarter alone. Compare that with a year and a half earlier, when IDC reported quarterly shipments of just 144 PB. At this rate of growth, the market will purchase 581 PB of new storage during the first quarter of 2007, more than double current levels, by the time tape drives reach 100MB/sec. Two years later, the market will reach 1028 PB shipped during the quarter, or more than 4 exabytes of new storage during that year alone.

Storage-related growth curves are dizzying, and managing that storage has become a major priority for IT administrators struggling to keep backups running quickly enough. With growth projections suggesting even medium-sized companies are likely to be using multiple terabytes of disk by 2007, what seems like fancifully fast tape now may soon be struggling to keep up.
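The projected figures can be reproduced with simple compound growth. Note that the quoted projections (581 PB in Q1 2007 and 1028 PB in Q1 2009, from a 247 PB base in Q1 2004) imply an annual growth rate of roughly 33 percent, an assumption inferred here from the article's own numbers rather than stated by IDC:

```python
# Compound-growth check of the article's projections. The ~33 percent
# annual rate is inferred from the quoted figures (247 PB in Q1 2004,
# 581 PB in Q1 2007, 1028 PB in Q1 2009), not stated explicitly.

def projected_pb(base_pb, annual_rate, years):
    """Projected quarterly shipments after compounding annual_rate for `years` years."""
    return base_pb * (1 + annual_rate) ** years

rate = 0.33
print(round(projected_pb(247, rate, 3)))  # Q1 2007: ~581 PB
print(round(projected_pb(247, rate, 5)))  # Q1 2009: ~1028 PB
```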

Getting smarter about backup

Customer concerns over tape strategies are reflected in the regular market surveys of storage giant StorageTek. Back in 2002, the company reported that reduced/eliminated backup windows were the second most important area of concern for IT managers, named by 51.8 percent of participants at a StorageTek seminar. Automated backup came in third, named by 45.9 percent of respondents.

Fast-forward two years, and priorities have changed considerably. The latest survey, launched in May, found that disaster recovery, data availability and storage on demand had pushed concerns over backup windows down the list of priorities; only 32.5 percent of respondents named backup windows as an issue in 2004, while automated backup was named as a priority by just 24.7 percent of respondents.

These changes in perception reflect shifting priorities amongst corporate IT decision-makers, who in turn are revising the agenda when it comes to backup strategies within their organisations. In their new vision, backup logistics have given way to issues of data availability and reliability, both of which are better served by disk-based storage systems. No wonder, then, that ATA and Serial ATA-based storage systems are becoming standard issue within all sorts of businesses.

Craig Tamlin, Australia-New Zealand country manager with SuperDLT developer Quantum, cautions that many companies looking to disk as a replacement backup medium will be sorely disappointed. "It's a false economy," he explains. "People have this perception that disk is fast, and if you're prepared to pay the right amounts of money, they can be fast. But a lot of these disk products are very, very slow. Generally, they're not designed for backup. If you need to pull back an entire server, you'll find that the data rate coming back from tape will be much higher than that of low-cost SATA disk drives."

The unaddressed problem Tamlin refers to lies in the design of SATA devices, which he says "become I/O bound" because they're not designed to handle the large blocks of data typical of a streaming tape backup. Aiming to capture the best of both worlds, Quantum has built its DX30 and DX100 near-line storage devices as disk arrays that act like tape: the device appears to servers as just another tape drive, but runs a file system optimised not for random-access retrieval (like conventional disk) but for the large block sizes typical of a tape backup stream. The net result, he says, is a significant performance increase over tape.
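The core idea behind such tape-emulating devices can be illustrated in miniature: accept arbitrary small writes, but flush them to the backing store only in large, fixed-size sequential blocks. The class and block size below are purely illustrative, not Quantum's actual design:

```python
# Illustrative sketch only: coalescing small incoming writes into large
# fixed-size blocks, the kind of optimisation a tape-emulating disk
# device makes for streaming backup workloads. Names and sizes are
# hypothetical, not taken from any real product.
class BlockCoalescer:
    def __init__(self, block_size=4 * 1024 * 1024):  # e.g. 4 MB blocks
        self.block_size = block_size
        self.buffer = bytearray()
        self.flushed_blocks = []          # stands in for sequential disk writes

    def write(self, data):
        """Buffer incoming data; emit a block each time a full one accumulates."""
        self.buffer.extend(data)
        while len(self.buffer) >= self.block_size:
            self.flushed_blocks.append(bytes(self.buffer[:self.block_size]))
            del self.buffer[:self.block_size]

    def close(self):
        """Flush any remaining partial block at end of stream."""
        if self.buffer:
            self.flushed_blocks.append(bytes(self.buffer))
            self.buffer.clear()
```

Large sequential writes keep the disk heads streaming rather than seeking, which is why a backup-tuned array can outrun a general-purpose one on the same hardware.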

Backup's hidden costs

While presumptions that disk storage is faster than tape may be incorrect, disk does have one significant benefit: accessibility. Graham Penn, Asia-Pacific director of storage research with IDC, points out that there is more at issue in the disk-versus-tape battle than just speed: tape carries a far higher manual-handling cost, often three to seven times the price of the equipment itself, that is frequently ignored in total cost of ownership calculations.

Consider the typical branch office, where data is backed up nightly by a staff member who is charged with remembering to insert the day's backup tape into the tape drive before leaving for home. These tapes are almost never tested in the morning, and may not even be usable, yet they are relied upon as the primary source of information.

In larger environments, similar problems become much bigger issues as the number of tapes to be tested and cycled increases with the amount of data being backed up. Data restores are even more problematic, since large numbers of backup tapes must often be searched to locate a single file that a user has deleted. IT managers can waste hours locating a single piece of information amongst terabytes of tape-based data, throwing off their schedules and forcing them to cut corners on other tasks.

"At the end of the day it's not the cost of the tape which is prohibitive," Penn explains.

"Labour is the hidden cost. In over 90 percent of cases, you're not doing a full bare-metal restore from tape; people use it for finding a version of a file that someone deleted yesterday. You've got to have a very systematic filing system to find the tape you need, or the time to restore can be ten times the time it took to back up the data in the first place."

The solution? According to Penn, the best approach is to store a recent full backup and subsequent incremental backups on a near-line disk array that's permanently online. That way, recent backups are always available, and users can even be given the power to request their own undeletes through an online interface. This saves storage administrators massive amounts of time.
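The full-plus-incremental scheme Penn describes can be sketched in a few lines. The script below is a minimal illustration, assuming hypothetical directory names and layout; a real deployment would of course use a backup product rather than a hand-rolled copy loop:

```python
# Minimal sketch of "recent full plus incrementals on near-line disk".
# Directory layout and naming are hypothetical, for illustration only.
import os
import shutil
import time

def backup(source_dir, nearline_dir, last_full_time=None):
    """Copy files to near-line disk: everything on a full run, or only
    files modified since the last full backup on an incremental run."""
    label = "full" if last_full_time is None else "incr"
    dest = os.path.join(nearline_dir, "%s-%d" % (label, int(time.time())))
    os.makedirs(dest, exist_ok=True)
    for name in os.listdir(source_dir):
        path = os.path.join(source_dir, name)
        if not os.path.isfile(path):
            continue
        # Full run copies everything; incremental copies only newer files.
        if last_full_time is None or os.path.getmtime(path) > last_full_time:
            shutil.copy2(path, dest)  # recent copies stay online for quick undeletes
    return dest
```

Because the most recent full and its incrementals stay on spinning disk, yesterday's deleted file is a directory listing away rather than a tape hunt, which is precisely the labour saving Penn is pointing to.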

In a scenario where disk is used to store recent backups, tape is pushed further down the food chain until it takes on a somewhat different role. Instead of being used to store ever-changing versions of the truth, tape becomes a method for archiving point-in-time copies of a variety of information. Unnecessary files may even be deleted from the near-line disk well before the data is moved to tape, ensuring that the backups archived to tape contain the most relevant and important data possible.

Making this approach work, however, requires several other changes to storage management policy. Wherever possible, enterprise storage should be centralised onto a single SAN so that backups are always taken from the same, single source. Companies should use backup agents, installed on remote desktop and notebook PCs, to archive the often important work of executives whose backup discipline is typically less than perfect.

Most importantly of all, disaster recovery plans need to be tested, tested, and tested again. Whether it's based on disk or tape, if your backup system isn't doing the right thing, it's worth nothing.
