24 September, 1997
Disaster avoidance -- because it will happen
By Matthew JC. Powell, Tom Duffy, Peter Young, Heather Mackey, Lincoln
Spector, Michael Lasky, and Scott Spanbauer
Feature
SYDNEY
24/9/97
Parts of this article are excerpted from "How to keep your PC trouble free",
which will appear in the November 1997 issue of Australian PC World, on sale from
24 October 1997.
They don't make disasters like they used to. Once upon a time plagues of locusts,
earthquakes, fires and floods were the only things you really had to worry about. In the
information age, a voltage spike here, a software glitch there, even a handy utility you
download from the Web can all lead to catastrophes of Old Testament proportions.
No matter how paranoid - I mean, careful - you are, it's likely that something, somehow,
will go wrong with your computer at some point. There's no guaranteed way to avoid
glitches, but there are things you can do to minimise their impact.
Case in point 1
At the St James chemical plant near Baton Rouge, a hairy crash in September 1995 broke
through all the barriers that had been erected to protect data.
The network administrator was on holiday (of course), and the usual backups had not been
performed for a week. But an IS team with open eyes seizes a failure as an
opportunity for intelligent analysis.
When the IS team finally cleared the brambles away from its recovery system, it figured
out what types of problems were causing crashes. Thanks to a new approach to data
protection, the plant's Novell network is now protected by a backup plan that IS site
coordinator, Brian Durham, calls "almost too good to be true".
At St James, production engineers monitor the process of making styrene, constantly
checking for ways to increase yield and decrease cost. Modifications to the styrene recipe
go into the plant's computers, along with storage-gobbling graphics files and AutoCAD
drawings. Also on the network are the typical files and software found in any office: the
spreadsheets and financial software and the mail and post office used by the plant's
approximately 130 employees and various outside contractors.
When the network went down that September, it held all this information hostage for two
full working days. At the end of it, after analysing what went wrong, the St James team
came up with some sobering answers. Though it had been careful to provide hardware backup
- using redundant servers and Cheyenne's ArcServe - the system had crashed anyway because
of a faulty software write to the SCSI card.
"It wiped out the Novell partitions on the network drive," Durham says.
In the end, it cost them more than $US100,000 to find out something that's particularly
true in today's distributed environments: hardware solutions do not protect against
software failures.
Something had to be done. Soon after the crash, Durham happened to look at some
promotional literature about LANtegrity, a novel backup solution from Network Integrity.
Durham examined the brochure and remembers being sceptical simply because it sounded too
perfect. But the LANtegrity solution was so inexpensive, it was worth a gamble.
Aside from the licensing fee, the only real outlay was for an automatic tape loader and
compressed tapes. The system did require a server, but because Durham had been mirroring
each active server with a backup, he had servers to spare.
Unlike traditional backup systems where a complete copy of all files is usually made
weekly with incremental backups during the week, LANtegrity only copies files that have
changed. (A complete copy of all files is made during installation.) Once the system has
an image of a file, it never has to copy that file again unless it changes.
Fault-resilience is built in - if a server does go down, the LANtegrity server steps in
for it and automatically becomes that server.
Backup
Another key difference is that backup is performed continuously. The LANtegrity server
acts as a single user on the network. When a file's time-date stamp or archive flag
indicates that it's changed, that file goes into a queue. NetWare's Storage Management
Services processes it just like any other user's request and sends it over to the
LANtegrity server when traffic permits.
The LANtegrity server keeps these active files in cache, and it also automatically writes
them to tape for archiving and off-site backup purposes.
"It's not doing all the data all at once," explains Tim Millunzi, Network
Integrity technical support manager. "So you're no longer trying to artificially
compress a backup in some fixed length of time."
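To make the idea concrete, here is a minimal Python sketch of change-driven backup as described above: a scan queues any file whose time stamp has moved since the last look, and the queue is drained a few files at a time. It is an illustration only, not LANtegrity's code. The function names, the in-memory record of time stamps and the max_files limit (a crude stand-in for "when traffic permits") are all assumptions.

```python
import os
from collections import deque


def scan_for_changes(root, last_seen, queue):
    """Queue every file under root whose modification time has changed.

    last_seen is a dict of path -> time stamp from the previous scan.
    It is hypothetical state; a real product would keep it on disk.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            if last_seen.get(path) != mtime:
                queue.append(path)
                last_seen[path] = mtime
    return queue


def drain(queue, copy_file, max_files):
    """Copy at most max_files queued files and leave the rest for later.

    copy_file is whatever callable actually moves the data (assumed).
    """
    copied = []
    while queue and len(copied) < max_files:
        path = queue.popleft()
        copy_file(path)
        copied.append(path)
    return copied
```

Calling scan_for_changes repeatedly with the same last_seen dictionary queues a file once per change, so the work is spread through the day rather than squeezed into a nightly window.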
The LANtegrity system kept a low profile even as Durham added more and more servers
and storage.
"The entire plant had four gigs available when I came here," Durham says.
"Now we have 100, of which we're using about 60. I took spare equipment and made
servers out of all of them. I was able to put three systems online for the price of
one."
One reason for this increase was a desire to protect valuable data for (and perhaps from)
desktop users.
"We took a lot of the things people were using on workstations that were unprotected,
and we put it onto the system where it would be backed up and protected," Durham
says.
One LANtegrity server (a Compaq ProLiant 2000 P 200 with 512Mb of RAM and 22Gb of drive
space) protects the main file and application server (an identical ProLiant 2000), two
Compaq ProSignias (66MHz 486es with 256Mb of RAM and 22Gb drive spaces), and a ProLiant
4500 (a 66MHz Pentium with 512Mb of RAM and a 17Gb drive space).
The additional servers are used for storing AutoCAD drawings, an electronic document
management systems library, an Oracle database, and .AVI files.
It wasn't long before the new system was put to the test: three failures during the month
after installation. For the first one, Durham was on holiday (Murphy's laws in IS are
nothing if not consistent), and the LANtegrity system had to be manually told to step in.
Now the system is configured to step in automatically.
"Since then, we have had a total of eight failures," Durham says, "and
users typically don't even know that we've had a failure. [LANtegrity] instantly takes
over. And because we're running Windows 95 on all our workstations, Windows 95
automatically reconnects by itself as soon as that server becomes visible again."
Case in point 2
When the main server in the Brazil office of Young & Rubicam Advertising crashed late
one morning in December 1996, it could have been a catastrophe. Instead, it set in motion
some well-detailed plans.
A download of the company's Lotus Notes application, which the ad agency depends on for
its creative work, media plans and strategy, was immediately initiated from its New York
office via its WAN. By the end of business that day, the Brazil system was operational.
All the while, the ad agency's data was well protected by four levels of redundant backup.
"From a standpoint of data, we didn't lose anything," said David Gutierrez,
Young & Rubicam's vice president/regional technology officer for the Southern
Hemisphere, and the man charged with protecting client data in the increasingly
competitive market of Latin America.
When operations go international, so do concerns about security. And global companies
don't just worry about server crashes and natural disasters. With worldwide threats such
as industrial espionage, they need to consider what Kathleen Harvey, senior information
security analyst at Datapro Information Services Group, calls "global risk". She
said the key is to create, as Young & Rubicam did, a consistent policy across the
entire organisation - not an easily achieved goal.
The key word here is consistency. "If you're an attacker, you'll look for the weakest
link," said Jackie Hyde, an information security analyst at Datapro in the UK.
Datapro conducted a survey on global security, which included 1342 respondents from the
US, Canada, Central and South America, Europe and the Asia-Pacific region.
One weak link can lead to hefty losses, especially with the increasing trend among global
companies to consolidate data centres from hundreds worldwide to the double and even
single digits.
In support of this coordinated single-policy approach, large disaster recovery service
providers such as IBM and Comdisco recently announced global business recovery services.
That means companies can put their entire organisation under one umbrella policy rather
than contracting on a regional basis.
The most progressive organisations, according to Datapro, are setting up a small central
security team at headquarters and appointing a person responsible for security within each
business unit around the world. The central team, headed by a corporate security manager,
conducts a risk analysis for the entire organisation and then selects a methodology to
use around the world.
A good illustration of a global policy with local controls is found at Telstra, here in
Australia.
"We try to work to a collective security model which is adapted to prevailing local
conditions and circumstances," said David Harris, general manager for
corporate security at Telstra, which has operations and joint ventures spanning Asia,
Europe and North America.
"We have people with skill sets in specialised areas like security who form a
centralised resource that can be drawn on, but we also need the experience of the country
manager."
Key to this approach is communication between the policy makers and the policy
implementers, said John Clark, director of Andersen Consulting's information security
practice. "I've seen cases firsthand where companies have a central security group in
one country, and they distribute these policies to other countries that have not
necessarily bought off on those policies," he said.
Another complication is differing regional attitudes about the importance of security.
Even at Telstra, awareness of hacker intrusion is far lower than in the US, company
officials said. This makes it more difficult to get employees to focus on the problem.
But it's clear that security can't remain on the back burner. "We have not yet seen
reports of global disasters - you know, transnational computer breakdowns," author
Roche said.
"But with the rise of distributed processing and global telecommunications networking
resulting in more and more dependence on international telecommunications circuits, we're
bound to see this type of thing occur more."
Antivirus software
Every parent knows that when the little ones go off to preschool, they're bound to come
home with the flu. The same is true of the Internet. As soon as you're sharing data and
programs with millions of users online, your computer's chances of coming down with a
virus increase a hundredfold. You need an up-to-date antivirus package, and you need to
use it correctly. Here's how:
1) Buy an antivirus program. Look for one that operates in the background, checking files
as you work. It should be easy to set up, use, and - very importantly - update. Dr
Solomon's AntiVirus Toolkit is highly regarded by virtue of its ability to detect a large
number of viruses, and remove 89 per cent of them. Symantec's Norton AntiVirus
detects as many viruses but is able to remove only 77 per cent of them. It's simple to
use, versatile, and amazingly easy to update - you just click a button and it updates
itself over the Net.
2) Run it in the background. This is the best way to use an antivirus program because it
stays out of your way.
3) Update virus definitions at least monthly. New viruses are cropping up all the time. If
you want to stay clear of them, you need to get regular updates from the folks who made
your software.
Dr Solomon's Software
Tel (03) 9690 0455 Fax (03) 9690 0455
INFO: www.drsolomon.com
Symantec
Tel (02) 9850 1000 Fax (02) 9850 1001
INFO: www.symantec.com
McAfee (The Paradigm Agency)
Tel (02) 9437 5866 Fax (02) 9439 5166
INFO: www.mcafee.com
Solid fix-it software
Windows 95 comes with a handful of serviceable utilities for keeping the hard drive
healthy, but you only get what you pay for. To be absolutely sure, users should shell out
for a good commercial utility package. There are two top-flight candidates for this job:
Symantec's long-established Norton Utilities and Helix's newcomer Nuts & Bolts.
What separates them? Mainly price, with Nuts & Bolts being the dearer of the two.
Helix's package also sports a better user interface, with clear dialogue boxes and
excellent tools.
Both packages test and fix a hard drive more thoroughly than does Windows' ScanDisk. For
instance, they check the partition table and the boot sector for errors that can render
your drive inaccessible - things that ScanDisk ignores.
Symantec's and Helix's defraggers are also faster and considerably safer because, to avoid
errors, they compare the file fragments to the originals as they move them. They're also
true 32-bit programs, while ScanDisk and Disk Defragmenter are old-fashioned, 16-bit
tools. A 16-bit program is more prone to crashing, and if there's anything you don't want
to crash, it's a disk scanner or defragger.
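The compare-as-you-move safeguard is easy to picture in code. The Python sketch below applies the same principle to whole files rather than disk clusters: copy, compare the copy with the original, and only then delete the original. It is an assumption-laden illustration, not how Norton Utilities or Nuts & Bolts work internally, and the names file_digest and verified_move are made up for this example.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src, dst):
    """Move src to dst, deleting src only once the copy is proven identical."""
    shutil.copyfile(src, dst)
    if file_digest(src) != file_digest(dst):
        # The copy is bad: discard it and leave the original untouched.
        os.remove(dst)
        raise OSError("copy of %s failed verification" % src)
    os.remove(src)
    return dst
```

The point of the design is the ordering: at no moment does the only good copy of the data sit in a location that has not been checked.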
Norton solutions
Price: Norton Utilities 2.0 for Windows 95 $129 RRP. Norton AntiVirus 2.0 $89.99
Symantec
Tel (02) 9850 1000 Fax (02) 9850 1001
INFO: www.symantec.com
Nuts & Bolts solutions
Price: $149 RRP
Light Years Ahead
Tel (02) 9477 6666 Fax (02) 9477 6655
INFO: www.helixsoftware.com
A surge suppressor
Power corrupts, and electric power corrupts electrically. Unless you install a good surge
suppressor, a sudden jolt of electricity can wipe out the computer. This is especially
true for users in an area with frequent electrical storms or a building with ancient
wiring. Sure, it's a small risk, but do you want to bet all of your hard work on it? A
surge suppressor looks and works like a power strip, but it also protects the devices
plugged into it from electrical surges that can fry your hardware. If the surge suppressor
is hit by a bigger jolt than it can handle, it will self-destruct, shutting off power to
the computer and sacrificing itself for the good of more expensive hardware. Best of all,
there's quite a range to stock.
A backup drive
End user's guide to backup security
1) Get a tape drive. The easiest backup in the world is one where you click a button and
walk away. The cheapest way to do this is to buy a tape backup device with at least the
capacity of your hard drive, such as the Iomega Ditto 2Gb or the Ditto Easy 3200.
Removable-disk units like Iomega's Zip drive just aren't big enough. And don't even think
about floppies.
Internal backup drives are cheaper than external drives but require you to pop the hood
and install them; most external drives are slower but simply plug into your PC's parallel
port.
2) Test it. Sometimes a tape drive seems to be working fine. Then you try to restore a
file (usually the day your big project is due) and realise the drive only looks like it's
been backing up your files. To protect yourself, back up and restore a few files when you
set up the drive's software. You should run this test about once a month.
3) Set up a schedule. Once you've tested the tape drive, do a full backup. Then at the end
of every workday, do an incremental backup, which copies only those files that were
created or have changed since the last backup.
Every two weeks, change tapes and do another full backup. With two or three tapes, you'll
have a month's worth of data.
4) Store your data off-site. If your computer is stolen or destroyed in a fire, you don't
want your data to go with it.
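For readers who like to see a routine written down, steps 2 and 3 can be sketched in Python. The first function decides what kind of backup is due on a given date (a full backup and a tape change every two weeks, incrementals at the end of each workday in between); the second checks that a restored file really matches the original. The function names, the Monday-to-Friday working week and the default of three tapes are assumptions made for the example, and nothing here drives an actual tape drive.

```python
import filecmp
from datetime import date


def plan_backup(first_full, today, tapes=3, cycle_days=14):
    """Return (kind, tape_number) for today, or None if no backup is due.

    first_full is the date of the very first full backup.
    Tapes are numbered from 1 and reused in rotation.
    """
    elapsed = (today - first_full).days
    if elapsed < 0:
        raise ValueError("today is before the first full backup")
    cycle, day_in_cycle = divmod(elapsed, cycle_days)
    tape = cycle % tapes + 1
    if day_in_cycle == 0:
        return ("full", tape)          # new fortnight: change tapes, copy everything
    if today.weekday() >= 5:
        return None                    # Saturday or Sunday (assumed non-working)
    return ("incremental", tape)       # only files created or changed since last backup


def restore_matches(original, restored):
    """Return True if the restored file is byte-for-byte identical."""
    return filecmp.cmp(original, restored, shallow=False)
```

With a first full backup on Monday 22 September 1997, plan_backup(date(1997, 9, 22), date(1997, 10, 6)) calls for a full backup on tape 2. Running restore_matches on a few restored files each month is the cheap insurance step 2 recommends.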
An uninterruptible power supply
If work is so critical and time-sensitive that a sudden system crash would be an absolute
disaster, install yourself an uninterruptible power supply. This is basically a
battery-operated 240V supply with a surge suppressor attached. If a power failure occurs, you'll
get enough juice to save your work and close things down gracefully.
Not long ago, UPS was an irrelevant issue for Australian businesses, since State- and
Commonwealth-operated electricity companies provided a high quality, rarely interrupted
source of power. With the growth in privatisation of these resources, both the quality and
reliability of electricity have allegedly declined.
According to John Pignolet, from Australian UPS manufacturer PowerTech, uninterruptible
power supplies perform filtering and "power conditioning" on the mains power.
This is the act of removing surges, sags, spikes, electrical noise and other impurities
from the mains power supply. The effects of these phenomena are most visible to humans in
the effects they have on incandescent lights (dimming, flickers, flashing).
The effects they have on computers are very serious, and can result in stress on computer
components, causing premature hardware failures, or corruption of RAM, cache or hard drive
data.
ARN is featuring UPSs in the November 5 issue.
[Copyright 1997,1998, IDG Communications Pty Ltd. All rights reserved.]