ATON Computing’s Computer Disaster Avoidance
Program is designed to cost-effectively reduce the
probability of business disruption due to computer
failure and provide a recovery plan to minimize the
impact if such a failure occurs.
The Computer Disaster Avoidance Program (CDAP) is a consultative process which
will provide the client with a sense of security, knowing that everything possible
has been done to prevent computer failure. CDAP is a scalable product, applicable
to all organizations that depend on their computers, regardless of size. CDAP
utilizes a step-by-step, analytical approach to data gathering and analysis. Each
piece of information, each assumption, each conclusion, is documented for review by
management.
The Computer Disaster Avoidance Program comprises five phases:
- Disaster Definition & Risk Assessment
The company’s business functions will be reviewed, computer systems required to
support those functions will be catalogued and potential causes of disruption will
be evaluated for severity and probability of occurrence.
- Identify/Implement Computer Processing Safeguards
Recommendations will be made regarding existing and future safeguards designed to
reduce the company’s exposure to potential business disruptions.
- Develop Network Recovery Plan
A Disaster Recovery Plan will be developed for each level of business disruption
severity, outlining the steps required to restore operations.
- Plan Testing & Updating
A schedule for testing the Disaster Recovery Plan and a method of updating the
Plan’s details will be implemented.
- Monthly System Maintenance
Monthly System Maintenance procedures will be developed including scheduled
backups, off-site storage of backup media, updating anti-virus & operating system
software and user security review.
Most people think of Disaster Recovery Planning when the topic of computer problems
is discussed. They believe that making a daily backup and knowing how to restore from
tape is all they need to do. Unfortunately, case studies have shown that a company
needs to do more than make a tape backup if it is to survive a computer failure.
Development of a comprehensive CDAP is a multiphase project that requires
significant effort on the part of the planner and the overt support of upper
management. The following is an outline of the process necessary to build a CDAP
which will act as the core of a Business Continuation Plan in case of an emergency.
- Disaster Definition & Risk Assessment
- Define Business Functions
- Network Configuration
- Workstations & Peripherals
- Computer Software Inventory
- Conduct Risk Assessment
- Identify Disaster Prevention Capabilities
- Develop Network Recovery Plan
- Train Recovery Teams
- Test and Maintain Plan
- Develop and Implement Periodic Maintenance
The results of each function listed above are a combination of factual data,
client assumptions/goals, consultant knowledge/experience and a dose of reality.
Each phase will be fully documented, using forms and methods developed by ATON as
described below.
DATA COLLECTION
The data collection process encompasses a wide range of information covering all
aspects of the company’s business, including business functions performed, computer
hardware used and computer software installed.
Data Collection - Business Functions Defined
Each business function must be defined as an individual entity. The definition
must include:
- What is the function
- Who is responsible for the function
- Who does it
- What computer hardware/software is necessary to perform the function
- What are the results of the function
- Is the function seasonal
- How critical is the function
- What is the potential loss if the function cannot be performed for 24, 48 or 72 hours
- What are the existing contingency plans
- Is there insurance coverage for this function
The result of the functional analysis is a comprehensive picture of the company’s
operations.
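As an illustration only, the questions above could be captured in one record per business function so that unanswered items are easy to spot. The sketch below is not ATON's form; the class name, field names, the 1-to-5 criticality scale and the sample figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    # Field names are hypothetical; they mirror the questions in the list above.
    name: str                        # what is the function
    owner: str                       # who is responsible for it
    performers: list[str]            # who does it
    systems: list[str]               # hardware/software needed to perform it
    results: str                     # what the function produces
    seasonal: bool
    criticality: int                 # assumed scale: 1 (low) to 5 (high)
    loss_by_hours: dict[int, float]  # estimated loss after 24, 48, 72 hours
    contingency_plan: str = ""
    insured: bool = False

    def gaps(self) -> list[str]:
        """Return the open items a planner would still need to resolve."""
        found = []
        if not self.contingency_plan:
            found.append("no contingency plan")
        if not self.insured:
            found.append("no insurance coverage")
        missing = [h for h in (24, 48, 72) if h not in self.loss_by_hours]
        if missing:
            found.append(f"no loss estimate for {missing} hours")
        return found


# Sample data invented for the example.
invoicing = BusinessFunction(
    name="Customer invoicing",
    owner="Controller",
    performers=["A/R clerk"],
    systems=["file server", "accounting package"],
    results="Monthly invoices",
    seasonal=False,
    criticality=4,
    loss_by_hours={24: 500.0, 48: 2000.0},
)
print(invoicing.gaps())
```

Running the record through gaps() for every function gives the planner a short list of follow-up questions before the risk assessment begins.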
Data Collection - Network Configuration
If there is a LAN or WAN in place, the planner should define the network in
complete detail. The details include:
- What make is the server
- How much memory
- What size hard disk(s)
- What monitor
- What level of SFT (System Fault Tolerance)
- What ports are installed
- Tape drive
- CD
- Other hardware
- What manufacturer of network software
- What version
- How many users
- Does installation software exist
- Wiring configuration
- Printer configuration
In addition to the Network Server, any print or communication servers should be
documented fully.
Data Collection - Workstations & Peripherals
All computer hardware other than the network server(s) should be inventoried.
Inventory information should include:
- Make
- Model
- Configuration
- Modems
- Operating system - does OS installation software exist
- Printer Drivers - does driver installation software exist
Data Collection - Computer Software
All software identified in the business function definition process must be
inventoried in the event it must be re-installed. The software inventory should include:
- Version
- Installation location
- Does installation software exist
- Special requirements or configuration
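One practical use of the software inventory is to flag packages that could not be re-installed because no installation media is on hand. The following is a minimal sketch under that assumption; the SoftwareItem fields, the reinstall_risks helper and the sample entries are hypothetical and are not part of the CDAP forms.

```python
from dataclasses import dataclass


@dataclass
class SoftwareItem:
    # Hypothetical fields that follow the inventory list above.
    name: str
    version: str
    install_location: str
    has_install_media: bool
    special_config: str = ""


def reinstall_risks(inventory):
    """Return the names of packages that have no installation media on hand."""
    return sorted(item.name for item in inventory if not item.has_install_media)


# Sample entries invented for the example.
inventory = [
    SoftwareItem("Accounting package", "4.2", "server volume", True),
    SoftwareItem("Payroll add-on", "1.0", "workstation 3", False),
    SoftwareItem("Word processor", "6.0", "all workstations", True),
]
print(reinstall_risks(inventory))
```

The same check could be applied to the workstation inventory, where the question is whether operating system and printer driver installation software exists.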
CONDUCT RISK ANALYSIS
The Risk Analysis function is the most subjective of all CDAP phases. Disasters
can come in many forms -- natural disasters such as earthquakes, hurricanes, fires
or floods; manmade disasters such as power outages, internal vandalism by a
disgruntled employee or external vandalism such as a computer virus; or hardware failure.
Of all these potential disaster types, only the likelihood of hardware failure can
be calculated with mathematical rigor. Each component’s manufacturer does extensive
testing to determine the MTBF (mean time between failures) of its product. By
analyzing the MTBF of all components in a given computer system, it is possible to
compute the statistical probability of computer failure. In all other instances, it
is impossible to know what one’s actual risk is.
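The MTBF calculation can be sketched as follows. This is a simplified model, not a method stated in the text: it assumes each component fails at a constant rate of 1/MTBF, that components fail independently, and that the failure of any one component takes the system down. The component names and MTBF figures are invented for illustration.

```python
import math


def system_mtbf(component_mtbf_hours):
    """Combined MTBF when any single component failure stops the system."""
    failure_rate = sum(1.0 / mtbf for mtbf in component_mtbf_hours)
    return 1.0 / failure_rate


def failure_probability(component_mtbf_hours, period_hours):
    """Probability of at least one failure within the period (constant-rate model)."""
    failure_rate = sum(1.0 / mtbf for mtbf in component_mtbf_hours)
    return 1.0 - math.exp(-failure_rate * period_hours)


# Hypothetical MTBF figures, in hours.
components = {
    "power supply": 100_000,
    "hard disk": 300_000,
    "system board": 150_000,
}
mtbf_values = list(components.values())

print(f"System MTBF: {system_mtbf(mtbf_values):,.0f} hours")
print(f"Chance of failure in one year: {failure_probability(mtbf_values, 8760):.1%}")
```

With these sample figures the combined MTBF is 50,000 hours, which is lower than that of any single component, and the chance of at least one failure in a year of continuous operation is roughly 16 percent.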
In light of this lack of statistics, the only logical method is to review the
various dangers, assess the potential disasters by empirical means, calculate the
cost of recovery from each and apply a criticality scale to rank them.
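One possible way to turn that review into a ranking is to give each danger an empirical likelihood rating, multiply it by the estimated recovery cost, and sort by the result. The sketch below assumes a 1-to-5 likelihood rating and uses invented cost figures; the text does not prescribe a scoring formula, so this is only one reasonable choice.

```python
def rank_risks(risks):
    """Rank risks by exposure (likelihood rating x recovery cost), highest first.

    risks: list of (name, likelihood rating 1-5, estimated recovery cost).
    """
    scored = [(name, likelihood * cost) for name, likelihood, cost in risks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Ratings and costs are hypothetical, for illustration only.
risks = [
    ("Computer virus", 4, 5_000),
    ("Power outage", 3, 2_000),
    ("Fire", 1, 80_000),
    ("Server disk failure", 2, 15_000),
]

for name, exposure in rank_risks(risks):
    print(f"{name:<20} {exposure:>8,}")
```

In this example fire ranks first despite its low likelihood because its recovery cost dominates, which is the kind of result a criticality scale is meant to surface for management review.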
IDENTIFY DISASTER PREVENTION CAPABILITIES
Once the risks are known, disaster prevention receives the planner’s focus.
Areas of concern are:
- Fire Prevention
- Flood Detection (if applicable)
- Uninterruptible Power requirements
- System security
- System backup/off-site storage program
DEVELOP NETWORK RECOVERY PLAN
The Network Recovery Plan is the recipe for restoring critical business functions,
utilizing minimum system resources in the least amount of time possible.
The Network Recovery Plan (NRP) should provide a detailed response to any
disruption in computer services, ranging from a virus on a single workstation to the
complete destruction of the network. There should be a decision matrix compiled
to help management determine what responses are applicable for a given situation.
For each disaster, the following should be defined:
- Minimum service requirements
- Critical applications
- Required hardware
- Required communications
- Required peripherals
- Location/availability of software
- Necessary personnel
- Time required to restore service
- Steps necessary to implement recovery
- Recovery costs
- Alternatives (use different workstation; different location)
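A decision matrix of this kind can be as simple as a lookup from disruption level to agreed response. The sketch below is an assumption-laden illustration: the four levels, the classification rules and the response wording are hypothetical, and a real plan would carry the full set of items listed above for each level.

```python
from enum import IntEnum


class Disruption(IntEnum):
    # Hypothetical severity levels, from least to most severe.
    SINGLE_WORKSTATION = 1
    MULTIPLE_WORKSTATIONS = 2
    SERVER_DOWN = 3
    NETWORK_DESTROYED = 4


# Hypothetical responses; a real plan would list the steps agreed with management.
RESPONSES = {
    Disruption.SINGLE_WORKSTATION: "Move the user to a different workstation",
    Disruption.MULTIPLE_WORKSTATIONS: "Restore critical applications on spare workstations",
    Disruption.SERVER_DOWN: "Rebuild the server and restore from the latest backup",
    Disruption.NETWORK_DESTROYED: "Relocate and rebuild from off-site backups",
}


def classify(workstations_down, server_up, site_usable):
    """Map the observed situation to a disruption level, most severe case first."""
    if not site_usable:
        return Disruption.NETWORK_DESTROYED
    if not server_up:
        return Disruption.SERVER_DOWN
    if workstations_down > 1:
        return Disruption.MULTIPLE_WORKSTATIONS
    if workstations_down == 1:
        return Disruption.SINGLE_WORKSTATION
    raise ValueError("no disruption reported")


level = classify(workstations_down=1, server_up=True, site_usable=True)
print(level.name, "->", RESPONSES[level])
```

Checking the most severe condition first means a site-wide loss is never mistaken for a workstation problem, even when both are reported together.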
TRAIN RECOVERY TEAMS
Once a Network Recovery Plan has been developed, it is necessary to instruct
company personnel on their responsibilities during the recovery process.
TEST AND MAINTAIN PLAN
A Network Recovery Plan is of no value if it is not tested and updated on a
regular basis. Every computer component that changes, every software application
that is upgraded, every personnel change has a direct impact on the success of
the NRP. Each change must be documented in the CDAP and procedures need to be
modified to accommodate those changes.
MONTHLY SYSTEM MAINTENANCE
Monthly System Maintenance procedures will be developed to minimize the potential
for a computer failure and to generate the tools necessary to recover should such a
failure occur. Each client’s maintenance needs differ but a partial list of tasks
includes:
- Update virus definitions on all workstations
- Update all workstation/server operating systems when appropriate
- Review server logs and note any difficulties
- Review backup logs and randomly sample backup tapes
- Store a copy of the system backup at our facility
- Update hardware/software inventory
- Answer any questions pertaining to the installed hardware/software
- Provide the client with written documentation of the work done
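The random sampling of backup tapes mentioned above could be done with a few lines of code so that the choice of tapes is unbiased and repeatable. This is a minimal sketch; the tapes_to_verify helper, the tape labels and the sample size are assumptions made for the example.

```python
import random


def tapes_to_verify(tape_labels, sample_size, seed=None):
    """Pick a random subset of tapes for a test restore.

    Passing a seed makes the selection repeatable, so it can be recorded
    in the written documentation of the visit.
    """
    rng = random.Random(seed)
    size = min(sample_size, len(tape_labels))
    return sorted(rng.sample(list(tape_labels), size))


# Hypothetical labels: one tape per backup day over four weeks.
tapes = [f"WEEK{week}-DAY{day}" for week in range(1, 5) for day in range(1, 6)]
print(tapes_to_verify(tapes, sample_size=3, seed=2024))
```

Each selected tape would then be restored to a scratch location and compared against the backup log to confirm that the data is readable.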