Causes of downtime

The causes of downtime can be either planned or unplanned events. As shown in Table 8-2, planned events account for as much as 30% of downtime. As we have discussed above, rolling upgrades and hot replacement can reduce planned downtime. The more important issue, however, is how to minimize unplanned downtime, because nobody knows when it will occur and every business requires the system to be up during business hours. Table 8-2 shows that software failures are responsible for 40% of system downtime. Software failures include network software failures, server software failures, and client software failures.

Table 8-2 Causes of downtime

Cause of downtime         Percentage
Software failures         40%
Hardware failures         10%
Human errors              15%
Environmental problems    5%
Planned downtime          30%
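
To make the percentages in Table 8-2 more concrete, the following minimal sketch converts them into hours of downtime per year. The shares come from the table. The availability figure of 99.9% and the names DOWNTIME_SHARE and downtime_hours_by_cause are assumptions chosen for illustration and do not come from any WebSphere tool.

```python
# Shares of total downtime by cause, as listed in Table 8-2.
DOWNTIME_SHARE = {
    "Software failures": 0.40,
    "Hardware failures": 0.10,
    "Human errors": 0.15,
    "Environmental problems": 0.05,
    "Planned downtime": 0.30,
}

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_by_cause(availability):
    """Split the yearly downtime implied by `availability` across the causes.

    `availability` is a fraction between 0 and 1 (for example 0.999).
    """
    total_downtime = (1.0 - availability) * HOURS_PER_YEAR
    return {cause: share * total_downtime
            for cause, share in DOWNTIME_SHARE.items()}


if __name__ == "__main__":
    # 0.999 is an assumed availability, not a figure from the text.
    for cause, hours in downtime_hours_by_cause(0.999).items():
        print(f"{cause:<24}{hours:6.2f} h/year")
```

Under the assumed 99.9% availability, the system is down for about 8.76 hours a year, of which software failures would contribute roughly 3.5 hours if the shares in Table 8-2 apply.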

Human error is also a major cause of downtime. Although education is important, it is equally important, perhaps even more so, to design easy-to-use system management facilities.

An end-to-end WebSphere high availability system that eliminates single points of failure (SPOF) in every part of the system can minimize both planned and unplanned downtime. We will describe the implementation of such a WebSphere high availability system.
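
The effect of removing a single point of failure can be estimated with the standard availability formulas for components in series and in parallel. The sketch below is an illustration only: it assumes that components fail independently, the three tiers and their 99% availability are hypothetical values, and the function names series and redundant are our own.

```python
from math import prod


def series(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant(availability, copies):
    """Availability of `copies` identical components when one is enough.

    Assumes the copies fail independently of each other.
    """
    return 1.0 - (1.0 - availability) ** copies


if __name__ == "__main__":
    # Hypothetical tiers (for example Web server, application server,
    # database), each assumed to be 99% available on its own.
    tiers = [0.99, 0.99, 0.99]

    with_spof = series(tiers)
    without_spof = series(redundant(a, copies=2) for a in tiers)

    print(f"Each tier a SPOF:    {with_spof:.4%}")
    print(f"Each tier redundant: {without_spof:.4%}")
```

With these assumed values, the chain of single components is available about 97.03% of the time, while duplicating every tier raises the figure to about 99.97%. Real systems rarely meet the independence assumption completely, so the result should be read as an upper bound.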

Many environmental problems are data center related. A standby system at the same site does not suffice, because the entire site may be affected. Geographic clustering and data replication can minimize downtime caused by such environmental problems.

WebSphere is a trademark of the IBM Corporation in the United States, other countries, or both.

 

IBM is a trademark of the IBM Corporation in the United States, other countries, or both.