Possible single points of failure in the WebSphere system

Table 8-3 lists potential single points of failure in the WebSphere system and possible solutions.

Table 8-3
Failure point | Possible solutions | Note
Entity EJBs, Application DB | HA DBs, parallel DBs | Catch StaleConnectionException and retry
Log files | HA DBs, parallel DBs, clustering, HA file system, disk mirroring, RAID-5 |
LDAP | HA LDAP, master-replica, clustering, sprayer | Manual recovery
WebSphere Deployment Manager | Multiple WebSphere domains (cells), OS-service, hardware-based clustering |
WebSphere Master Repository Data | HA shared file system, networked file system, hardware-based clustering |
WebSphere Node Agent | Multiple Node Agents with WebSphere built-in HA LSD, OS-service, hardware-based clustering |
WebSphere Application Server (WAS) | EJB WLM, servlet clustering, hardware-based clustering | Use different appservers/nodes for EJBs and servlets
Web server | Multiple Web servers with network sprayer, hardware-based clustering |
Load balancer | HA network sprayers |
Firewall | Firewall clustering, HA firewall, firewall sprayer |
Internal network | Dual internal networks |
Hubs | Multiple interconnected network paths |
NIC failures | Multiple NICs in a host |
Disk failures | Disk mirroring, RAID-5 |
Disk bus failure | Multiple buses |
Disk controller failure | Multiple disk controllers |
Network service failures (DNS, ARP, DHCP, etc.) | Redundant network services (multiple DNS servers, etc.) |
OS or other software crashes | Clustering with automatic switchover to a healthy node |
Host dies | Clustering with automatic switchover to a healthy node |
Power outages | Dual power systems |
Room disasters (fire, flood, etc.) | Put redundant systems in different rooms |
Floor disasters (fire, flood, etc.) | Put redundant systems on different floors |
Building disasters (fire, flood, tornado, etc.) | Put redundant systems in different buildings |
City disasters (earthquake, flood, etc.) | Remote mirroring, replication, geographical clustering |
Region disasters | Place two data centers far apart, with geographical clustering or remote mirroring |
Human error | Train people, simplify system management, use clustering and redundant hardware/software, disable DDL operations |
Software upgrades | Rolling upgrades with clustering and/or WLM for 7x24x365 systems, planned maintenance for others | For 7x24x365 operation, two WAS domains are suggested; a single domain with WLM works but is not recommended.
Hardware upgrades | Rolling upgrades with clustering and/or WLM for 7x24x365 systems, planned maintenance for others |

Possible SPOF in the WebSphere system
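
The note for the "Entity EJBs, Application DB" row suggests catching StaleConnectionException and retrying. The following is a minimal sketch of that pattern, not a definitive implementation. It assumes the whole unit of work (obtain a fresh connection, run the statements, commit) can safely be repeated. The nested StaleConnectionException class is a hypothetical local stand-in so that the example compiles without WebSphere on the classpath; in a WebSphere application you would catch com.ibm.websphere.ce.cm.StaleConnectionException instead. The helper name withRetry, the attempt limit, and the backoff value are illustrative choices, not values taken from this document.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class StaleRetry {

    /** Hypothetical stand-in for WebSphere's StaleConnectionException. */
    static class StaleConnectionException extends SQLException {
        StaleConnectionException(String message) {
            super(message);
        }
    }

    /**
     * Runs the unit of work and repeats it only when the connection was
     * reported stale. Any other exception is propagated immediately.
     */
    static <T> T withRetry(Callable<T> work, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        StaleConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // The work should obtain a fresh connection on every call.
                return work.call();
            } catch (StaleConnectionException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Simple linear backoff to give a database failover time to finish.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts failed with a stale connection: surface the last error.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated unit of work: stale on the first two calls, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new StaleConnectionException("simulated database failover");
            }
            return "committed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retrying the whole unit of work, rather than a single statement, is a deliberate choice in this sketch: after a database failover the old connection and any transaction opened on it are gone, so the work has to start again on a new connection.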

WebSphere is a trademark of the IBM Corporation in the United States, other countries, or both.
IBM is a trademark of the IBM Corporation in the United States, other countries, or both.