15.8.1 Process availability and data availability

Process availability is achieved by running multiple processes of an application, such as WebSphere workload management using server clusters, multiple Web servers, multiple database instance processes, multiple firewall processes, and multiple LDAP server processes. Usually, clients find an available process using 1-to-n mapping, redirection (IP spraying), or transparent IP takeover.
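
To make the 1-to-n mapping idea concrete, the following is a minimal Java sketch, not WebSphere's actual workload management code. The class name `ProcessSelector`, the endpoint names, and the health-check predicate are all hypothetical; the sketch only assumes that a client holds a list of n redundant processes and has some way to test whether each one is alive.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration of 1-to-n mapping: one logical service maps to
// n redundant processes, and the client picks one that is currently alive.
final class ProcessSelector {
    private final List<String> endpoints;     // the n redundant processes
    private final Predicate<String> isAlive;  // assumed health check
    private int next = 0;                     // round-robin cursor

    ProcessSelector(List<String> endpoints, Predicate<String> isAlive) {
        this.endpoints = List.copyOf(endpoints);
        this.isAlive = isAlive;
    }

    // Returns the next live endpoint in round-robin order,
    // or an empty Optional if every process is down.
    synchronized Optional<String> select() {
        for (int i = 0; i < endpoints.size(); i++) {
            String candidate = endpoints.get((next + i) % endpoints.size());
            if (isAlive.test(candidate)) {
                next = (next + i + 1) % endpoints.size();
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

A caller would construct the selector once and call `select()` for each request. Failed processes are skipped, so the client keeps working as long as at least one of the n processes remains available.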

Data availability is achieved by replication or clustering. When data is shared by multiple processes, data integrity should be ensured by distributed lock management. Data is stored either in memory or on disk. For in-memory or local data, we need to maintain client request affinity so that each client's requests reach the process that holds its data. It is very important to make sure that data inconsistencies are corrected before any process uses the data, because a failed process can damage data integrity.
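
Client request affinity can be illustrated with a small Java sketch. This is an assumption-laden example rather than the mechanism WebSphere uses: the class `AffinityRouter` and the server names are hypothetical, and the routing rule (hashing a session identifier onto a fixed server list) is just one simple way to send the same client to the same process every time.

```java
import java.util.List;

// Hypothetical affinity router: requests that carry the same session ID
// are always sent to the same server, so in-memory data stays reachable.
final class AffinityRouter {
    private final List<String> servers;

    AffinityRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    // Maps a session ID to one server. floorMod keeps the index
    // non-negative even when hashCode() is negative.
    String route(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

Note that affinity alone does not make the data highly available. If the chosen server fails, its in-memory data is lost unless it has also been replicated or persisted, which is why affinity is combined with the replication or clustering described above.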

Depending on data change and access frequencies, there are different approaches to achieving data high availability:

- Type I. Static data

There are no changes for a period of months. An example is software installation binaries. This static data is usually placed on individual hosts. For convenience of management, it can also be placed on shared disks or file systems.

- Type II. Rarely changing data with planned change time (change period: several hours to several days)

Examples are firewall configuration files, Web server configuration files, WebSphere configuration files, or HTTP static files. You can copy these files to different nodes (replication). However, an HA file system can help to minimize your administration burden. For example, if you have 10 Web servers and change your Web pages once a day, you would need to copy the HTTP files to all 10 servers every day; content management software can reduce the administrative work involved in managing this data.

- Type III. Rarely changing data with unplanned change time

An example is LDAP data. Clustering or replication can be used for high availability.

- Type IV. Active data with frequent accesses and frequent changes

Examples are Entity EJB data, session data, and application data.
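
The replication approach mentioned for Type II data (copying the same files to every node) can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `FileReplicator` is hypothetical, and the node directories stand in for locations on each server that are reachable as file system paths, for example through mounted shares. Real deployments often rely on an HA file system or content management software instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper that replicates one rarely changing file
// (for example, a static HTTP page) to a directory on each node.
final class FileReplicator {

    // Copies the source file into every node directory, replacing any
    // older copy. Returns the number of replicas written.
    static int replicate(Path source, List<Path> nodeDirs) throws IOException {
        int written = 0;
        for (Path dir : nodeDirs) {
            Files.createDirectories(dir);  // make sure the target directory exists
            Files.copy(source, dir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
            written++;
        }
        return written;
    }
}
```

The loop stops at the first failure, so a caller would need to decide how to handle nodes that were not updated. That bookkeeping is part of the administrative burden that grows with the number of nodes.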

Data high availability is more difficult to implement than process high availability. Most importantly, both data high availability and process high availability are needed for an application to do its work. Data high availability is essential in most applications. It does not make any sense for a business to run a process if that process cannot access the data it requires. For example, there is little value in running a stock brokerage system if stock data and trading session data are unavailable. It does not help to have the WebSphere EJB container process available if Entity EJBs cannot access their data state, or to have the WebSphere Web container process available if servlets cannot access the HTTP session state they need.

We have discussed aspects and techniques for building an end-to-end highly available WebSphere production system. Through these techniques, we can achieve both data high availability and process high availability for the WebSphere system.
