Using Clusters

      

Whole Server Migration

The following sections describe the different migration mechanisms supported by WebLogic Server:

These sections focus on whole server-level migration, where a migratable server instance, and all of its services, is migrated to a different physical machine upon failure. WebLogic Server also supports service-level migration, as well as replication and failover at the application level. For more information, see Service Migration and Failover and Replication in a Cluster.

 


Understanding Server and Service Migration

Whole server migration is not supported on all platforms. See “Support for Server Migration” in Supported Configurations for WebLogic Platform 10.3.

In a WebLogic Server cluster, most services are deployed homogeneously on all server instances in the cluster, enabling transparent failover from one server to another. In contrast, “pinned services” such as JMS and the JTA transaction recovery system are targeted at individual server instances within a cluster—for these services, WebLogic Server supports failure recovery with migration, as opposed to failover.

Migration in WebLogic Server is the process of moving a clustered WebLogic Server instance or a component running on a clustered instance elsewhere in the event of failure. In the case of whole server migration, the server instance is migrated to a different physical machine upon failure. In the case of service-level migration, the services are moved to a different server instance within the cluster. See Service Migration.

WebLogic Server provides a feature for making JMS and the JTA transaction system highly available: migratable servers. Migratable servers provide for both automatic and manual migration at the server level, rather than the service level.

When a migratable server becomes unavailable for any reason (for example, if it hangs, loses network connectivity, or its host machine fails), migration is automatic. Upon failure, a migratable server is automatically restarted on the same machine if possible. If the migratable server cannot be restarted on the machine where it failed, it is migrated to another machine. In addition, an administrator can manually initiate migration of a server instance.

 


Migration Terminology

The following terms apply to server and service migration:

 


Leasing

Leasing is the process WebLogic Server uses to manage services that are required to run on only one member of a cluster at a time. Leasing ensures exclusive ownership of a cluster-wide entity. Within a cluster, there is a single owner of a lease. Additionally, leases can fail over in case of server or cluster failure. This helps to avoid having a single point of failure.
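
The idea of exclusive, expiring ownership can be illustrated with a small model. The following Python sketch is not WebLogic Server code and does not reflect its internal implementation. The class name `LeaseTable`, the lease name `jms-service`, the member names, and the 30-second duration are all hypothetical, chosen only to show how a single owner holds a lease until it stops renewing it.

```python
from typing import Dict, Optional, Tuple


class LeaseTable:
    """Toy in-memory lease table: one owner per lease name, with expiry."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        # lease name -> (owner, expiry timestamp)
        self._rows: Dict[str, Tuple[str, float]] = {}

    def owner(self, name: str, now: float) -> Optional[str]:
        """Return the current owner, or None if unowned or expired."""
        row = self._rows.get(name)
        if row is None or row[1] <= now:
            return None
        return row[0]

    def acquire(self, name: str, member: str, now: float) -> bool:
        """Grant the lease only if no other member currently holds it."""
        current = self.owner(name, now)
        if current is not None and current != member:
            return False
        self._rows[name] = (member, now + self.duration)
        return True

    def renew(self, name: str, member: str, now: float) -> bool:
        """Extend the lease; fails if the member no longer owns it."""
        if self.owner(name, now) != member:
            return False
        self._rows[name] = (member, now + self.duration)
        return True


# Hypothetical lease and member names, with explicit timestamps in seconds.
table = LeaseTable(duration=30.0)
first = table.acquire("jms-service", "server-1", now=0.0)      # granted
second = table.acquire("jms-service", "server-2", now=10.0)    # refused: server-1 holds it
renewed = table.renew("jms-service", "server-1", now=20.0)     # expiry moves to 50.0
takeover = table.acquire("jms-service", "server-2", now=60.0)  # granted: lease expired
```

In this model, a member that stops renewing simply loses the lease once it expires, at which point another member can take it over.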

 

Features That Use Leasing

The following WebLogic Server features use leasing:

Beyond basic configuration, most leasing functionality is handled internally by WebLogic Server.

 

Leasing Versions

WebLogic Server provides two separate implementations of the leasing functionality. Which one you use depends on your requirements and your environment.

Within a WebLogic Server installation, you can only use one type of leasing. Although it is possible to implement multiple features that use leasing within your environment, each must use the same kind of leasing.

When switching from one leasing type to another, restart the entire cluster, not just the Administration Server. Changing the leasing type cannot be done dynamically.

 

Determining Which Type of Leasing To Use

The following considerations will help you determine which type of leasing is appropriate to your WebLogic Server environment:

 

High-availability Database Leasing

In this version of leasing, lease information is maintained within a table in a high-availability database. A high-availability database is required to ensure that leasing information is always available. Each member of the cluster must be able to connect to the database in order to access leasing information.

This method of leasing is useful for customers who already have a high-availability database within their clustered environment. This method allows you to utilize leasing functionality without being required to use Node Manager to manage servers within your environment.

The following procedures outline the steps required to configure your database for leasing.

  1. Configure the database for server migration. The database stores leasing information that is used to determine whether or not a server is running or needs to be migrated. For more information on leasing, see Leasing.

    Your database must be reliable. The server instances will only be as reliable as the database is. For experimental purposes, a normal database will suffice. For a production environment, only high-availability databases are recommended. If the database goes down, all the migratable servers will shut themselves down.

    Create the leasing table in the database. This is used to store the machine-server associations used to enable server migration. The schema for this table is located in:

    WL_HOME/server/db/dbname/leasing.ddl
    

    where dbname is the name of the database vendor.

    The leasing table should be stored in a highly available database. Migratable servers are only as reliable as the database used to store the leasing table.

  2. Set up and configure a data source. This data source should point to the database configured in the previous step.

    XA data sources are not supported for server migration.

    For more information on creating a JDBC data source, see “Configuring JDBC Data Sources” in Configure WebLogic JDBC.

 

Non-database Consensus Leasing

In the non-database (consensus) version of leasing, WebLogic Server maintains leasing information in memory. This removes the requirement of having a high-availability database to use features that require leasing.

One member of a cluster is chosen as the cluster leader and is responsible for maintaining the leasing information. The cluster leader is chosen based on the length of time that has passed since startup: the managed server that has been running the longest within a cluster is chosen as the cluster leader. Other cluster members communicate with this server to determine leasing information. However, the leasing table is replicated to other nodes of the cluster to provide failover.
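
The selection rule described above can be sketched in a few lines of Python. This is an illustration only, not the algorithm WebLogic Server actually runs. The function name `choose_cluster_leader`, the server names, and the tie-breaking by name are assumptions made for the example.

```python
from typing import Dict, Optional


def choose_cluster_leader(start_times: Dict[str, float]) -> Optional[str]:
    """Return the running member with the earliest start time (longest uptime)."""
    if not start_times:
        return None
    # Breaking ties by name is an assumption, used only to keep the result deterministic.
    return min(start_times, key=lambda name: (start_times[name], name))


# Hypothetical startup timestamps (seconds) for three running managed servers.
running = {"managed-2": 130.0, "managed-1": 100.0, "managed-3": 160.0}
leader = choose_cluster_leader(running)

# If the leader leaves the cluster, applying the same rule to the
# remaining members selects the next longest-running server.
del running[leader]
next_leader = choose_cluster_leader(running)
```

Under these assumed timestamps, `managed-1` is selected first, and `managed-2` is selected once `managed-1` is removed.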

This version of leasing requires that you use Node Manager to control servers within the cluster. Node Manager should also be running on every machine hosting managed servers within the cluster. For more information, see “Using Node Manager to Control Servers” in Node Manager Administrator's Guide.

 


Automatic Whole Server Migration

This section outlines the procedures for configuring automatic whole server migration and provides a general discussion of how whole server migration functions within a WebLogic Server environment.

The following topics are covered:

 

Preparing for Automatic Whole Server Migration

Before configuring automatic whole server migration, be aware of the following requirements:

 

Configuring Automatic Whole Server Migration

Before configuring server migration, ensure that your environment meets the requirements outlined in Preparing for Automatic Whole Server Migration.

To configure server migration for a Managed Server within a cluster, perform the following tasks:

  1. Obtain floating IP addresses for each Managed Server that will have migration enabled.

    Each migratable server must be assigned a floating IP address which follows the server from one physical machine to another after migration. Any server that is assigned a floating IP address must also have AutoMigrationEnabled set to true.

    The migratable IP address should not be present on the interface of any of the candidate machines before the migratable server is started.

  2. Configure Node Manager. Node Manager must be running and configured to allow server migration.

    The Java version of Node Manager can be used for server migration on Windows or UNIX. The SSH version of Node Manager can be used for server migration on UNIX only.

    When using the Java Node Manager, edit nodemanager.properties at WL_HOME/common/nodemanager/ to add your environment's Interface and NetMask values. For information about nodemanager.properties, see “Reviewing nodemanager.properties” in Node Manager Administrator's Guide.

    If you are using the SSH version of Node Manager, edit wlscontrol.sh and set the Interface variable to the name of your network interface.

    For general information on using Node Manager in server migration, see Node Manager's Role in Whole Server Migration. For general information on configuring Node Manager, see “General Node Manager Configuration” in Node Manager Administrator's Guide.

  3. If you are using a database to manage leasing information, configure the database for server migration according to the procedures outlined in High-availability Database Leasing. For general information on leasing, see Leasing.

  4. If you are using database leasing within a test environment and you need to reset the leasing table, you should re-run the leasing.ddl script. This causes the correct tables to be dropped and re-created.

  5. If you are using a database to store leasing information, set up and configure a data source according to the procedures outlined in High-availability Database Leasing.

    You should set DataSourceForAutomaticMigration to this data source in each cluster configuration.

    XA data sources are not supported for server migration.

    For more information on creating a JDBC data source, see “Configuring JDBC Data Sources” in Configure WebLogic JDBC.

  6. Grant superuser privileges to the wlsifconfig.sh script (on UNIX) or the wlsifconfig.cmd script (on Windows).

    This script is used to transfer IP addresses from one machine to another during migration. It must be able to run ifconfig, which is generally only available to superusers. You can edit the script so that it is invoked using sudo.

    On Windows, the Java Node Manager uses the wlsifconfig.cmd script, which uses the netsh utility.

    The wlsifconfig scripts are available in the WL_HOME/common/bin directory.

  7. Ensure that the following files can be found through your machines' PATH:

    • wlsifconfig.sh (UNIX) or wlsifconfig.cmd (Windows)

    • wlscontrol.sh (UNIX)

    • nodemanager.domains

      The wlsifconfig.sh, wlsifconfig.cmd, and wlscontrol.sh files are located in WL_HOME/common/bin. The nodemanager.domains file is located in WL_HOME/common/nodemanager.

      Depending on your default shell on UNIX, you may need to edit the first line of the .sh scripts.

  8. This step applies only to UNIX. If you are using Windows, skip to step 9.

    The machines that host migratable servers must trust each other. For server migration to occur, it must be possible to get to a shell prompt using 'ssh/rsh machine_A' from machine_B and vice versa without having to explicitly enter a username/password. Also, each machine must be able to connect to itself using SSH in the same way.

    You should ensure that your login scripts (.cshrc, .profile, .login, etc.) only echo messages from your shell profile if the shell is interactive. WebLogic Server uses an ssh command to login and echo the contents of the server.state file. Only the first line of this output is used to determine the server state.

  9. Set the candidate machines for server migration. Each server can have a different set of candidate machines, or they can all have the same set.

  10. Restart the Administration Server.
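
Step 8 notes that only the first line of the ssh output is used to determine the server state, which is why login scripts must stay silent in non-interactive shells. The following Python sketch shows the effect. It is not WebLogic Server code. The function name `first_line_state`, the sample state value `RUNNING`, and the welcome banner are assumptions used for illustration.

```python
def first_line_state(ssh_output: str) -> str:
    """Return the first line of the output, the only line that is inspected."""
    lines = ssh_output.splitlines()
    return lines[0].strip() if lines else ""


# Quiet login scripts: the first line is the content of the state file.
clean = first_line_state("RUNNING\n")

# A login script that echoes a banner pushes the real state to line two,
# so the banner text is what gets read as the state.
polluted = first_line_state("Welcome to machine_A!\nRUNNING\n")
```

In the second case the banner is read in place of the server state, so a server that is running could be misjudged.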

 

Using High Availability Storage for State Data

The server migration process migrates services, but not the state information associated with work in process at the time of failure.

To ensure high availability, it is critical that such state information remains available to the server instance and the services it hosts after migration. Otherwise, data about the work in process at the time of failure may be lost. State information maintained by a migratable server, such as the data contained in transaction logs, should be stored in a shared storage system that is accessible to any potential machine to which a failed migratable server might be migrated. For highest reliability, use a shared storage solution that is itself highly available—for example, a storage area network (SAN).

In addition, if you are using a database to store leasing information, the lease table, described in the following sections, which is used to track the health and liveness of migratable servers, should also be stored in a high-availability database. For more information, see Leasing.

 

Server Migration Processes and Communications

The sections that follow describe key processes in a cluster that contains migratable servers:

Startup Process in a Cluster with Migratable Servers

Figure 7-1 illustrates the processing and communications that occur during startup of a cluster that contains migratable servers.

The example cluster contains two Managed Servers, both of which are migratable. The Administration Server and the two Managed Servers each run on different machines. A fourth machine is available as a backup in the event that one of the migratable servers fails. Node Manager is running on the backup machine and on each machine with a running migratable server.

Figure 7-1 Startup of Cluster with Migratable Servers

These are the key steps that occur during startup of the cluster illustrated in Figure 7-1:

  1. The administrator starts up the cluster.

  2. The Administration Server invokes Node Manager on Machines B and C to start Managed Servers 1 and 2, respectively. See Administration Server's Role in Whole Server Migration.

  3. The Node Manager on each machine starts up the Managed Server that runs there. See Node Manager's Role in Whole Server Migration.

  4. Managed Servers 1 and 2 contact the Administration Server for their configuration. See Migratable Server Behavior in a Cluster.

  5. Managed Servers 1 and 2 cache the configuration with which they started up.

  6. Managed Servers 1 and 2 each obtain a migratable server lease in the lease table. Because Managed Server 1 starts up first, it also obtains a cluster master lease. See Cluster Master's Role in Whole Server Migration.

  7. Managed Servers 1 and 2 periodically renew their leases in the lease table, proving their health and liveness.
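
The lease outcome of step 6 can be modeled briefly. In this Python sketch, each server records its own lease, and only the first server to start obtains the cluster master lease. This is an illustration under assumptions, not the product's implementation. The function name `start_cluster` and the lease keys `server:<name>` and `cluster-master` are hypothetical.

```python
from typing import Dict, List


def start_cluster(start_order: List[str]) -> Dict[str, str]:
    """Return a mapping of lease name to owner after all servers have started."""
    leases: Dict[str, str] = {}
    for server in start_order:
        # Each migratable server obtains its own lease.
        leases[f"server:{server}"] = server
        # The cluster master lease goes to the first server only;
        # setdefault leaves an existing owner in place.
        leases.setdefault("cluster-master", server)
    return leases


leases = start_cluster(["ManagedServer1", "ManagedServer2"])
master = leases["cluster-master"]
```

With the start order shown, `ManagedServer1` holds both its own lease and the cluster master lease, matching the scenario in Figure 7-1.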

Automatic Whole Server Migration Process

Figure 7-2 illustrates the automatic migration process after the failure of the machine hosting Managed Server 2.

Figure 7-2 Automatic Migration of a Failed Server

  1. Machine C, which hosts Managed Server 2, fails.

  2. Upon its next periodic review of the lease table, the cluster master detects that Managed Server 2's lease has expired. See Cluster Master's Role in Whole Server Migration.

  3. The cluster master tries to contact Node Manager on Machine C to restart Managed Server 2, but fails, because Machine C is unreachable.

    If Managed Server 2's lease had expired because it was hung, and Machine C was reachable, the cluster master would use Node Manager to restart Managed Server 2 on Machine C.

  4. The cluster master contacts Node Manager on Machine D, which is configured as an available host for migratable servers in the cluster.

  5. Node Manager on Machine D starts Managed Server 2. See Node Manager's Role in Whole Server Migration.

  6. Managed Server 2 starts up and contacts the Administration Server to obtain its configuration.

  7. Managed Server 2 caches the configuration it started up with.

  8. Managed Server 2 obtains a migratable server lease.
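
The decision made by the cluster master in steps 2 through 4 can be sketched as follows. This Python model is an illustration, not WebLogic Server code. The function names `find_expired` and `plan_recovery`, the timestamps, and the rule of choosing the first reachable candidate in list order are assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple


def find_expired(leases: Dict[str, float], now: float) -> List[str]:
    """Return the servers whose lease expiry time has passed."""
    return sorted(name for name, expiry in leases.items() if expiry <= now)


def plan_recovery(
    current_machine: str,
    candidates: List[str],
    reachable: Set[str],
) -> Tuple[str, Optional[str]]:
    """Prefer a restart in place; otherwise pick a reachable candidate machine."""
    if current_machine in reachable:
        return ("restart", current_machine)
    # Taking the first reachable candidate in list order is an assumption.
    for machine in candidates:
        if machine != current_machine and machine in reachable:
            return ("migrate", machine)
    return ("unavailable", None)


# Lease expiry times in seconds; Managed Server 2 has stopped renewing.
leases = {"ManagedServer1": 120.0, "ManagedServer2": 90.0}
expired = find_expired(leases, now=100.0)

# Machine C is down, so the server moves to Machine D.
machine_failed = plan_recovery(
    "MachineC", ["MachineC", "MachineD"], {"MachineB", "MachineD"}
)

# Machine C is still reachable (the server merely hung), so it restarts in place.
server_hung = plan_recovery(
    "MachineC", ["MachineC", "MachineD"], {"MachineB", "MachineC", "MachineD"}
)
```

The two calls correspond to the two cases in step 3: an unreachable host leads to migration, while a hung server on a reachable host is restarted where it is.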

During migration, the clients of the Managed Server that is migrating may experience a brief interruption in service; it may be necessary to reconnect. On Solaris and Linux operating systems, this can be done using the ifconfig command. The clients of a migrated server do not need to know the particular machine to which it has migrated.

When a machine that previously hosted a server instance that was migrated becomes available again, the reversal of the migration process—migrating the server instance back to its original host machine—is known as failback. WebLogic Server does not automate the process of failback. An administrator can accomplish failback by manually restoring the server instance to its original host.

The general procedures for restoring a server to its original host are as follows:

The exact procedures you will follow depend on your server and network environment.

Manual Whole Server Migration Process

Figure 7-3 illustrates what happens when an administrator manually migrates a migratable server.

Figure 7-3 Manual Whole Server Migration

  1. An administrator uses the Administration Console to initiate the migration of Managed Server 2 from Machine C to Machine B.

  2. The Administration Server contacts Node Manager on Machine C. See Administration Server's Role in Whole Server Migration.

  3. Node Manager on Machine C stops Managed Server 2.

  4. Managed Server 2 removes its row from the lease table.

  5. The Administration Server invokes Node Manager on Machine B.

  6. Node Manager on Machine B starts Managed Server 2.

  7. Managed Server 2 obtains its configuration from the Administration Server.

  8. Managed Server 2 caches the configuration it started up with.

  9. Managed Server 2 adds a row to the lease table.

Administration Server's Role in Whole Server Migration

In a cluster that contains migratable servers, the Administration Server:

In addition, the Administration Server provides its regular domain management functionality, persisting configuration updates issued by an administrator, and providing a run-time view of the domain, including the migratable servers it contains.

Migratable Server Behavior in a Cluster

A migratable server is a clustered Managed Server that has been configured as migratable. These are the key behaviors of a migratable server:

Node Manager's Role in Whole Server Migration

The use of Node Manager is required for server migration. It must run on each machine that hosts, or is intended to host, migratable servers.

Node Manager supports server migration in these ways:

Cluster Master's Role in Whole Server Migration

In a cluster that contains migratable servers, one server instance acts as the cluster master. Its role is to orchestrate the server migration process. Any server instance in the cluster can serve as the cluster master. When you start a cluster that contains migratable servers, the first server to join the cluster becomes the cluster master and starts up the cluster manager service. If a cluster does not include at least one migratable server, it does not require a cluster master, and the cluster master service does not start up. In the absence of a cluster master, migratable servers can continue to operate, but server migration is not possible.

These are the key functions of the cluster master: