Replicating SIP sessions
We can set up a replication domain for SIP sessions if we want session and dialog state information to be replicated during typical Session Initiation Protocol (SIP) processing. The SIP container uses the Data Replication Service (DRS) to replicate all state information. Because DRS does not provide a way to confirm when data replication has completed, the only thing that can be quantified is when a piece of state information is queued for replication. Within this topic, a reference to replication of data means only that the data has been queued for replication.
The SIP container replicates several different types of information. This data falls into two general categories:
- Internal SIP container state information associated with the dialog.
- Application state information associated with the various session objects.
Each of these categories includes a number of different data types, described later in this topic. Each data object is treated independently. Therefore, a change to an application session object that causes a replication does not result in the replication of any internal state information.
Replication of internal state information
Internal state information can be defined as anything related to the state of a dialog being handled by the container. It includes things like the CSeq, the dialog state (initial, early, confirmed, terminated), the session expiration, and the local and remote parties. Internal state information is replicated only because a dialog exists. Therefore, no internal SIP-related data is replicated until the dialog with which it is associated is established. The types of SIP requests that can cause the creation of a dialog include:
- INVITE
- SUBSCRIBE
- REFER
Replication of internal state happens at well-defined, predictable points in the call flow. For example, a dialog is established at the container only when a 2xx response, or a 1xx response with a "To" tag, is received or sent in response to one of the request methods listed previously. Events that can trigger an internal state replication include:
- Creation of a new SIP dialog
- Expiration of a session due to a session timeout
- The sending of a final response to a UAC
- Creation of an encoded URI
- Handling of any message that results in a change of the internal dialog state
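The dialog-establishment rule described above can be sketched as a small predicate. This is a toy model for illustration only, not container code: the class name DialogRule and the method establishesDialog are hypothetical, and the check encodes only the rule as stated in this topic (a 2xx response, or a 1xx response with a "To" tag, to INVITE, SUBSCRIBE, or REFER).

```java
import java.util.Set;

/** Toy model of the dialog-establishment rule in this topic (hypothetical, not container code). */
final class DialogRule {
    // Request methods this topic lists as able to create a dialog.
    private static final Set<String> DIALOG_METHODS = Set.of("INVITE", "SUBSCRIBE", "REFER");

    /**
     * Returns true when a response would establish a dialog under the stated rule:
     * a 2xx response, or a 1xx response carrying a To tag, to a dialog-creating method.
     */
    static boolean establishesDialog(String method, int status, String toTag) {
        if (!DIALOG_METHODS.contains(method)) {
            return false;
        }
        boolean success = status >= 200 && status < 300;
        boolean provisionalWithTag = status >= 100 && status < 200
                && toTag != null && !toTag.isEmpty();
        return success || provisionalWithTag;
    }

    public static void main(String[] args) {
        System.out.println(establishesDialog("INVITE", 180, "a1b2"));  // true
        System.out.println(establishesDialog("INVITE", 180, null));    // false
        System.out.println(establishesDialog("OPTIONS", 200, "a1b2")); // false
    }
}
```

Under this model, no internal state would be queued for replication for the second and third calls, because no dialog exists yet.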
It is important to note that the WAS 6.1 version of the SIP container replicates only dialog state, not transaction state. Not replicating transaction state reduces the load on all the servers in the replication domain, but can cause problems when a failure happens in the middle of a transaction (for example, loss of some dialog-related SIP messages).
An important difference between a B2BUA and a proxy application is the number of session objects created and replicated. In both cases only a single application session is created, but for the B2BUA, two session objects are created: one for the inbound leg and one for the outbound leg. For a proxy application, only a single session object is created.
Replication of application state information
Application state information is treated differently from internal dialog state information because it does not rely on the existence of a dialog to be replicated. Application state refers to any data that is being maintained by the application through the use of JSR 116 data constructs. This includes:
- javax.servlet.sip.SipApplicationSession
- javax.servlet.sip.SipSession
- javax.servlet.sip.ServletTimer
- Any attribute set on the SipSession or the SipApplicationSession
Replication of internal state happens at well-defined, predictable points in the call flow, while replication of application state is less predictable because it is generally dependent on when an application creates, invalidates or modifies the session data and timers through JSR 116 APIs. This could be due to the processing of an inbound message, or to the expiration of a SIP timer. All of the following can cause the replication of application-related session data:
- Creation of an application session object
- Creation of a SIP session object
- Creation of a SIP session timer
- Modification of a session object through setAttribute or removeAttribute
- Invalidation of a SIP session object
- Expiration of a session timer
- Application code sending out a request (see footnote 1)
Replication can occur for requests that do not establish a dialog if an application calls request.getApplicationSession(true) and then creates a timer for, or sets an attribute on, the resulting application session object (for example, through TimerService.createTimer() or setAttribute()). This is needed so that a listener can be called when the timer expires.
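The way application-driven changes lead to replication can be sketched with a toy model. This is not the container's implementation and does not use the JSR 116 classes: the class AppStateReplicationModel and its event names are hypothetical stand-ins. It shows only that each create, modify, or invalidate operation adds one entry to a queue, which matches the definition of replication used in this topic (data queued for replication, with no confirmation of completion).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (hypothetical): each application-state change queues one replication event. */
final class AppStateReplicationModel {
    /** Events queued for replication; queuing is the only point that can be observed. */
    final List<String> queued = new ArrayList<>();
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean valid = true;

    AppStateReplicationModel() {
        queued.add("create"); // creation of a session object
    }

    void setAttribute(String name, Object value) {
        requireValid();
        attributes.put(name, value);
        queued.add("setAttribute:" + name);
    }

    void removeAttribute(String name) {
        requireValid();
        attributes.remove(name);
        queued.add("removeAttribute:" + name);
    }

    void invalidate() {
        requireValid();
        valid = false;
        attributes.clear();
        queued.add("invalidate");
    }

    private void requireValid() {
        if (!valid) {
            throw new IllegalStateException("session already invalidated");
        }
    }

    public static void main(String[] args) {
        AppStateReplicationModel session = new AppStateReplicationModel();
        session.setAttribute("caller", "alice");
        session.removeAttribute("caller");
        session.invalidate();
        // prints [create, setAttribute:caller, removeAttribute:caller, invalidate]
        System.out.println(session.queued);
    }
}
```

Because the timing of these calls is decided by application code, the contents of the queue are less predictable than the dialog-driven replication of internal state.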
SIP failover and replication setup considerations
A SIP cluster that requires replication and failover can consist of many replication domains, each of which contains a set of SIP containers. There is no limit set on the number of containers in a cluster. For performance reasons, each replication domain should contain only two containers.
The replication domain should be set to the Entire Domain, which means state is replicated to all containers in the replication domain. The replication mode should be Both client and server.
The distributed session for a container needs to be set to Memory-to-memory replication. Any applications that require session replication must include the <distributable /> tag in the web.xml and sip.xml files. Both SIP and HTTP use the same replication topology.
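As a quick sanity check before deployment, the presence of the <distributable /> tag in both descriptors can be verified programmatically. The following is a minimal sketch using the standard Java XML parser. The class name DistributableCheck is hypothetical, and the two descriptor strings are heavily trimmed placeholders rather than complete web.xml or sip.xml files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper: checks a deployment descriptor for a <distributable/> element. */
final class DistributableCheck {
    /** Returns true if the descriptor's root element has a direct <distributable> child. */
    static boolean isDistributable(String descriptorXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            descriptorXml.getBytes(StandardCharsets.UTF_8)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && "distributable".equals(child.getNodeName())) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Malformed XML is reported as a bad argument rather than as "not distributable".
            throw new IllegalArgumentException("descriptor could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed placeholder descriptors; real files carry many more elements.
        String webXml = "<web-app><display-name>demo</display-name><distributable/></web-app>";
        String sipXml = "<sip-app><display-name>demo</display-name><distributable/></sip-app>";
        System.out.println("web.xml distributable: " + isDistributable(webXml));
        System.out.println("sip.xml distributable: " + isDistributable(sipXml));
    }
}
```

An application would be a candidate for session replication only when the check passes for both files.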
(zos) Make sure that the system is configured such that:
- Your system includes at least 2 controllers, and each controller has at least 2 servants for failover and recovery purposes.
- The total workload, which includes normal operation calls, failover calls, and recovery calls, never exceeds the maximum call rate.
  To achieve the maximum number of calls per second, the following configuration settings are required:
  - The system is running z/OS Version 1 Release 10 with High Performance FICON for System z (zHPF) enabled for fast I/O. A DS8000 DASD must also be running on this system.
  - The log stream and the staging data set size definitions used when creating the log streams must be equal to or greater than 256 megabytes (LS_SIZE(64000) STG_SIZE(64000)).
  We can determine the maximum call rate by progressively increasing the call rate over a period of time and monitoring performance to find the highest rate at which timeout rates remain tolerable.
- The total amount of data propagated over all active calls does not exceed 2 GB.
- All nodes are at the Version 7.0.0.7 or later level before you enable SIP or Communications Enabled Applications (CEA) work on z/OS.
- Other services that require data persistence, such as transactions and compensations, are configured to use HFS file logging if you plan on using the system for SIP or CEA related work. After you configure your system for SIP or CEA related work, other services that require data persistence cannot use log streams.
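Two of the z/OS requirements above are simple arithmetic and can be checked with a short sketch. The class ZosSipSizing and its methods are hypothetical. The code assumes that LS_SIZE and STG_SIZE are expressed in 4 KB blocks, which is how the quoted value of 64000 lines up with the stated 256 megabytes. The call rates in the example are made-up numbers, not measurements.

```java
/** Hypothetical sizing helper for the z/OS requirements listed in this topic. */
final class ZosSipSizing {
    // Assumption: LS_SIZE and STG_SIZE count 4 KB (4096-byte) blocks.
    static final long BLOCK_BYTES = 4096;

    /** Converts a size given in 4 KB blocks to bytes. */
    static long blocksToBytes(long blocks) {
        return blocks * BLOCK_BYTES;
    }

    /** True when normal, failover, and recovery calls together stay within the maximum call rate. */
    static boolean withinMaxCallRate(double normal, double failover, double recovery,
                                     double maxCallRate) {
        return normal + failover + recovery <= maxCallRate;
    }

    public static void main(String[] args) {
        // 64000 blocks * 4096 bytes = 262144000 bytes, that is, 256000 KB.
        System.out.println(blocksToBytes(64000)); // prints 262144000

        // Made-up rates in calls per second, for illustration only.
        System.out.println(withinMaxCallRate(60, 20, 10, 100)); // prints true
        System.out.println(withinMaxCallRate(60, 30, 20, 100)); // prints false
    }
}
```

The second check reflects the requirement that capacity planning must leave headroom for failover and recovery calls, not just for normal operation.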
(zos) We must also delete and re-create the log stream for a server if any of the following situations occur:
- All of the controllers within a cluster fail at the same time. In this situation, some of the data needed to perform the recovery is lost because of the multiple concurrent failures.
- A failure occurs during the recovery process. In this situation, the log stream associated with the failed sessions contains incomplete data.
SIP session replication topology
- Each member replicates all state data to every peer in its replication domain.
- Each replication domain should ideally contain two servers.
- When a failure occurs, the core group coordinator tells the remaining core group members which replicated sessions to activate. These replicated sessions then become part of their active sessions.
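The topology rules above can be illustrated with a toy model of a replication domain. This is a sketch under simplifying assumptions, not the behavior of the core group coordinator: the class FailoverModel is hypothetical, the survivor is passed in explicitly rather than chosen by a coordinator, and sessions activated after a failure are not re-replicated.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical) of full replication and failover inside one replication domain. */
final class FailoverModel {
    /** Active session IDs owned by each member. */
    final Map<String, Set<String>> active = new HashMap<>();
    /** replicas.get(holder).get(owner) = session IDs the holder stores on behalf of the owner. */
    final Map<String, Map<String, Set<String>>> replicas = new HashMap<>();

    void addMember(String member) {
        active.put(member, new HashSet<>());
        replicas.put(member, new HashMap<>());
    }

    /** Creates a session on the owner and replicates it to every peer in the domain. */
    void createSession(String owner, String sessionId) {
        active.get(owner).add(sessionId);
        for (String peer : active.keySet()) {
            if (!peer.equals(owner)) {
                replicas.get(peer)
                        .computeIfAbsent(owner, k -> new HashSet<>())
                        .add(sessionId);
            }
        }
    }

    /** Simulates a failure: the survivor activates the replicas it holds for the failed member. */
    void fail(String failed, String survivor) {
        active.remove(failed);
        replicas.remove(failed);
        Set<String> toActivate = replicas.get(survivor).remove(failed);
        if (toActivate != null) {
            active.get(survivor).addAll(toActivate);
        }
    }

    public static void main(String[] args) {
        FailoverModel domain = new FailoverModel();
        domain.addMember("server1");
        domain.addMember("server2");
        domain.createSession("server1", "callA");
        domain.createSession("server2", "callB");
        domain.fail("server1", "server2");
        // server2 now owns both its own session and the one taken over from server1.
        System.out.println(domain.active.get("server2").contains("callA")); // true
        System.out.println(domain.active.get("server2").contains("callB")); // true
    }
}
```

With two members per domain, each server holds exactly one peer's replicas, which keeps the replication cost per session at a single copy.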
Set up a replication domain for SIP sessions.
- In the console, click Environment > Replication domains > New.
- Click Number of replicas, and then select Entire domain.
- In the Container settings section, click Session management.
- In the Additional Properties section, click Distributed environment settings, and then click Memory-to-memory replication.
- Set Replication mode to Both client and server.
- Save the changes.
Results
Memory-to-memory replication is enabled for SIP sessions.
Related concepts
SIP session affinity and failover
Related tasks
Browse all SIP topics
1 Causes replication of the SipSession and SipApplicationSession in order to synchronize the "last access timestamp" with the peer containers in the cluster. This preserves the integrity of future calls to SipSession.getLastAccessedTime() and SipApplicationSession.getLastAccessedTime().