Troubleshoot SIP applications
Use this topic to troubleshoot SIP applications.
Use the following SIP container basics when troubleshooting problems with SIP applications.
- The average CPU usage of the system should not exceed 60-70%.
- The container should use no more than 70% of the allocated VM heap size. Be sure the system has enough physical memory to accommodate the VM heap size. Call loads and session timeouts have a significant effect on heap usage.
- The maximum garbage collection (GC) time of the VM on which the container is running should not exceed 500 ms, and the average should be less than 400 ms. Verbose GC can be used to measure this, and the PMI viewer can be used to view GC times, heap usage, active sessions, and other metrics in graphical form (a rough in-process snapshot is sketched after this list).
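The PMI viewer and verbose GC are the supported ways to watch these figures. As a rough in-process complement, the following minimal Java sketch reads heap usage and average GC pause times through the standard JMX beans; the class name and the hard-coded 70% check are illustrative assumptions based on the guideline above, not part of the product.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapAndGcSnapshot {
    public static void main(String[] args) {
        // Current heap usage versus the configured maximum.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        if (max > 0) {
            double usedPercent = 100.0 * heap.getUsed() / max;
            System.out.printf("Heap: %d MB used of %d MB max (%.1f%%)%n",
                    heap.getUsed() >> 20, max >> 20, usedPercent);
            if (usedPercent > 70.0) {
                // Threshold taken from the guideline in this topic.
                System.out.println("Warning: heap usage exceeds the 70% guideline");
            }
        }

        // Average pause per collection for each garbage collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                System.out.printf("%s: %d collections, %.1f ms average%n",
                        gc.getName(), count, (double) gc.getCollectionTime() / count);
            }
        }
    }
}
```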
Tasks
Use the following troubleshooting checklist to troubleshoot problems with SIP applications:
- Check the listening ports in the configuration.
- Use netstat -an to see listening ports (a quick connection probe is sketched after this checklist).
- Check to see whether virtual hosts are defined.
- Check to see whether host aliases are defined.
- Is an application installed? Is it started?
- For a proxy configuration: Is a default cluster configured? If the proxy and the server are on the same machine, is there a port conflict?
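For the listening-port item, netstat -an shows what is bound. The following small Java probe is an optional way to confirm that a configured TCP listener actually accepts connections; the class name, host, and port 5060 are assumptions to substitute with values from your configuration, and a UDP-only listener will not answer this test.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class SipPortProbe {
    public static void main(String[] args) {
        // Host and port are assumptions; substitute the values from your configuration.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5060;

        try (Socket socket = new Socket()) {
            // Two-second connect timeout so a dead listener fails fast.
            socket.connect(new InetSocketAddress(host, port), 2000);
            System.out.println(host + ":" + port + " is accepting TCP connections");
        } catch (IOException e) {
            System.out.println(host + ":" + port + " is not reachable: " + e.getMessage());
        }
    }
}
```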
If the problem is not resolved, check for specific symptoms using the following SIP container symptoms and solutions:
- Symptom: Lots of retransmissions, CPU periodically drops to zero, or a ThreadPoolQueueIsFullException exception appears in the logs.
Solution: This is typically a DNS issue caused by reverse DNS lookups and can be confirmed with a network capture tool such as Ethereal. If the capture shows many DNS queries that contain an IP address and return a host name in the response, reverse lookups are likely the problem. Make sure that nscd is running if we are on HP or another platform that requires name service caching; the Microsoft Windows platform does not require it. Another solution is to add the host names to the /etc/hosts file. A rough reverse-lookup timing sketch appears after this list.
- Symptom: Lots of retransmissions, and the CPU periodically spikes to 100%.
Solution: This is typically due to garbage collection and can be verified by turning on verbose GC (accessible from the admin console) and looking at the length of the GC cycles. The solution is to enable generational garbage collection by setting the optional JVM arguments to -Xgcpolicy:gencon.
- Symptom: Lots of retransmissions, the CPU spikes to 100% for long periods of time, and generational garbage collection is enabled.
Solution: This is typically due to SIP session objects either not being invalidated or not timing out over a long period of time. One solution is to set the session timeout value in the sip.xml of the application to a smaller value. The most efficient way to handle this is for the application to call invalidate on the session when the dialog completes (that is, after receiving a BYE); see the servlet sketch after this list. The following entry in the SystemOut.log file indicates what the session timeout value is for each application installed on the container:
SipXMLParser 3 SipXMLParser getAppSessionTTL Setting Expiration time: 7 Minutes, For App: TCK back-to-back user agent
IBM recommends using the High Performance Extensible Logging (HPEL) log and trace infrastructure. We can view HPEL log and trace information by using the logViewer command.
- Symptom: Lots of 480 Temporarily Unavailable messages received from the container when sending new INVITE messages to the SIP container. You will also likely see the following message in the SystemOut.log file when the server is in this state: LoadManager E LoadManager warn.server.overloaded.
Solution: This is typically due to one of the configurable SIP container metrics being exceeded, such as the Maximum Application Sessions value or the Maximum messages per averaging period value. The solution is to increase these values.
- Symptom: Lots of resends, and calls are not completing, accompanied by OutOfMemory exceptions in the SystemErr.log.
Solution: This usually means that the VM heap size associated with your container is not large enough and should be increased. We can adjust this value from the admin console.
- Symptom: You receive a 503 Service Unavailable when sending a SIP request to a SIP proxy.
Solution: This usually means there is no default cluster (or no cluster routing rule that matches the message) set up at the proxy. This can also happen when the SIP proxy is configured correctly but the backend SIP containers are stopped or have crashed.
- Symptom: You receive a 404 Not Found when sending a SIP request to a SIP proxy.
Solution: This usually means there is no virtual host set up for the containers that reside in the default cluster. It could also mean that the servers in the proxy's default cluster do not contain a SIP application or that the message does not match one of the applications installed in the default cluster.
- Symptom: Out-of-memory errors are occurring.
Solution: This may be due to the maximum heap size being set too low. SIP applications can consume a significant amount of memory because the sessions exist for a long call hold time. A maximum heap size of 512 MB does not provide sufficient memory for a SIP traffic workload. Set the maximum heap size for SIP applications to the minimum recommended value of 768 MB or higher.
- Symptom: You receive a 403 Forbidden when sending a SIP request to a SIP container.
Solution: This usually means that no appropriate SIP application was found to handle the received SIP request (no match rule matched the message).
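For the reverse DNS symptom in the first item above, a packet capture is the definitive check. The following small Java sketch (class name and sample address are hypothetical) simply times a reverse lookup so you can see whether PTR queries are slow from the container's host.

```java
import java.net.InetAddress;

public class ReverseDnsCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical peer address; replace with the IP of a remote SIP endpoint.
        String ip = args.length > 0 ? args[0] : "192.0.2.10";
        InetAddress address = InetAddress.getByName(ip);

        long start = System.nanoTime();
        // getCanonicalHostName() forces the reverse (PTR) lookup being measured.
        String host = address.getCanonicalHostName();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("Reverse lookup of " + ip + " -> " + host
                + " took " + elapsedMs + " ms");
    }
}
```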
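For the session-invalidation guidance above, the following minimal SIP servlet sketch (the servlet name is hypothetical) shows the pattern of invalidating the application session after the dialog completes with a BYE, using the standard javax.servlet.sip API rather than waiting for the sip.xml session timeout.

```java
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.sip.SipServlet;
import javax.servlet.sip.SipServletRequest;

public class CallCleanupServlet extends SipServlet {

    @Override
    protected void doBye(SipServletRequest request)
            throws ServletException, IOException {
        // Answer the BYE so the far end stops retransmitting it.
        request.createResponse(200).send();

        // The dialog is complete; release the application session immediately
        // instead of waiting for the configured expiration to elapse.
        request.getApplicationSession().invalidate();
    }
}
```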
Subtopics
- Tracing a SIP container
We can trace a SIP container, starting either immediately or after the next server startup. This tracing writes a record of SIP events to a log file.
- Tracing a SIP proxy server
We can trace a SIP proxy server, starting either immediately or after the next server startup.