Configure emergency throttle
The on-demand router (ODR) and its associated autonomic managers support business goals during periods of intense request flow by making smart decisions about the work coming into the server. The autonomic request flow manager (ARFM) controls HTTP request prioritization in the ODR. Emergency conditions arise when certain sensors detect overload situations. These overload situations include extremely high node utilization, intermittent communication failures between the ARFM controller and the request scheduling gateways, and intermittent communication failures between AsyncPMI monitoring data producers and the gateways. To keep these conditions, and the accompanying degradation in performance, from persisting, the gateways are equipped with emergency throttle controllers that control and safeguard request dispatch rates to backend nodes. For IIOP/JMS requests, ARFM is handled in the back end.
The ARFM contains two parts: a controller and a gateway. For each node group, the ARFM function is implemented by a controller plus a collection of gateways in the ODRs. The ARFM controller (triggered by the eWLM controller, if available on the system) might issue typical throttling directives to the gateways. In typical mode, throttling directives come from the ARFM controller by way of RatesMessages and are immediately enforced at the gateway by the throttle controller.
A throttle is attached to each queue in the gateway and is not in the throttled state by default. When an emergency occurs or when rate messages arrive from the ARFM controller, the throttle receives directives from the throttle controller and changes to the throttled state.
If one or more overload sensors detect an overload condition despite typical throttling, the gateway throttle controller enters emergency mode. An emergency blackout sensor senses communication failures between an ARFM controller and request scheduling gateways, or communication failures between AsyncPMI monitoring data producers and the gateways. The term blackout means that the sensor does not receive expected messages. In emergency mode, the throttle controller gradually reduces the dispatch rates of the gateway queues until the overload sensors stop firing. Then it gradually restores the rates to their original, pre-emergency settings. While restoring the rates, the throttle controller ensures that rate directives from the ARFM controller are never exceeded, thus preserving the integrity of throttling decisions made by different controllers. Working together, these components can properly limit incoming requests.
Multiple sensors detect emergency conditions, and any of them can cause the throttle controller to go into emergency mode. Each sensor can be in one of two states: fired or unfired. During an emergency, there are two phases for the throttle controller: emergency_throttle and emergency_unthrottle. During the emergency_throttle phase, the throttle controller reduces all queue rates as long as at least one of the sensors still fires. In the emergency_unthrottle phase, all the sensors have returned to the unfired state, and the throttle controller gradually restores all queue rates to the values they had before entering emergency mode.
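The two phases described above can be pictured as a small state machine. The following Python sketch is illustrative only and is not the product's implementation: the class name, the `tick` method, and the `arfm_caps` argument are hypothetical, and it assumes that each step changes the current rate by the configured percentage, which the description above does not specify.

```python
NORMAL = "normal"
EMERGENCY_THROTTLE = "emergency_throttle"
EMERGENCY_UNTHROTTLE = "emergency_unthrottle"


class ThrottleControllerSketch:
    """Toy model of the two emergency phases (hypothetical, for illustration)."""

    def __init__(self, original_rates, step_percent=20.0, arfm_caps=None):
        # original_rates: queue name -> dispatch rate before the emergency
        # step_percent: mirrors EmergencyRateChangeStep (default 20)
        # arfm_caps: queue name -> rate directive from the ARFM controller
        self.original_rates = dict(original_rates)
        self.rates = dict(original_rates)
        self.step_percent = step_percent
        self.arfm_caps = dict(arfm_caps or {})
        self.phase = NORMAL

    def _ceiling(self, queue):
        # Restored rates never exceed the pre-emergency rate or the ARFM directive.
        original = self.original_rates[queue]
        return min(original, self.arfm_caps.get(queue, original))

    def tick(self, sensors_fired):
        """Advance one rate-change step; sensors_fired is a list of booleans."""
        factor = self.step_percent / 100.0
        if any(sensors_fired):
            # At least one sensor still fires: reduce every queue rate.
            # Assumption: the step is a percentage of the current rate.
            self.phase = EMERGENCY_THROTTLE
            for queue in self.rates:
                self.rates[queue] *= 1.0 - factor
        elif self.phase != NORMAL:
            # All sensors unfired: raise rates step by step up to the ceiling.
            self.phase = EMERGENCY_UNTHROTTLE
            for queue in self.rates:
                raised = self.rates[queue] * (1.0 + factor)
                self.rates[queue] = min(raised, self._ceiling(queue))
            if all(self.rates[q] >= self._ceiling(q) for q in self.rates):
                self.phase = NORMAL
        return self.phase
```

Under these assumptions, a queue that starts at a rate of 100 drops to about 64 after two throttling steps, then climbs back in steps once every sensor is unfired and stops at 100 or at the ARFM directive, whichever is lower.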
Emergency throttling is disabled by default. Enable emergency throttling only if IBM Support instructs you to do so. ARFM4998W messages might still be displayed in the log if an emergency condition is detected; however, while emergency throttling is disabled, this condition does not throttle traffic. We can enable emergency throttling by logging on to the ODR host and editing...
WAS_HOME/profiles/node/properties/arfm.cfg
...and adding the following entry...
EnableEmergencyThrottling=true
Enforcement of rate directives from the ARFM controller (initiated by eWLM) is enabled by default. We can disable it by adding the following entry to the arfm.cfg file.
EnableExternalThrottling=false
See the following list for other configuration parameters that we can add to the arfm.cfg file.
- EmergencyRateChangeStep=x where x is an integer in the range 0 - 100, specifying the percentage change in rate in each step of the gradual reduction/increase of throttle rate. The default is 20.
- EmergencyRateChangeInterval=x where x is the time, in milliseconds, between two successive rate change steps in emergency mode. The default is 15000.
- EmergencyBlackoutMultiplier=x where x is a multiplier applied to the normal message cycles that serve as input to the emergency blackout sensor. This parameter indirectly tells the sensor how long to wait before firing: the wait interval is the product of the multiplier and the normal anticipated interval between successive messages. The default is 2.
- EmergencyCPUUtilLimit=x where x is an integer in the range 0 - 100, specifying the processor utilization watermark on backend nodes that triggers emergency throttling. The default is 100.
- TokenBucketSizeMillis=x where x is the number of tokens that can be accumulated in the token bucket queue. The default is 1000.
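To check a set of entries before adding them to the arfm.cfg file, a small script can help. The following Python sketch is a hypothetical helper, not an IBM-supplied tool: the function names are made up, and it assumes the file holds one key=value entry per line with # marking comments. The defaults and the 0 - 100 ranges come from the list above; treating the blackout multiplier as a possibly fractional number is an assumption.

```python
# Defaults as documented in this topic.
DEFAULTS = {
    "EnableEmergencyThrottling": "false",
    "EnableExternalThrottling": "true",
    "EmergencyRateChangeStep": "20",
    "EmergencyRateChangeInterval": "15000",
    "EmergencyBlackoutMultiplier": "2",
    "EmergencyCPUUtilLimit": "100",
    "TokenBucketSizeMillis": "1000",
}
BOOL_KEYS = ("EnableEmergencyThrottling", "EnableExternalThrottling")
PERCENT_KEYS = ("EmergencyRateChangeStep", "EmergencyCPUUtilLimit")


def parse_arfm_cfg(text):
    """Parse key=value entries, fill in defaults, and check the 0 - 100 ranges."""
    raw = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value entry: {line!r}")
        raw[key.strip()] = value.strip()

    cfg = {}
    for key, value in raw.items():
        if key in BOOL_KEYS:
            cfg[key] = value.lower() == "true"
        elif key == "EmergencyBlackoutMultiplier":
            cfg[key] = float(value)  # assumption: fractional values are allowed
        elif key in DEFAULTS:
            cfg[key] = int(value)
        else:
            cfg[key] = value  # unknown entries are passed through unchanged
    for key in PERCENT_KEYS:
        if not 0 <= cfg[key] <= 100:
            raise ValueError(f"{key} must be in the range 0 - 100, got {cfg[key]}")
    return cfg


def blackout_wait_ms(multiplier, normal_interval_ms):
    """Wait before the blackout sensor fires: multiplier x normal message interval."""
    return multiplier * normal_interval_ms
```

For example, with the default multiplier of 2 and a hypothetical normal message interval of 5000 milliseconds, `blackout_wait_ms` gives a wait of 10000 milliseconds before the blackout sensor fires.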