Overload Protection Layer


1. Introduction

1.1. Project/Component Working Name:

Sailfin 2.0/Overload Protection Layer

1.2. Name(s) and e-mail address of Document Author(s)/Supplier:

        Robert.Handl@ericsson.com

Version  Date        Comments                                                               Author
0.1      2009-01-26  First version                                                          Robert Handl
0.2      2009-02-27  Added http-olp and admin changes                                       Ramesh Parthasarathy
0.3      2009-03-02  Modified dynamic configuration support section to reflect the details  Ramesh Parthasarathy


1.3. Date of This Document:

2009-01-26

2. Project Summary

2.1. Project Description:

The CPU-based Overload Protection mechanism available in SGCS protects the system from taking on too much load. Unfortunately, the protection mechanism is too sensitive and reacts to every small CPU spike. An algorithm is needed to smooth out the spikes and reduce the number of alarms generated.

Overload notification within SGCS is currently poor: only warnings are logged when overload is detected. A notification mechanism should be introduced so that clients can observe changes from normal load to overload and vice versa.

Http overload protection : Current behavior

1. Http overload protection is implemented on top of the Clb, so it is available only in a system where the converged load balancer is present and enabled. The constraint exists because the OLP has to be invoked before the Clb in the http request processing chain. Since the http-proxy in the Clb is implemented on top of the Grizzly connector, an HttpLayer interface was introduced to support request interception in the Http path. Also, the OLP is present in both the FE and the BE and has no way to determine whether it is running in an FE or a BE (because it sits before the Clb), so this information cannot be used to choose an action during overload conditions. The OLP is a standalone module: it is not aware of other instances in the cluster and cannot work collaboratively with them.

2. The OLP triggering algorithm is shared between the Http and SIP parts. When an overload situation occurs, a 503 (with a retry-after) is sent to the client. This was done because Http clients expect a response to every request they send and would otherwise be forced to time out. Unlike SIP, the response is sent using the same thread that is used for request processing.

2.2. Risks and Assumptions:

Http olp:
The system may become overloaded for different reasons, contributed by various modules in the system including but not limited to SIP and Http. The action taken by the Http-Olp is restricted to controlling the http (web container) behavior, so it is effective only if the contributing factor for the overload is http requests. This is the general understanding and will continue to hold.

3. Problem Summary

3.1. Problem Area:

The Ericsson Presence and Group Management application (PGM) has discovered that the current Overload Protection mechanism available in SGCS is not good enough for overload detection and reporting.

The main criticism is that the system is too sensitive to CPU measurements: if an overload is detected and an alarm is triggered, the alarm can cease and be raised again repeatedly within short periods while the CPU load is oscillating heavily. It is fine for traffic to be toggled quickly between being accepted and rejected, but the reporting should be less sensitive.

The SGCS and MMAS products also need better separation of functionality.

Currently SGCS detects overload for CPU and memory and rejects HTTP and SIP traffic when overloaded. The only reporting that exists today is that SGCS adds a WARNING statement to the log file. The MMAS Alarm handler duplicates the overload detection behaviour in its life cycle module in order to report alarms.

A future system should align the behaviour. SGCS should reject traffic and notify overload detection via a notification API stating what has caused the overload and the severity of the overload. A client (MMAS Alarm handler) could then register a notification listener to get overload notifications. The MMAS Alarm handler could e.g. filter incoming notifications and report MMAS alarms.

Http olp:

P1 (must have) : The behavior described in (2) under Description is not aligned with SIP and is not acceptable when the system is running under maximum load and has no resources to spare for sending responses back to the client. The http-olp behavior during maximum overload therefore has to be aligned with the SIP behavior of not sending back any response and releasing the resources immediately, so as to contribute to reducing the load on the system. A mechanism that keeps threads from being released back to the pool immediately would also ensure that further requests coming into the system are throttled automatically. The action taken during the overloaded phase should help protect the system from a total failure.

Lower priority (nice to have) : The dependency described in (1) under Description has to be eliminated so that the http-olp can be enabled/configured independently of the clb in a system. This would support use cases where the olp has to be available in a pure backend system.

3.2. Justification:

4. Technical Description:

4.1. Details:

4.1.1. Overload Detection Algorithms

The overload detection algorithm should be changed to provide two different modes: CONSECUTIVE and MEDIAN. CONSECUTIVE is the same as the current option, with the addition that samples below the threshold are also counted before the alarm is ceased. Currently the alarm is ceased as soon as one sample is below the threshold, which makes it extremely sensitive.


Two different algorithms for detecting an overload situation (and the end of it) should be implemented:
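
The two modes are not specified beyond the description above. A minimal sketch, assuming that CONSECUTIVE changes state only when every sample in a window of number-of-samples readings lies on the same side of the threshold, and that MEDIAN compares the median of that window with the threshold. The class name, method names and the windowing are hypothetical and are not taken from the SGCS code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical detector for one resource (CPU or memory); not the SGCS code. */
class OverloadDetectorSketch {
    enum Algorithm { CONSECUTIVE, MEDIAN }

    private final int threshold;          // percent, 0-100
    private final int numberOfSamples;    // size of the sliding window
    private final Algorithm activation;   // used while not overloaded
    private final Algorithm deactivation; // used while overloaded
    private final Deque<Integer> window = new ArrayDeque<>();
    private boolean overloaded;

    OverloadDetectorSketch(int threshold, int numberOfSamples,
                           Algorithm activation, Algorithm deactivation) {
        this.threshold = threshold;
        this.numberOfSamples = numberOfSamples;
        this.activation = activation;
        this.deactivation = deactivation;
    }

    /** Feeds one sample and returns the overload state after it. */
    boolean addSample(int percent) {
        window.addLast(percent);
        if (window.size() > numberOfSamples) {
            window.removeFirst();
        }
        if (window.size() < numberOfSamples) {
            return overloaded; // too few samples to decide
        }
        if (!overloaded) {
            // Raise: all samples at or above the threshold, or median at or above it.
            overloaded = (activation == Algorithm.CONSECUTIVE)
                    ? window.stream().allMatch(s -> s >= threshold)
                    : median() >= threshold;
        } else {
            // Cease: all samples below the threshold, or median below it.
            boolean cease = (deactivation == Algorithm.CONSECUTIVE)
                    ? window.stream().allMatch(s -> s < threshold)
                    : median() < threshold;
            overloaded = !cease;
        }
        return overloaded;
    }

    /** Middle element of the sorted window (upper median for even sizes). */
    private int median() {
        int[] sorted = window.stream().mapToInt(Integer::intValue).sorted().toArray();
        return sorted[sorted.length / 2];
    }
}
```

With a threshold of 70 and five samples, the oscillating series 90, 20, 90, 20, 90 raises overload under MEDIAN (the median is 90) but not under CONSECUTIVE, and a CONSECUTIVE deactivation then needs five readings below 70 in a row before the alarm ceases.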


4.1.2. Enhanced reporting flexibility

To enhance the flexibility of the overload mechanism, it should be divided into three tasks: detection, reporting and listening.

The overload mechanism should be separated into a detection unit with a reporter, which notifies all listeners of an overload event when overload is raised or ceased. The event will include the type of algorithm causing the overload and the traffic type (SIP, HTTP, etc). The action taken is up to each listener implementation. The rejection listener will reject or drop traffic. The logging listener will log warning statements. Examples of other possible listeners: a JMX notification listener could send JMX notifications; an MMAS listener could report MMAS alarms, etc.
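
The separation into reporter and listeners could look roughly like the sketch below. OverloadEvent and OverloadListener are the interface names exported by this proposal, but their members are not defined in this document, so the fields, the method name onOverloadEvent and the OverloadReporter class are assumptions made for illustration only:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Logger;

/** Event contents are assumed from the description: state, algorithm, traffic type. */
final class OverloadEvent {
    enum State { RAISED, CEASED }

    final State state;
    final String algorithm;   // e.g. "MEDIAN" or "CONSECUTIVE"
    final String trafficType; // e.g. "SIP" or "HTTP"

    OverloadEvent(State state, String algorithm, String trafficType) {
        this.state = state;
        this.algorithm = algorithm;
        this.trafficType = trafficType;
    }
}

/** Hypothetical callback signature. */
interface OverloadListener {
    void onOverloadEvent(OverloadEvent event);
}

/** Hypothetical reporter: fans each event out to every registered listener. */
class OverloadReporter {
    private final List<OverloadListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(OverloadListener listener) {
        listeners.add(listener);
    }

    void report(OverloadEvent event) {
        for (OverloadListener listener : listeners) {
            listener.onOverloadEvent(event);
        }
    }
}

/** Example listener that only logs, like the logging listener described above. */
class LoggingOverloadListener implements OverloadListener {
    private static final Logger LOG = Logger.getLogger("olp");

    @Override
    public void onOverloadEvent(OverloadEvent event) {
        LOG.warning("Overload " + event.state + " for " + event.trafficType
                + " (algorithm " + event.algorithm + ")");
    }
}
```

A client such as the MMAS Alarm handler would register its own listener with the reporter and filter the events it receives before raising alarms.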



4.1.3 Dynamic configuration impact:

The http olp configuration options remain dynamically configurable to preserve backward compatibility; however, enabling/disabling the http olp requires a restart. Enabling the olp (both sip and http) is a two-step process:
Step 1: The olpInserted property (or attribute) has to be true; this inserts the olp layer in the chain of layers in the http and sip paths. This cannot be toggled dynamically, as altering the layer chain requires a restart.
Step 2: After the olp is added to the layer chain, the Cpu or memory olp flag should be set to true. This is what truly enables the OLP, because if neither flag is set the OLP just acts as a pass-through. These flags (as well as the threshold values) can be dynamically configured.

This is not a change in behavior; the existing behavior is simply captured here.
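
The two steps can be modelled as a chain that is fixed at startup plus flags that may change at runtime. This is only an illustrative sketch: the class, the layer names in the chain and the setter names are hypothetical and do not correspond to the actual layer-handling code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the two-step enabling described above. */
class OlpEnablingSketch {
    private final List<String> layerChain;     // step 1: fixed once at startup
    private volatile boolean cpuOlpEnabled;    // step 2: may change at runtime
    private volatile boolean memoryOlpEnabled; // step 2: may change at runtime

    OlpEnablingSketch(boolean olpInserted) {
        List<String> chain = new ArrayList<>();
        if (olpInserted) {
            chain.add("olp"); // the OLP sits before the Clb in the chain
        }
        chain.add("clb");
        chain.add("container");
        this.layerChain = Collections.unmodifiableList(chain);
    }

    void setCpuOlpEnabled(boolean enabled) {
        this.cpuOlpEnabled = enabled;
    }

    void setMemoryOlpEnabled(boolean enabled) {
        this.memoryOlpEnabled = enabled;
    }

    List<String> layerChain() {
        return layerChain;
    }

    /** False means the OLP is absent or acts as a pass-through. */
    boolean isOlpActive() {
        return layerChain.contains("olp") && (cpuOlpEnabled || memoryOlpEnabled);
    }
}
```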

4.1.4 Implementation options for Http OLP

The following implementation options are available:

1. During maximum overload situations, we can drop all further http requests and close the associated connections until the system returns to normal. However, clients can still open new connections to the server and send more requests. Since all requests are parsed before they reach the OLP, this still contributes to CPU utilization, which does not help an overloaded system. So this option is a simple but reactive approach that will not yield the expected results if the client continues to send more requests.

2. The main drawback of 1 is that the threads are released back to the pool immediately and are available to service new requests. The following can be done to improve on this:
    a. Reject new requests at maximum overload and close the connection. Introduce a wait time (configurable sleep) in the request processing thread, so that the thread is not released back to the pool immediately. This ensures that even if there are new requests, there are not enough threads to process them immediately. The sleep is removed once the overload alarm is cleared.
    b. As a result of (a), requests get queued at the connector (because there are insufficient threads). The number of requests that are allowed to be queued is configurable; in other words, the connector rejects a request and closes the connection if the request arrives when the queue is full. This is the most efficient way of dropping a request because it is done at the earliest stage, even before a thread is allocated to process the http request bytes. One drawback is that there is no counter for the requests rejected at this level (such a counter would be nice to have).
This is also straightforward to implement.

3. Have a mechanism to configure the thread pool so that it can be resized dynamically. During overload conditions, the size of the thread pool can be reduced so that there are fewer threads to process requests; this would cause requests to be queued, and 2b would ensure that requests beyond a certain number are rejected. The thread pool is brought back to its original size when the alarm is cleared. This would be difficult to implement because it requires modifications to the Grizzly Pipeline (thread pool) code.

4. Make OLP cluster aware : This is not a direct solution (just another possibility for improving the OLP, applicable to both Http and SIP), but it would help a clustered setup distribute the traffic when a backend is overloaded. It is more about improving QoS in a cluster setup than about system protection. Today, if one instance (which is both an FE and a BE) is overloaded, it rejects all the FE traffic it receives and also all the BE traffic that is directed to it from other instances. Though it is not possible to control the FE traffic to the system, the BE traffic can still be controlled if this instance is able to convey its state to the other FEs in the system, so that the Clb in those FEs can ensure that the overloaded instance is not selected as a BE. This is a proactive approach and would improve the chances of a request being served even if one (or more) instances in a system are overloaded.
This needs to be discussed further and would, at a minimum, require changes to the CLB.
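
The queueing effect described in 2(a) and 2(b) can be illustrated with a plain java.util.concurrent.ThreadPoolExecutor. This is a stand-in chosen for the sketch, not the Grizzly connector or its Pipeline; the class and method names are hypothetical, and the rejection counter exists only to make the effect visible:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrates held threads plus a bounded queue; not the Grizzly code. */
class BoundedQueueSketch {

    /** Submits requests that all hold their thread; returns how many were rejected. */
    static int submitAndCountRejected(int threads, int queueSize, int requests) {
        AtomicInteger rejected = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                // Called when all threads are busy and the queue is full.
                (task, executor) -> rejected.incrementAndGet());
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // models the thread being held, as in 2(a)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int result = rejected.get();
        release.countDown(); // models the overload alarm being cleared
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }
}
```

With 2 threads and a queue of 3, ten requests leave 2 running, 3 queued and 5 rejected without ever being given a thread.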

After discussions it was agreed to implement option 2 in the following way:

a. When the http olp receives a message under maximum overload conditions, it indicates (to the proxy code) that there should be no response for this request. This can be done by setting the response object to null.
b. Upon receiving null as the response, the http proxy holds the thread (sleep) for a configured time interval and then closes the connection and releases the thread.
c. The http olp can also optionally (through configuration) send a 503 under maximum overload conditions to preserve the older behavior. However, the default behavior is to drop the response.
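
Steps (a) to (c) can be sketched as follows. The Response and Connection types are simplified stand-ins invented for the sketch, not the real proxy classes. The wait is expressed in milliseconds here for convenience, since the unit of the configured wait time is not stated in this document:

```java
/** Hypothetical sketch of the agreed drop-and-hold behaviour. */
class HttpOlpDropSketch {

    /** Stand-in for the real response object; only the status is modelled. */
    static final class Response {
        final int status;

        Response(int status) {
            this.status = status;
        }
    }

    /** Stand-in for a client connection. */
    interface Connection {
        void send(Response response);

        void close();
    }

    private final boolean send503OnMaxOverload; // step (c), off by default
    private final long waitMillis;              // how long the thread is held

    HttpOlpDropSketch(boolean send503OnMaxOverload, long waitMillis) {
        this.send503OnMaxOverload = send503OnMaxOverload;
        this.waitMillis = waitMillis;
    }

    /** Step (a): a null return means "send no response". */
    Response onRequest(boolean maxOverload) {
        if (!maxOverload) {
            return new Response(200);
        }
        return send503OnMaxOverload ? new Response(503) : null;
    }

    /** Step (b): on null, hold the thread, then close without responding. */
    void proxy(Connection connection, boolean maxOverload) {
        Response response = onRequest(maxOverload);
        if (response != null) {
            connection.send(response);
            return;
        }
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        connection.close();
    }
}
```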

4.1.5 Configuration of olp in domain.xml :
Today most of the olp configuration is held as properties in domain.xml. It is desirable to move this to a separate xml fragment under a new element <overload-protection-service>. Since the olp configuration is common to both http and sip, it cannot reside under either of the services and has to be at the same level as the http and sip services. Also, all olp configuration applies to the entire configuration and not to a particular listener. This is common to http and sip.

<overload-protection-service enabled="true" cpu-overload-protection="true" memory-overload-protection="true" mm-threshold-http-wait-time="2" ir-threshold="35" ..... />

DTD changes

<!ELEMENT config
    (sip-service?, http-service, overload-protection-service, iiop-service, admin-service,                 
    connector-service?, web-container, ejb-container, mdb-container,          
    sip-container?, jms-service?, log-service, security-service,              
    transaction-service, monitoring-service, diagnostic-service?, java-config,
    availability-service?, thread-pools, alert-service?,                      
    group-management-service?, management-rules?, system-property*, property*)>

<!ELEMENT overload-protection-service>

<!ATTLIST overload-protection-service
    enabled %boolean; "false"
            Denotes whether the overload-protection manager layer is inserted in the layer chain. This value cannot be dynamically modified.
    cpu-overload-protection %boolean; "false"
            Denotes whether cpu overload protection is enabled. Can be dynamically configured.
    memory-overload-protection %boolean; "false"
            Denotes whether memory overload protection is enabled. Can be dynamically configured.
    cpu-ir-threshold CDATA "70"
            Sets the threshold of the cpu for initial requests (range 0-100%). A 503 response will be returned when this level is reached.
    mem-ir-threshold CDATA "85"
            Sets the threshold of the memory for initial requests (range 0-100%). A 503 response will be returned when this level is reached.
    cpu-sr-threshold CDATA "90"
            Sets the threshold of the cpu for subsequent requests (range 0-100%).
    mem-sr-threshold CDATA "85"
            Sets the threshold of the memory for subsequent requests (range 0-100%). A 503 response will be returned when this level is reached.
    cpu-http-threshold CDATA "70"
            Sets the threshold of the cpu for http requests (range 0-100%).
    mem-http-threshold CDATA "85"
            Sets the threshold of the memory for http requests (range 0-100%).
    cpu-mm-threshold CDATA "99"
            Sets the threshold of the cpu for the maximum load at which messages can be handled (range 0-100%), for both Http and Sip. The message (request/response) will be dropped in the case of SIP/Http.
    mem-mm-threshold CDATA "99"
            Sets the threshold of the memory for the maximum load at which messages can be handled (range 0-100%), for both Http and Sip. The message (request/response) will be dropped in the case of SIP/Http.
    sample-rate CDATA "2"
            Sets the sample rate for updating the overload protection levels. Must be a positive value.
    number-of-samples CDATA "5"
            Sets the number of consecutive samples needed before overload is raised. The sample rate can be set to a minimum of 2.
    retry-after-interval CDATA "10"
            Value set in the retry-after of the response message.
    cpu-overload-activation-algorithm CDATA "MEDIAN"
            Determines the algorithm used when deciding to activate CPU overload protection. Range {CONSECUTIVE, MEDIAN}
    cpu-overload-deactivation-algorithm CDATA "CONSECUTIVE"
            Determines the algorithm used when deciding to deactivate CPU overload protection. Range {CONSECUTIVE, MEDIAN}
    mem-overload-activation-algorithm CDATA "MEDIAN"
            Determines the algorithm used when deciding to activate memory overload protection. Range {CONSECUTIVE, MEDIAN}
    mem-overload-deactivation-algorithm CDATA "CONSECUTIVE"
            Determines the algorithm used when deciding to deactivate memory overload protection. Range {CONSECUTIVE, MEDIAN}
    mm-threshold-http-wait-time CDATA "2"
            Denotes the time interval for which the thread is held before being released back to the pool. Used in the context of maximum overload for http requests.
  >

Description

The 1.5 method of configuring OLP through properties will still be supported for the sake of backward compatibility.
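
How the CPU thresholds in the ATTLIST above relate to each other can be sketched as a simple decision function. This is a simplification with several assumptions: it uses the default values from the ATTLIST, it compares a single CPU reading instead of the state produced by the activation and deactivation algorithms, and it assumes that subsequent requests above cpu-sr-threshold are rejected with a 503, which the attribute description does not state. The class and enum names are hypothetical:

```java
/** Hypothetical mapping from request kind and CPU load to an OLP action. */
class ThresholdDecisionSketch {
    enum RequestKind { SIP_INITIAL, SIP_SUBSEQUENT, HTTP }

    enum Action { ACCEPT, REJECT_503, DROP }

    // Default values copied from the ATTLIST.
    static final int CPU_IR_THRESHOLD = 70;
    static final int CPU_SR_THRESHOLD = 90;
    static final int CPU_HTTP_THRESHOLD = 70;
    static final int CPU_MM_THRESHOLD = 99;

    static Action decide(RequestKind kind, int cpuPercent) {
        // Maximum overload: the message is dropped without a response.
        if (cpuPercent >= CPU_MM_THRESHOLD) {
            return Action.DROP;
        }
        int threshold;
        switch (kind) {
            case SIP_INITIAL:
                threshold = CPU_IR_THRESHOLD;
                break;
            case SIP_SUBSEQUENT:
                threshold = CPU_SR_THRESHOLD; // assumption: 503 above this level
                break;
            default:
                threshold = CPU_HTTP_THRESHOLD;
                break;
        }
        return cpuPercent >= threshold ? Action.REJECT_503 : Action.ACCEPT;
    }
}
```

With these defaults, a CPU reading of 80 rejects initial SIP requests and http requests while subsequent SIP requests are still accepted, so established sessions are shed last.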


4.2. Bug/RFE Number(s):

4.2.1 Bug/RFE Numbers from Issue Tracker

               

Issue 1581

4.2.2 Requirement Ids that are being addressed as a part of this proposal.

                105 65-0192/03709, 105 65-0192/03405, 105 65-0192/03152,
                105 65-0192/01601

4.3. In Scope:

4.4. Out of Scope:


4.5. Interfaces:


4.5.1 Exported Interfaces

OverloadEvent, OverloadListener

4.5.2 Imported interfaces


4.5.3 Other interfaces (Optional)


4.6. Doc Impact:

Update of existing Overload documentation

4.7. Admin/Config Impact:

Please see domain.xml changes

4.7.1 Configuration changes needed

New configuration for the modes: CONSECUTIVE or MEDIAN

4.7.2 CLI / GUI impact if any

4.8. HA Impact:


4.9. I18N/L10N Impact:


4.10. Packaging & Delivery:


4.10.1 Binaries in which the code is delivered

comms-appservr-rt

4.11. Security Impact:


4.12. Compatibility Impact

The OLP triggering logic is different in this release, so overload may be raised and ceased under different conditions than in earlier releases.


4.13. Dependencies:


4.13.1 Changes required in GlassFish


4.13.2 Third Party APIs


4.14 Miscellaneous

Will this component work with JDK 64-bit? Yes
Will this component require configuration using a sun-specific deployment descriptor? If yes, please specify below the configuration elements needed. No

5. Open Issues

Issue No  Description                                                                                                                         Comments  Resolution
1         How to configure the parameters of the SGCS Overload Protection layer and the MMAS Alarm life cycle module in a consistent manner?



6. Reference Documents:

                    http://weblogs.java.net/blog/rampsarathy/archive/2008/07/overload_protec_1.html

7. Schedule:

7.1. Projected Availability:

Sailfin 2.0