SpikeArrest policy

The Spike Arrest policy protects against traffic spikes with the <Rate> element. This element throttles the number of requests processed by an API proxy and sent to a backend, protecting against performance lags and downtime. See also "How spike arrest works", below.

Samples

Per second

<SpikeArrest name="SpikeArrest">
  <Rate>5ps</Rate>
</SpikeArrest>

5 requests per second. The policy smooths the rate to 1 request allowed every 200 milliseconds (1000 / 5).

Per minute

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
</SpikeArrest>

12 requests per minute. The policy smooths the rate to 1 request allowed every 5 seconds (60 / 12).
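The arithmetic behind the two samples above can be sketched in a few lines of Python. This is only an illustration of the division described in the text; the function name smoothing_interval_ms is hypothetical and is not part of Edge, and the sketch assumes a well-formed rate string.

```python
def smoothing_interval_ms(rate: str) -> float:
    """Return the smoothing interval, in milliseconds, for a rate such as '5ps' or '12pm'.

    Assumes the input is already well formed: an integer followed by 'ps' or 'pm'.
    """
    count, unit = int(rate[:-2]), rate[-2:]
    # Length of the rate window in milliseconds: one second or one minute.
    window_ms = {"ps": 1_000, "pm": 60_000}[unit]
    return window_ms / count


print(smoothing_interval_ms("5ps"))   # 200.0  (1 request every 200 milliseconds)
print(smoothing_interval_ms("12pm"))  # 5000.0 (1 request every 5 seconds)
```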

With message weight

<SpikeArrest name="SpikeArrest">
  <Rate>12pm</Rate>
  <Identifier ref="client_id" />
  <MessageWeight ref="request.header.weight" />
</SpikeArrest>

12 per minute (1 request allowed every 5 seconds, 60 / 12), with message weight that provides additional throttling on specific clients or apps (captured by the Identifier).

Rate from variable

<SpikeArrest name="SpikeArrest">
  <Rate ref="request.header.rate" />
</SpikeArrest>

Sets the rate from a variable in the request. The variable value must be in the form of {int}pm or {int}ps. For example:

curl http://myorg-myenv.apigee.net/price -H 'rate:30ps'
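Because the rate arrives from the caller in this sample, it can help to check the {int}ps / {int}pm form before relying on it. The following is a minimal sketch of such a check, assuming you validate the value yourself (for example in a test harness); parse_rate and RATE_PATTERN are hypothetical names and do not reflect how Edge parses the value internally.

```python
import re

# An integer followed by 'ps' (per second) or 'pm' (per minute), as described above.
RATE_PATTERN = re.compile(r"(\d+)(ps|pm)")


def parse_rate(value: str) -> tuple[int, str]:
    """Split a rate string such as '30ps' into (30, 'ps'); raise ValueError otherwise."""
    match = RATE_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"rate must look like {{int}}ps or {{int}}pm, got {value!r}")
    return int(match.group(1)), match.group(2)


print(parse_rate("30ps"))  # (30, 'ps')
```

A value such as "30" or "30 per second" raises ValueError in this sketch.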

Rate from product

Check out this Apigee Community post that explains how to set the spike arrest rate using custom variables set in an API product.


Element reference

Following are elements and attributes you can configure on this policy.

<SpikeArrest async="false" continueOnError="false" enabled="true" name="Spike-Arrest-1">
    <DisplayName>Custom label used in UI</DisplayName>
    <Rate>30ps</Rate>
    <Identifier ref="request.header.some-header-name"/>
    <MessageWeight ref="request.header.weight"/>
    <UseEffectiveCount>true</UseEffectiveCount>
</SpikeArrest>

<SpikeArrest> attributes

<SpikeArrest async="false" continueOnError="false" enabled="true" name="Spike-Arrest-1">

The following attributes are common to all policy parent elements:

name

The internal name of the policy. The value of the name attribute can contain letters, numbers, spaces, hyphens, underscores, and periods. This value cannot exceed 255 characters.

Optionally, use the <DisplayName> element to label the policy in the management UI proxy editor with a different, natural-language name.

Default: N/A
Presence: Required

continueOnError

Set to false to return an error when a policy fails. This is expected behavior for most policies.

Set to true to have flow execution continue even after a policy fails.

Default: false
Presence: Optional

enabled

Set to true to enforce the policy.

Set to false to "turn off" the policy. The policy will not be enforced even if it remains attached to a flow.

Default: true
Presence: Optional

async

This attribute is deprecated.

Default: false
Presence: Deprecated

<DisplayName> element

Use in addition to the name attribute to label the policy in the management UI proxy editor with a different, natural-language name.

<DisplayName>Policy Display Name</DisplayName>
Default: N/A. If you omit this element, the value of the policy's name attribute is used.
Presence: Optional
Type: String

<Rate> element

Specifies the rate at which to limit traffic spikes (or bursts). Specify the number of requests allowed per minute or per second. Note that the policy does not enforce this number literally at runtime; it smooths traffic into smaller intervals. See "How spike arrest works", below.

<Rate>10ps</Rate>
<Rate>30pm</Rate>
<Rate ref="request.header.rate" />
Default: N/A
Presence: Required
Type: Integer
Valid values:
  • {int}ps (number of requests per second, smoothed into intervals of milliseconds)
  • {int}pm (number of requests per minute, smoothed into intervals of seconds)

Attributes

ref

A reference to the variable containing the rate setting, in the form of {int}pm or {int}ps.

Default: N/A
Presence: Optional

<Identifier> element

Uniquely identifies and applies spike arrest against individual apps or developers. You can use a variety of variables to indicate a unique developer or app, whether you're using custom variables or predefined variables, such as those available with the Verify API Key policy. See also the Variables reference.

Use in conjunction with <MessageWeight> for more fine-grained control over request throttling.

If you don't use this element, all calls made to the API proxy are counted for spike arrest.

This element is also discussed in the following Apigee Community post: http://community.apigee.com/questions/2807/how-does-the-edge-quota-policy-work-when-no-identi.html.

<Identifier ref="client_id"/>
Default: N/A
Presence: Optional
Type: String

Attributes

ref

A reference to the variable containing the data that identifies the app or developer.

Default: N/A
Presence: Required

<MessageWeight> element

Use in conjunction with <Identifier> to further throttle requests by specific clients or apps.

Specifies the weighting defined for each message. Message weight is used to modify the impact of a single request on the calculation of the Spike Arrest limit. Message weight can be set by variables based on HTTP headers, query parameters, or message body content. For example, if the Spike Arrest Rate is 10pm, and an app submits requests with weight 2, then only 5 messages per minute are permitted from that app.

<MessageWeight ref="request.header.weight"/>
Default: N/A
Presence: Optional
Type: Integer

Attributes

ref

A reference to the variable containing the message weight for the specific app or client.

Default: N/A
Presence: Required
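The 10pm example in the <MessageWeight> description can be checked with a short calculation. This sketch assumes a simple model in which each message consumes its weight from the rate's budget; the function name weighted_capacity is hypothetical, and rounding down for weights that do not divide the rate evenly is an assumption, not documented Edge behavior.

```python
def weighted_capacity(rate_count: int, weight: int) -> int:
    """Messages permitted per rate window when every message carries `weight`.

    Simplified model: each message uses `weight` units of a budget of `rate_count`.
    """
    if weight < 1:
        raise ValueError("weight must be a positive integer")
    # Assumption: a message that does not fully fit in the remaining budget is not counted.
    return rate_count // weight


print(weighted_capacity(10, 2))  # 5 messages per minute at a rate of 10pm
```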

<UseEffectiveCount> element

Instructs Edge to automatically distribute your Spike Arrest counts across message processors (MPs) when using auto-scaling groups.

The following example sets <UseEffectiveCount> to true:

<SpikeArrest name='SA1'>
  <Rate>40ps</Rate>
  <UseEffectiveCount>true</UseEffectiveCount>
</SpikeArrest>

The <UseEffectiveCount> element is optional. The default value is false when the element is omitted from your Spike Arrest policy.

Default: false
Presence: Optional
Type: Boolean
Valid values:
  • false (the default)
  • true

Attributes

ref

A reference to the variable containing the value of <UseEffectiveCount>.

Default: N/A
Presence: Optional

When <UseEffectiveCount> is set to true, an MP's spike rate limit is the <Rate> divided by the current number of MPs. The aggregate limit is the value of <Rate>. When MPs are dynamically added (or removed), their individual spike rate limits will increase (or decrease), but the aggregate limit will stay the same.

When <UseEffectiveCount> is set to false (or omitted, as this is the default value), each MP's spike rate limit is simply the value of its <Rate>. The aggregate limit is the sum of the rates of all the MPs. When MPs are added (or removed), their individual spike rate limits will stay the same, but the aggregate limit will increase (or decrease).

The following table shows the effect of <UseEffectiveCount> on the effective rate limit of each MP:

In this example, when the number of MPs decreases from 4 to 2 and <UseEffectiveCount> is false, the effective rate per MP stays the same (at 10). When <UseEffectiveCount> is true, the same decrease raises the effective rate per MP from 10 to 20.
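The two rules described above can be written as a small calculation. This is a sketch of the stated arithmetic only, not Edge's implementation; spike_limits is a hypothetical name, and the <Rate> values of 10 (for false) and 40 (for true) are assumptions chosen so that the per-MP figures match the numbers quoted in the text.

```python
def spike_limits(rate: int, mp_count: int, use_effective_count: bool) -> tuple[float, float]:
    """Return (limit per MP, aggregate limit) for a given <Rate> and number of MPs."""
    if use_effective_count:
        # The configured rate is shared: each MP gets an equal slice.
        return rate / mp_count, float(rate)
    # Each MP enforces the full rate on its own, so the aggregate scales with MPs.
    return float(rate), float(rate * mp_count)


for mps in (4, 2):
    print(mps, spike_limits(10, mps, False), spike_limits(40, mps, True))
# 4 (10.0, 40.0) (10.0, 40.0)
# 2 (10.0, 20.0) (20.0, 40.0)
```

With false, the per-MP limit is constant and the aggregate drops from 40 to 20; with true, the aggregate stays at 40 and the per-MP limit rises from 10 to 20.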

How Spike Arrest works

Think of Spike Arrest as a way to generally protect against traffic spikes rather than as a way to limit traffic to a specific number of requests. Your APIs and backend can handle a certain amount of traffic, and the Spike Arrest policy helps you smooth traffic to the general amounts you want.

The runtime Spike Arrest behavior differs from what you might expect to see from the literal per-minute or per-second values you enter.

For example, say you enter a rate of 30pm (30 requests per minute). In testing, you might think you could send 30 requests in 1 second, as long as they came within a minute. But that's not how the policy enforces the setting. If you think about it, 30 requests inside a 1-second period could be considered a mini spike in some environments.

What actually happens, then? To prevent spike-like behavior, Spike Arrest smooths the number of full requests allowed by dividing your settings into smaller intervals:

  • Per-minute rates get smoothed into full requests allowed in intervals of seconds.
    For example, 30pm gets smoothed like this:
    60 seconds (1 minute) / 30pm = 2-second intervals, or 1 request allowed every 2 seconds. A second request inside of 2 seconds will fail. Also, a 31st request within a minute will fail.
  • Per-second rates get smoothed into full requests allowed in intervals of milliseconds.
    For example, 10ps gets smoothed like this:
    1000 milliseconds (1 second) / 10ps = 100-millisecond intervals, or 1 request allowed every 100 milliseconds. A second request inside of 100ms will fail. Also, an 11th request within a second will fail.
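The smoothing described in the list above can be illustrated with a toy model that allows one request per interval and rejects anything that arrives sooner. This is a deliberately simplified sketch on simulated timestamps, not the algorithm Edge actually runs; ToySpikeArrest and the arrival times are hypothetical.

```python
class ToySpikeArrest:
    """Allows at most one request per smoothing interval (simplified model)."""

    def __init__(self, interval_s: float) -> None:
        self.interval_s = interval_s
        self.last_allowed_s = None  # time of the most recent accepted request

    def allow(self, now_s: float) -> bool:
        """Accept the request if a full interval has passed since the last accepted one."""
        if self.last_allowed_s is None or now_s - self.last_allowed_s >= self.interval_s:
            self.last_allowed_s = now_s
            return True
        return False


limiter = ToySpikeArrest(interval_s=60 / 30)  # 30pm -> one request every 2 seconds
arrivals = [0.0, 0.5, 1.9, 2.0, 3.0, 4.1]     # simulated arrival times in seconds
decisions = [limiter.allow(t) for t in arrivals]
print(decisions)  # [True, False, False, True, False, True]
```

The requests at 0.5 and 1.9 seconds fail because they fall inside the 2-second interval opened by the first request, even though the per-minute total is far below 30.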

There's more: 1 request * number of message processors

By default, Spike Arrest is not distributed (unless you enable <UseEffectiveCount>): request counts are not synchronized across MPs. With more than one message processor, especially those with a round-robin configuration, each handles its own Spike Arrest throttling independently. With one message processor, a 30pm rate smooths traffic to 1 request every 2 seconds (60 / 30). With two message processors (the default for Edge cloud), that number doubles to 2 requests every 2 seconds. So multiply your calculated number of full requests per interval by the number of message processors to get your overall arrest rate.

What is the difference between spike arrest and quota

Quota policies configure the number of request messages that a client app is allowed to submit to an API over the course of an hour, day, week, or month. The quota policy enforces consumption limits on client apps by maintaining a distributed counter that tallies incoming requests.

Use a quota policy to enforce business contracts or SLAs with developers and partners, rather than for operational traffic management. Use spike arrest to protect against sudden spikes in API traffic. See also Comparing Quota, Spike Arrest, and Concurrent Rate Limit Policies.

Usage notes

  • In general, you should use Spike Arrest to set a limit that throttles traffic to what your backend services can handle.
  • See also "How spike arrest works".

Flow variables

When a Spike Arrest policy executes, the following Flow variable is populated.

For more information about Flow variables, see Flow variables reference.

Variable: ratelimit.{policy_name}.failed
Type: Boolean
Permission: Read-Only
Description: Indicates whether or not the policy failed (true or false).


Error reference

This section describes the fault codes and error messages that are returned and fault variables that are set by Edge when this policy triggers an error. This information is important to know if you are developing fault rules to handle faults. To learn more, see What you need to know about policy errors and Handling faults.

Runtime errors

These errors can occur when the policy executes.

policies.ratelimit.FailedToResolveSpikeArrestRate
HTTP status: 500
Cause: This error occurs if the reference to the variable containing the rate setting within the <Rate> element cannot be resolved to a value within the Spike Arrest policy. This element is mandatory and used to specify the spike arrest rate in the form of {int}pm or {int}ps.

policies.ratelimit.InvalidMessageWeight
HTTP status: 500
Cause: This error occurs if the value specified for the <MessageWeight> element through a flow variable is invalid (a non-integer value).

policies.ratelimit.SpikeArrestViolation
HTTP status: 500
Cause: The rate limit is exceeded.

Deployment errors

These errors can occur when you deploy a proxy containing this policy.

InvalidAllowedRate
Cause: If the spike arrest rate specified in the <Rate> element of the Spike Arrest policy is not an integer, or if the rate does not have ps or pm as a suffix, then the deployment of the API proxy fails.

Fault variables

These variables are set when a runtime error occurs. For more information, see What you need to know about policy errors.

fault.name="fault_name"
Where: fault_name is the name of the fault, as listed under Runtime errors above. The fault name is the last part of the fault code.
Example: fault.name Matches "SpikeArrestViolation"

ratelimit.policy_name.failed
Where: policy_name is the user-specified name of the policy that threw the fault.
Example: ratelimit.SA-SpikeArrestPolicy.failed = true

Example error response

{  
   "fault":{  
      "detail":{  
         "errorcode":"policies.ratelimit.SpikeArrestViolation"
      },
      "faultstring":"Spike arrest violation. Allowed rate : 10ps"
   }
}
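A client that receives this response can recognize a Spike Arrest rejection from the errorcode field rather than from the HTTP status alone. The following is a minimal sketch, assuming the response body has the shape shown above; is_spike_arrest_violation is a hypothetical helper name.

```python
import json

SPIKE_ARREST_ERRORCODE = "policies.ratelimit.SpikeArrestViolation"


def is_spike_arrest_violation(body: str) -> bool:
    """Return True if `body` is a fault response carrying the Spike Arrest error code."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Walk fault -> detail -> errorcode, tolerating missing or malformed levels.
    fault = payload.get("fault")
    detail = fault.get("detail") if isinstance(fault, dict) else None
    errorcode = detail.get("errorcode") if isinstance(detail, dict) else None
    return errorcode == SPIKE_ARREST_ERRORCODE


sample = '{"fault": {"detail": {"errorcode": "policies.ratelimit.SpikeArrestViolation"}, "faultstring": "Spike arrest violation. Allowed rate : 10ps"}}'
print(is_spike_arrest_violation(sample))  # True
```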

Example fault rule

<FaultRules>
    <FaultRule name="Spike Arrest Errors">
        <Step>
            <Name>JavaScript-1</Name>
            <Condition>(fault.name Matches "SpikeArrestViolation") </Condition>
        </Step>
        <Condition>ratelimit.Spike-Arrest-1.failed=true</Condition>
    </FaultRule>
</FaultRules>

The current HTTP status code for exceeding the rate limit is 500, but it will soon be changed to 429. Until the change occurs, if you want the status code to be 429, a property must be set on your organization (features.isHTTPStatusTooManyRequestEnabled). If you're a cloud customer, contact Apigee Support to have the property enabled. See this community article for guidance on the upcoming change.
