Handling faults

You're viewing Apigee Edge documentation.
Go to the Apigee X documentation.
info

Many error conditions can arise while API proxies are servicing requests from apps. For example, API proxies might encounter network issues when communicating with backend services, apps might present expired credentials, request messages might be incorrectly formatted, and so on.

When an error occurs after a client app calls an API proxy, an error message gets returned to the client. By default, the client receives an often cryptic error message with no details or guidance. But if you want to replace default error messages with more useful custom messages, and even enrich them with things like additional HTTP headers, you need to set up custom fault handling in Edge.

Custom fault handling also lets you add functionality such as message logging whenever an error occurs.

Before we talk about implementing custom error handling in your API proxies, it's helpful to understand how errors occur and how API proxies react to them.

Videos

Watch the following videos to learn more about fault handling.

Video Description
Introduction to fault handling and error flows Learn about fault handling and what happens when an error occurs in an API proxy.
Handle faults using fault rules Learn how to handle faults using fault rules.
Raise custom faults using the RaiseFault policy Raise custom faults during API runtime using RaiseFault policy.
Define fault rules in API proxy and target endpoints Define fault rules in the API proxy and target endpoints and understand the differences.
Understand execution order of fault rules Understand the execution order of fault rules in the API proxy and target endpoints.
Define default fault rule Define default fault rule to handle generic errors in your API.

How errors occur

First we'll simply cover how errors occur. Knowing how errors occur helps you plan for the different situations in which you want to implement custom error handling.

Automatic errors

An API proxy throws an error automatically in the following situations:

  • A policy throws an error. For example, if an API call sends an expired key, the VerifyAPIKey policy automatically throws an error; or if the number of API calls exceeds a certain limit, the Quota policy or SpikeArrest policy throws an error. (See the Policy error reference for the types of errors policies can throw).
  • There's an issue in the API proxy message flow, such as a routing error.
  • There's a backend failure, such as an HTTP error due to protocol level failures, TLS/SSL errors, or an unavailable target service.
  • There's a system-level failure, such as an out-of-memory exception.

For more information on these errors, see Fault taxonomy in this topic.

Custom errors

For situations where there's not an automatic error, you may want to throw a custom error; for example, if a response contains the word "unavailable", or if the HTTP status code is greater than 201. Do this by adding a RaiseFault policy to the appropriate place in an API proxy flow.

You can add a RaiseFault policy to an API proxy flow the same as you do any other policy. In the following proxy configuration example, the Raise-Fault-1 policy is attached to the TargetEndpoint response. If the word "unavailable" is present in the response from the target service, the RaiseFault policy gets executed and throws an error.

<TargetEndpoint name="default">
...
  <Response>
    <Step>
      <Name>Raise-Fault-1</Name>
      <Condition>(message.content Like "*unavailable*")</Condition>
    </Step>
  </Response>

This is just to show you that you can throw custom errors. We go into more detail about the RaiseFault policy in the FaultRules vs. the RaiseFault policy section.

For more examples, see these posts on the Apigee Community Forums:

What API proxies do when errors occur

Here's what happens when a proxy throws an error.

Exit the proxy pipeline

When an API proxy encounters an error, regardless of how it occurs, it exits the normal flow pipeline, enters an error state, and returns an error message to the client app. Once the API proxy enters the error state, it cannot return processing back to the normal flow pipeline.

For example, assume an API proxy has policies in the following order in the ProxyEndpoint request:

  1. Verify API Key
  2. Quota
  3. JSON to XML

If an error occurs during API key verification, the API proxy moves into an error state. The Quota and JSON to XML policies are not executed, the proxy doesn't proceed to the TargetEndpoint, and an error message is returned to the client app.

Check for FaultRules

In the error state, API proxies also check for the presence of the following (in order) in the API proxy configuration before returning a default error message to the client app:

  1. A <FaultRules> section, which contains the logic to trigger custom error messages (and other policies) based on specific conditions that you define.
  2. A <DefaultFaultRule> section, which triggers a default error message in the following situations:
    • No <FaultRules> are defined.
    • No existing <FaultRules> get executed.
    • The <AlwaysEnforce> element is set to true.

In essence, the API proxy is giving you the opportunity to return a custom error message and trigger other logic. If the proxy finds neither of those sections, or they exist but no custom fault was triggered, the proxy sends its own Edge-generated default message.

Simple fault handling example

Let's start with a simple example, where a call to an API proxy doesn't contain a required API key. By default, following is the response that gets returned to the client app:

HTTP/1.1 401 Unauthorized
Date: Wed, 20 Jul 2016 19:19:32 GMT
Content-Type: application/json
Content-Length: 150
Connection: keep-alive
Server: Apigee Router

* Connection #0 to host myorg-test.apigee.net left intact
{"fault":{"faultstring":"Failed to resolve API Key variable request.queryparam.apikey","detail":{"errorcode":"steps.oauth.v2.FailedToResolveAPIKey"}}}

Your API users may be able to figure out the error message, but they may not. And many default errors are more subtle and harder to decipher.

As an API developer, it's up to you to change this message to meet the needs of whoever will ultimately receive the error message, whether it's an iOS app developer or an internal testing group that has its own error message format requirements.

Here's a basic example of how you'd create a custom error message to handle this error. This requires 1) a policy that defines the custom message, and 2) a FaultRule that executes the policy when the proxy goes into an error state.

1. Create a policy that defines the custom message

First, create a policy that defines the custom error message. You can use any type of policy, such as an AssignMessage policy, that can set a payload and optional HTTP headers such as status code and reason phrase. Assign Message is ideal for this. It lets you control message payload, set a different HTTP status code, set a different HTTP reason phrase, and add HTTP headers.

Don't attach the policy to any flow in the API proxy. It's enough that it simply exists in the proxy bundle. To do this in the management UI proxy editor, go to the Develop tab, and in the Navigation pane and click the + icon on the Policies bar.

This lets you create a policy without attaching it to a flow in the API proxy. A policy that is not attached to any flow is flagged with the "detached" icon in the Policies list, as shown adjacent to the API key message policy shown in the previous figure.

Following is an example AssignMessage policy that:

  • Returns a JSON message.
  • Sets an HTTP status code (911, which is an obvious non-existent status code simply to illustrate the flexibility you have). The status code appears in the HTTP header.
  • Sets an HTTP reason phrase (to replace the default "Unauthorized" reason phrase for this missing API key error). The reason phrase appears next to the status code in the HTTP header.
  • Creates and populates a new HTTP header called invalidKey.
<AssignMessage async="false" continueOnError="false" enabled="true" name="invalid-key-message">
    <DisplayName>Invalid key message</DisplayName>
    <Set>
        <Payload contentType="application/json">{"Citizen":"Where's your API key? I don't see it as a query parameter"}</Payload>
        <StatusCode>911</StatusCode>
        <ReasonPhrase>Rejected by API Key Emergency Services</ReasonPhrase>
    </Set>
    <Add>
        <Headers>
            <Header name="invalidKey">Invalid API key! Call the cops!</Header>
        </Headers>
    </Add>
    <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
    <AssignTo createNew="false" transport="http" type="request"/>
</AssignMessage>

When this policy is executed, the response to the client app will look like the following. Compare it to the default response shown earlier.

HTTP/1.1 911 Rejected by API Key Emergency Services
Date: Wed, 20 Jul 2016 18:42:36 GMT
Content-Type: application/json
Content-Length: 35
Connection: keep-alive
invalidKey: Invalid API key! Call the cops!
Server: Apigee Router

* Connection #0 to host myorg-test.apigee.net left intact
{"Citizen":"Where's your API key? I don't see it as a query parameter."}

Yes, it's a little silly, but it shows you what's possible. At least now the developer receiving the message knows they forgot to include an API key as a query parameter.

But how does this policy get executed? The next section shows you.

2. Create the <FaultRule> that will trigger the policy

In the <ProxyEndpoint> or <TargetEndpoint> sections of the proxy configuration, you'll add a <FaultRules> XML block that contains one or more individual <FaultRule> sections. Each FaultRule represents a different error you want to handle. In this simple example, we'll use only one FaultRule to show you what it's composed of.

You should also add a <DefaultFaultRule> to provide a custom general error message if none of your FaultRules is executed.

Example

<ProxyEndpoint name="default">
...
    <FaultRules>
       <FaultRule name="invalid_key_rule">
            <Step>
                <Name>invalid-key-message</Name>
            </Step>
            <Condition>(fault.name = "FailedToResolveAPIKey")</Condition>
        </FaultRule>
    </FaultRules>
    <DefaultFaultRule name="default-fault">
        <Step>
            <Name>Default-message</Name>
        </Step>
    </DefaultFaultRule>

Key points:

  • The FaultRules are defined in the ProxyEndpoint. This is important. More on putting FaultRules in the ProxyEndpoint vs. TargetEndpoint later.
  • <Name> - The name of the policy to execute. The name comes from the policy's name attribute on the parent element, as shown in the policy example earlier.
  • <Condition> - Edge evaluates the condition and executes the policy only if the condition is true. If there are multiple FaultRules that evaluate to true, Edge executes the first one that is true. (Important: The order in which the FaultRules are evaluated, top to bottom or bottom to top, differs between the TargetEndpoint and ProxyEndpoint, as described in the Multiple FaultRules and execution logic section.) If you don't include a condition, the FaultRule is automatically true. But that's not a best practice. Each FaultRule should have its own condition.

  • <DefaultFaultRule> - If no custom FaultRule is executed, the <DefaultFaultRule> is executed, sending a more generic custom message instead of the cryptic default Edge-generated message. A <DefaultFaultRule> can also have a <Condition>, but in most cases you won't include one, because you want it to execute no matter what as a last resort.

    The DefaultFaultRule is typically used to return a generic error message for any unexpected error. An example would be a message that contains contact information for technical support. This default response serves the dual purpose of providing developer-friendly information while also obfuscating backend URLs or other information that might be used to compromise the system.

Multiple FaultRules and execution logic

In the Simple fault handling example section, we used a simple example of a single FaultRule and condition. In a real-world API project, with all the possible errors that can occur, you're likely to have multiple FaultRules and a DefaultFaultRule in both your <ProxyEndpoint> and <TargetEndpoint>. Ultimately, though, only one FaultRule is executed when an API proxy goes into an error state.

This section describes the logic Edge uses in handling FaultRules, from how it arrives at a single FaultRule to execute to how "inner" Step conditions are handled when their FaultRule is triggered. This section also provides guidance on when to define FaultRules in the <ProxyEndpoint> vs. the <TargetEndpoint>, and describes the relationship between FaultRules and the RaiseFault policy.

FaultRules execution

In brief, here's the logic Edge uses when an API proxy goes into an error state. Note that there is a slight difference between FaultRules evaluation in the ProxyEndpoint versus the TargetEndpoint.

  1. Edge evaluates the FaultRules in either the ProxyEndpoint or TargetEndpoint, depending on where the error occurred:
    • ProxyEndpoint - Edge starts with the bottom <FaultRule> in the configuration XML and works its way up, evaluating the <Condition> of each <FaultRule> (the "outer" condition, not the "inner" <Step> conditions).
    • TargetEndpoint - Edge starts with the top <FaultRule> in the configuration XML and works its way down, evaluating the <Condition> of each <FaultRule> (the "outer" condition, not the "inner" <Step> conditions).
  2. Executes the first FaultRule whose condition is true. If a FaultRule has no condition, it's true by default.
    • When a FaultRule is executed, all Steps inside the FaultRule are evaluated in order, top to bottom in the XML configuration. Steps without conditions are automatically executed (the policies are executed), and Steps that have a <Condition> that evaluates to "true" are executed (conditions that evaluate to "false" are not executed).
    • If a FaultRule is executed, but no Steps in the FaultRule are executed (because their conditions evaluate to "false"), the Edge-generated default error message is returned to the client app. The <DefaultFaultRule> is not executed, because Edge has already executed its one FaultRule.

  3. If no FaultRule is executed, Edge executes the <DefaultFaultRule>, if present.

Following are examples with inline comments.

ProxyEndpoint execution

Evaluation of ProxyEndpoint FaultRules is bottom to top, so start reading at the last FaultRule in the following sample and work your way up. Look at the DefaultFaultRule last.

<ProxyEndpoint name="default">
...
    <FaultRules>
<!-- 3. This FaultRule is automatically TRUE, because there's no "outer" 
     condition. But because the FaultRule just below this got
     executed (bottom-to-top evaluation in a ProxyEndpoint), Edge
     doesn't even evaluate this FaultRule.
     Note that it's not a best practice to have a FaultRule without 
     an outer condition, which automatically makes the FaultRule true. -->
        <FaultRule name="random-error-message">
            <Step>
                <Name>Random-fault</Name>
            </Step>
        </FaultRule>
<!-- 2. Let's say this fault is TRUE. The Quota policy threw a QuotaViolation
     error. This is the first FaultRule to be TRUE, so it's executed. 
     Now the Steps are evaluated, and for the ones whose conditions
     evaluate to TRUE, their policies are executed. Steps without
     conditions are automatically true. -->
<FaultRule name="over_quota">
            <Step>
                <Name>developer-over-quota-fault</Name>
                <Condition>(ratelimit.developer-quota-policy.exceed.count GreaterThan "0")</Condition>
            </Step>
            <Step>
                <Name>global-over-quota-fault</Name>
                <Condition>(ratelimit.global-quota-policy.exceed.count GreaterThan "0")</Condition>
            </Step>
            <Step>
                <Name>log-error-message</Name>
            </Step>
            <Condition>(fault.name = "QuotaViolation")</Condition>
        </FaultRule>
<!-- 1. Because this is the ProxyEndpoint, Edge looks at this FaultRule
     first. But let's say this FaultRule is FALSE. A policy did not 
     throw a FailedToResolveAPIKey error. Edge moves UP to check
     the next FaultRule. -->
        <FaultRule name="invalid_key_rule">
            <Step>
                <Name>invalid-key-message</Name>
            </Step>
            <Condition>(fault.name = "FailedToResolveAPIKey")</Condition>
        </FaultRule>
    </FaultRules>

<!-- If no <FaultRule> is executed, the <DefaultFaultRule> is executed. 
     If a FaultRule is executed, but none of its Steps are executed,
     The DefaultFaultRule is not executed (because Edge has already
     executed its one FaultRule). -->
    <DefaultFaultRule name="default-fault">
        <Step>
            <Name>Default-message</Name>
        </Step>
    </DefaultFaultRule>

TargetEndpoint execution

Evaluation of TargetEndpoint FaultRules is top to bottom, so start reading at the first FaultRule in the following sample and work your way down. Look at the DefaultFaultRule last.

<TargetEndpoint name="default">
...
    <FaultRules>
<!-- 1. Because this is the TargetEndpoint, Edge looks at this FaultRule
     first. Let's say this FaultRule is FALSE. 
     A policy did not throw a FailedToResolveAPIKey error. 
     Edge moves down to the next FaultRule. -->
        <FaultRule name="invalid_key_rule">
            <Step>
                <Name>invalid-key-message</Name>
            </Step>
            <Condition>(fault.name = "FailedToResolveAPIKey")</Condition>
        </FaultRule>
<!-- 2. Let's say this fault is TRUE. The Quota policy threw a QuotaViolation
     error. This is the first FaultRule to be TRUE, so it's executed. 
     Now the Steps are evaluated, and for the ones whose conditions
     evaluate to TRUE, their policies are executed. Steps without
     conditions are automatically true. -->
        <FaultRule name="over_quota">
            <Step>
                <Name>developer-over-quota-fault</Name>
                <Condition>(ratelimit.developer-quota-policy.exceed.count GreaterThan "0")</Condition>
            </Step>
            <Step>
                <Name>global-over-quota-fault</Name>
                <Condition>(ratelimit.global-quota-policy.exceed.count GreaterThan "0")</Condition>
            </Step>
            <Step>
                <Name>log-error-message</Name>
            </Step>
            <Condition>(fault.name = "QuotaViolation")</Condition>
        </FaultRule>
<!-- 3. This FaultRule is automatically TRUE, because there's no "outer" 
     condition. But because the FaultRule just above this got
     executed (top-to-bottom evaluation in a TargetEndpoint), Edge
     doesn't even evaluate this FaultRule.
     Note that it's not a best practice to have a FaultRule without 
     an outer condition, which automatically makes the FaultRule true. -->
        <FaultRule name="random-error-message">
            <Step>
                <Name>Random-fault</Name>
            </Step>
        </FaultRule>
    </FaultRules>

<!-- If no <FaultRule> is executed, the <DefaultFaultRule> is executed. 
     If a FaultRule is executed, but none of its Steps are executed,
     The DefaultFaultRule is not executed (because Edge has already
     executed its one FaultRule). -->
    <DefaultFaultRule name="default-fault">
        <Step>
            <Name>Default-message</Name>
        </Step>
    </DefaultFaultRule>

Fault rule order

As you can see in the previous example, the order in which you put your FaultRules is important depending on whether the error occurs in the ProxyEndpoint versus the TargetEndpoint.

For example:

ProxyEndpoint order TargetEndpoint order

In the following example, since evaluation is bottom to top, FaultRule 3 is executed, which means FaultRules 2 and 1 aren't evaluated.

5. FaultRule 1: FALSE

4. FaultRule 2: TRUE

3. FaultRule 3: TRUE

2. FaultRule 4: FALSE

1. FaultRule: 5 FALSE

In the following example, since evaluation is top to bottom, FaultRule 2 is executed, which means FaultRules 3, 4, and 5 aren't evaluated.

1. FaultRule 1: FALSE

2. FaultRule 2: TRUE

3. FaultRule 3: TRUE

4. FaultRule 4: FALSE

5. FaultRule: 5 FALSE

Policies to include

You can execute any policies from a FaultRule by putting them in Steps. For example, you can execute an AssignMessage policy to format a response to the client app, then log a message with the MessageLogging policy. Policies are executed in the order you put them (top to bottom in the XML).

Fault rules are triggered ONLY in an error state (about continueOnError)

The heading may seem like we're repeating ourselves, but there's one particular nuance to be aware of with regard to a proxy error causing an API proxy to enter an error state—or rather, not entering an error state: the continueOnError attribute on a policy.

To recap: An API proxy evaluates <FaultRules> and <DefaultFaultRule> only if the proxy has entered an error state. That means that even if a FaultRule condition evaluates to true, it won't get triggered if the proxy isn't in an error state.

However, here's an example of an error occurring and the proxy not entering an error state. On any policy, you can set an attribute on the parent element called continueOnError. That attribute is very important with regard to fault handling, because it determines whether or not the proxy enters an error state if the policy fails. In most cases, you'll want to keep the default continueOnError="false", which puts the proxy in an error state if the policy fails, and your custom error handling will get triggered. However, if continueOnError="true" (for example, if you don't want the failure of a Service Callout to stop the proxy execution), the proxy will not go into an error state if that policy fails, and the proxy won't look at your FaultRules.

For information on logging errors when continueOnError="true", see Handling policy faults within the current flow.

Where to define FaultRules: ProxyEndpoint or TargetEndpoint

When an API proxy experiences an error, the error occurs either in the <ProxyEndpoint> (request from or response to client app) or in the <TargetEndpoint> (request to or response from target service). Wherever that error occurs is where Edge looks for FaultRules.

For example, if a target server isn't available (HTTP status code 503), the API proxy would go into an error state in the <TargetEndpoint> response, and the normal API proxy flow wouldn't continue to the <ProxyEndpoint>. If you have FaultRules defined only in the <ProxyEndpoint>, they won't handle that error.

Here's another example. If a RaiseFault policy on the <ProxyEndpoint> response triggers an error, a FaultRule in the <TargetEndpoint> won't get executed.

FaultRules vs. the RaiseFault policy

Fault rules and the RaiseFault policy may on the surface sound like alternative ways to accomplish fault handling; and in some ways that's true. But they also work together. This section explains the relationship between the two. Understanding this relationship should help you design your fault handling, especially if you want to use both.

In brief:

  • Fault rules are always evaluated when an API proxy enters an error state.
  • The RaiseFault policy is a way of putting an API proxy in an error state when an error wouldn't have otherwise occurred.

    For example, if you want to throw an error if the HTTP status code in the response from the target service is greater than 200, you add a RaiseFault policy in your response flow. It would look something like this:

    <TargetEndpoint name="default">
        <PreFlow name="PreFlow">
    ...
            <Response>
                <Step>
                    <Name>Raise-Fault-1</Name>
    <!-- If the condition is true, the Raise-Fault-1 policy gets executed -->
                    <Condition>(response.status.code GreaterThan "200")</Condition>
                </Step>
            </Response> 

    The RaiseFault policy also sends an error message to the client app.

What happens when a RaiseFault policy triggers an error, which puts the proxy in an error state, which potentially executes a FaultRule? Here's where things can get a little tricky. If the RaiseFault policy returns an error message and a FaultRule gets triggered and returns an error message, what gets returned to the client app?

  • Since the FaultRule or DefaultFaultRule is executed after the RaiseFault policy, the FaultRule response data wins.
  • The RaiseFault policy response data (status code, reason phrase, or message payload) is used if that data is not set by the FaultRule or DefaultFaultRule.
  • If both the RaiseFault policy and FaultRule add custom HTTP headers, both are included in the response. Duplicate header names create a header with multiple values.

Here's an example of what's set by a RaiseFault policy and a FaultRule, and what gets returned to the client app. The samples are designed for brevity, not for best practices.

Client app receives:

Status Code: 468
Reason Phrase: Something happened
Payload: {"Whoa":"Sorry."}
Header: 
  errorNote: woops,gremlins

<- Fault rules policy sets this:

Status Code: [none] 
Reason Phrase: Something happened
Payload: {"Whoa":"Sorry."}
Header: 
  errorNote: gremlins

<- RaiseFault policy sets this:

Status Code: 468
Reason Phrase: Can't do that
Payload: {"DOH!":"Try again."}
Header: 
  errorNote: woops

Building conditions

Conditions are the key to FaultRule execution. You create FaultRule conditions the same way you do for other conditions in Edge, such as for conditional flows or RaiseFault conditions.

To put the rest of this section in context, here's a sample fault rule that has an outer FaultRule condition and an inner Step condition.

<FaultRule name="invalid_key_rule">
    <Step>
        <Name>invalid-key-message</Name>
        <Condition>(oauthV2.Verify-API-Key-1.failed = true)</Condition>
    </Step>
    <Condition>(fault.name = "FailedToResolveAPIKey")</Condition>
</FaultRule>

Variables specific to policy errors

The fault.name and {policy_namespace}.{policy_name}.failed variables are available when a policy throws an error.

fault.name

When a policy fails, catch the error in a condition using the fault.name variable. For example:

<Condition>(fault.name = "policy_error_name")</Condition>

The error name appears in the default error message. For example, in the following, the fault name is FailedToResolveAPIKey. In this case, a flow variable called fault.name is set to the value FailedToResolveAPIKey.

{"fault":{"faultstring":"Failed to resolve API Key variable request.queryparam.apikey","detail":{"errorcode":"steps.oauth.v2.FailedToResolveAPIKey"}}}

So the condition would look like this:

<Condition>(fault.name = "FailedToResolveAPIKey")</Condition>

See the Policy error reference for a list of policy errors.

{policy_namespace}.{policy_name}.failed

The *.failed variable is available when a policy fails. Following are examples of *.failed variables for different policies. For policy namespaces, see the flow variables in each policy reference topic.

Other available variables

When an API proxy goes into an error state, the only available variables for use in conditions are:

  • The variables of the policy that failed.
  • The HTTP message variables that exist at the point of failure. For example, if an error is thrown in the response, a FaultRule in the <TargetEndpoint> could use HTTP data response.status.code, message.content, error.content, and so on. Or if a Quota policy failed, you could use the variable ratelimit.{quota_policy_name}.exceed.count. Use the Trace tool and the policy reference topics to help you figure out which variables and HTTP data are available.

More information

Best practices for fault handling

Fault handling is a major architectural design task for API proxy development. It's important to take the time to figure out how and when you're going to handle errors, determine what error messages will say, and design error message formats. After (or as) you figure those things out, then use these best practices to help you with your fault handling implementation.

Following are some best practices in designing and building fault handling:

  • For each FaultRule, provide an "outer" <Condition> (sibling to the <Step> element). Fault rules with no outer condition automatically evaluate to true. "Inner" Step conditions are not used to determine whether a FaultRule is true or false. Step conditions are evaluated only after Edge executes the FaultRule containing them. In a FaultRule, it's typical to have multiple Steps with Assign Message (or other) policies, each with a Step condition.
  • To handle errors in multiple policies of the same type (for example, multiple Quota policies), create one FaultRule per policy error you're likely to receive. For example, create a FaultRule for each likely error in Quota policies, such as QuotaViolation, InvalidMessageWeight, StartTimeNotSupported. (See the Policy error reference for policy errors. As you discover additional errors that need to be handled, you can go back later and add them to your FaultRules. It's okay to be iterative, though it does require proxy redeployment.) This approach allows you to catch the same type of error no matter which policy throws it, which makes your FaultRules XML efficient.

    Then use inner Step conditions if you need more fine-grained error control. For example, if you're enforcing both individual developer quota and global quota with two policies in your request flow, set your "outer" FaultRule condition to get triggered on the QuotaViolation error (which is thrown when quota goes over in either case). Then set Step conditions to evaluate the exceed.count variables in both of your quota policies. Only the relevant error is sent to the client (developer quota overage or global quota overage). Here's an example of this configuration:

    <FaultRule name="over_quota">
    <!-- This condition catches a QuotaViolation in *any* Quota policy -->
      <Condition>(fault.name = "QuotaViolation")</Condition>
      <Step>
        <Name>developer-over-quota-fault</Name>
        <Condition>(ratelimit.developer-quota-policy.exceed.count GreaterThan "0")</Condition>
      </Step>
      <Step>
        <Name>global-over-quota-fault</Name>
        <Condition>(ratelimit.global-quota-policy.exceed.count GreaterThan "0")</Condition>
      </Step>
    </FaultRule>

    For another example, see this Apigee Community thread.

  • To handle errors when you're using a single policy of one type, consider a single fault rule that gets executed when that one policy fails, and include multiple steps that map to each possible error. This keeps your XML efficient by using a single FaultRule rather than multiple FaultRules (one for each error type). For example:

    <FaultRule name="raise-fault-3">
    <!-- This condition catches *any* error in the Verify-API-Key-1 policy. -->
      <Condition>(oauthV2.Verify-API-Key-1.failed = "true")</Condition>
      <!-- This first step always executes, which handles errors you haven't mapped with inner conditions. -->
      <Step>
        <Name>Generic-Key-Fault</Name>
      </Step>
      <Step>
        <Name>Assign-Message-Raise-Fault-1</Name>
        <Condition>(fault.name = "FailedToResolveAPIKey")</Condition>
      </Step>
      <Step>
        <Name>Assign-Message-Raise-Fault-2</Name>
        <Condition>(fault.name = "InvalidApiKey")</Condition>
      </Step>
    </FaultRule>
  • Add FaultRules where the errors will occur (client side <ProxyEndpoint> or target side <TargetEndpoint>). Include FaultRules for each policy that appears in each location.
  • In FaultRules, you can execute any type of policy that can return a message to the client app. The AssignMessage policy is ideal for this. Also consider logging a message with the MessageLogging policy if you want to keep track of errors.
  • When using RaiseFault policies in conjunction with FaultRules, coordinate the response data that's sent back when both the RaiseFault policy and a FaultRule return data. For example, if your RaiseFault policy resets the HTTP status code, don't have a FaultRule reset the status code. The worst that can happen is that the default status code gets returned to the client app.
  • <DefaultFaultRule> execution:
    • If you want a <DefaultFaultRule> to always execute when no other FaultRule executes, don't include a <Condition> on it.
    • If you want a <DefaultFaultRule> to always execute even when another FaultRule has executed, add the <AlwaysEnforce>true</AlwaysEnforce> child element.

Pattern for centralized, reusable fault handling

The following Apigee Community post describes a pattern for centralized fault handling without code duplication:

https://community.apigee.com/articles/23724/an-error-handling-pattern-for-apigee-proxies.html

Creating FaultRules

To add a FaultRule you need to edit the XML configuration of the ProxyEndpoint or TargetEndpoint. You can use the Edge UI to make this edit in the Code pane of the Develop view for an API proxy, or edit the XML file that defines the ProxyEndpoint or TargetEndpoint.

If you create FaultRules in the management UI, first create the policies you want to execute, then add them to the FaultRule configuration. (You'll get an error in the UI if you try to save a FaultRule that references a policy that hasn't been created yet.)

Adding policies to a FaultRule

While you can put any policy in the FaultRule, you commonly use the AssignMessage policy to generates a custom response message for an error condition. AssignMessage enables you to configure an HTTP response with payload, HTTP status code, headers, and reason phrase elements.

The example below shows a typical AssignMessage policy configuration:

<AssignMessage name="fault_invalidkey">
  <Set>
      <Payload contentType="text/plain">Contact support at support@mycompany.com.</Payload>
      <StatusCode>401</StatusCode>
      <ReasonPhrase>Unauthorized</ReasonPhrase>
  </Set>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>

You can now use this policy in your FaultRule. Notice how you reference the AssignMessage policy by name in the FaultRule:

<ProxyEndpoint name="default">
  ...
  <FaultRules>
    <FaultRule name="invalid_key_rule">
      <Step>
        <Name>fault_invalidkey</Name>
      </Step>
      <Condition>(fault.name = "InvalidApiKey")</Condition>
    </FaultRule>
  </FaultRules>
</ProxyEndpoint>

When you deploy the configuration above, the API proxy will execute the AssignMessage policy called fault_invalidkey whenever an app presents an invalid API key.

You can execute multiple policies in a FaultRule, as the following example shows:

<ProxyEndpoint name="default">
  ...
  <FaultRules>
    <FaultRule name="invalid_key_rule">
      <Step>
        <Name>policy1</Name>
      </Step>
      <Step>
        <Name>policy2</Name>
      </Step>
      <Step>
        <Name>policy3</Name>
      </Step>
      <Condition>(fault.name = "InvalidApiKey")</Condition>
    </FaultRule>
  </FaultRules>
</ProxyEndpoint>

The policies execute in the order defined. For example, you can use the MessageLogging policy, the ExtractVariables policy, the AssignMessage policy, or any other policy in the FaultRule. Note that processing of the FaultRule stops immediately if either of these situations occur:

  • Any policy in the FaultRule causes an error
  • Any of the policies in the FaultRule is of type RaiseFault

Defining the custom error message returned from a FaultRule

As a best practice, you should define clear error responses from your APIs. In that way, you deliver consistent and helpful information to your clients.

The following AssignMessage policy example uses the <Payload>, <StatusCode>, and <ReasonPhase> tags to define the custom error response sent back to the client on an InvalidApiKey error (see previous FaultRules example).

<AssignMessage name="fault_invalidkey">
  <Set>
    <Payload contentType="text/plain">You have attempted to access a resource without the correct authorization. 
       Contact support at support@mycompany.com.</Payload>
    <StatusCode>401</StatusCode>
    <ReasonPhrase>Unauthorized</ReasonPhrase>
  </Set>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>

This response includes:

  • The payload containing the error message and an email address for contacting support.
  • The HTTP status code returned in the response.
  • The reason phrase, which is a short description of the error.

Creating a DefaultFaultRule

A DefaultFaultRule acts an exception handler for any error that is not explicitly handled by another FaultRule. If the conditions for all FaultRules do not match the error, then the DefaultFaultRule handles the error. Enable default fault handling by adding the <DefaultFaultRule> tag as a child element of a ProxyEndpoint or a TargetEndpoint.

For example, the TargetEndpoint configuration below defines a DefaultFaultRule that invokes a policy named ReturnGenericError:

<TargetEndpoint name="default">
  ...
  <FaultRules>
    ...
  </FaultRules>

  <DefaultFaultRule name="fault-rule">
    <Step>
      <Name>ReturnGenericError</Name>
    </Step>
  </DefaultFaultRule>

  <HTTPTargetConnection>
    <URL>http://mocktarget.apigee.net</URL>
  </HTTPTargetConnection>
</TargetEndpoint>

The DefaultFaultRule is typically used to return a generic error message for any unexpected error, such as a message that contains contact information for technical support. This default response serves the dual purpose of providing developer-friendly information while also obfuscating backend URLs or other information that might be used to compromise the system.

For example, you define the following AssignMessage policy to return a generic error:

<AssignMessage name="ReturnGenericError">
  <Set>
    <Payload type="text/plain">SERVICE UNAVAILABLE. PLEASE CONTACT SUPPORT: support@company.com.</Payload>
  </Set>
</AssignMessage>

Include the <AlwaysEnforce> element in the <DefaultFaultRule> tag to execute the DefaultFaultRule for every error, even if another FaultRule has already been executed. The DefaultFaultRule is always the last FaultRule to execute:

  <DefaultFaultRule name="fault-rule">
    <Step>
      <Name>ReturnGenericError</Name>
    </Step>
    <AlwaysEnforce>true</AlwaysEnforce>
  </DefaultFaultRule>

One use of the DefaultFaultRule is to determine the type of error that occurs when you otherwise cannot determine it. For example, your API proxy is failing for an error that you cannot determine. Use the DefaultFaultRule to invoke the following AssignMessage policy. This policy writes the fault.name value to a header named DefaultFaultHeader in the response:

<AssignMessage async="false" continueOnError="false" enabled="true" name="DefaultFaultRule">
  <DisplayName>DefaultFaultRule</DisplayName>
  <Set>
    <Headers>
      <Header name="DefaultFaultHeader">{fault.name}</Header>
    </Headers>
  </Set>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <AssignTo createNew="false" transport="http" type="response"/>
</AssignMessage>

You can then view the header in the Edge trace tool or on the response to see what caused the error.

Adding message logging to the PostClientFlow

The PostClientFlow is the only flow that executes after the proxy enters the error state. Only the MessageLogging policy can be attached to this flow, which is executed after the response is sent back to the client. Although attaching the MessageLogging policy to this flow is technically not error handling, you can use it to log information in the event of an error. Because it is executed regardless of whether the proxy succeeded or failed, you can put Message Logging policies in the PostClientFlow and be guaranteed that they always execute.

Handling policy faults within the current flow

The examples shown so far all use a FaultRule on the ProxyEndpoint or TargetEndpoint to handle any policy errors as part of the error state. That is because the default value of the continueOnError element of a policy is false, meaning that when an error occurs in a policy, control is directed to the error state. Once in the error state, you cannot return control back to the normal pipeline and you typically return some form of error message to the calling app.

However, if you set the continueOnError element to true for a policy, control stays in the current flow and the next policy in the pipeline executes after the policy that caused the error. The advantage to handling the error in the current flow is that you might have a way to recover from the error to complete processing of the request.

Shown below is a VerifyAPIKey policy named verify-api-key with the continueOnError element set to true:

<VerifyAPIKey async="false" continueOnError="true" enabled="true" name="verify-api-key">
  <DisplayName>Verify API Key</DisplayName>
  <APIKey ref="request.queryparam.apikey"/>
</VerifyAPIKey>

If the API key is missing or invalid, then the VerifyAPIKey policy sets the oauthV2.verify-api-key.failed variable to true, but processing continues in the current flow.

You then add VerifyAPIKey policy as a step in the PreFlow of the ProxyEndpoint:

<ProxyEndpoint name="default">
  ...
  <PreFlow name="PreFlow">
    <Request>
      <Step>
        <Name>verify-api-key</Name>
      </Step>
      <Step>
        <Name>FaultInFlow</Name>
        <Condition>(oauthV2.verify-api-key.failed = "true")</Condition>
      </Step>
    </Request>
    <Response/>
  </PreFlow>      
</ProxyEndpoint>  

Notice how the next step in the PreFlow uses a condition to test for the existence of an error. If an error occurred in the VerifAPIKey policy, then the policy named FaultInFlow policy executes. Otherwise, the FaultInFlow policy is skipped. The FaultInFlow policy can do many things, such as logging the error, attempting to fix the error, or performing some other action.

Triggering an error by using the RaiseFault policy

You can use the RaiseFault policy at any time in a flow to trigger an error. When a RaiseFault policy executes, it terminates the current flow and transfers control to the error state.

One use of the RaiseFault policy is to test for a specific condition that another policy might not detect. In the example above, you added a <Condition> tag to a PreFlow <Step> tag that caused the policy FaultInFlow to execute if the condition is met. If FaultInFlow is a RaiseFault policy, then control transfers to the error state. Or, you might insert a RaiseFault policy in a flow to debug and test your FaultRules.

When a RaiseFault policy triggers an error, you can use the following FaultRule and condition to process it:

<FaultRule name="raisefault_rule">
  <Step>
    <Name>{policy_name}</Name>
  </Step>
  <Condition>(fault.name = "RaiseFault")</Condition>
</FaultRule>

Note that the condition tests for a fault named RaiseFault. The RaiseFault policy always sets the value of fault.name to RaiseFault.

Custom handling of HTTP error codes from the target server

The examples shown in the previous sections apply to errors created by policies. However you can also create a custom response for transport-level errors, meaning HTTP errors returned from the target server. To control the response from an HTTP error, configure a TargetEndpoint to process HTTP response codes.

By default, Edge treats HTTP response codes in the 1xx-3xx range as 'success', and HTTP response codes in the range 4xx-5xx as 'failure'. That means any response from the backend service with an HTTP response code 4xx-5xx automatically invokes the error state, which then returns an error message directly to the requesting client.

You can create custom handlers for any HTTP response codes. For example, you might not want to treat all HTTP response codes in the range 4xx-5xx as 'failure' but only 5xx, or you might want to return custom error messages for HTTP response codes 400 and 500.

In the next example, you use the success.codes property to configure the TargetEndpoint to treat HTTP response codes 400 and 500 as a success, along with the default HTTP codes. By treating those codes as a success, the TargetEndpoint takes over the processing of the response message, instead of invoking the error state:

<TargetEndpoint name="default">
  ...
  <HTTPTargetConnection>
    <Properties>
          <Property name="success.codes">1xx,2xx,3xx,400,500</Property>
    </Properties>
    <URL>http://weather.yahooapis.com</URL>
  </HTTPTargetConnection>
</TargetEndpoint>

As you can see in this example, you can use wildcards to set the success.codes property to a range of values..

Setting the success.codes property overwrites the default values. Therefore, if you want to add HTTP code 400 to the list of default success codes, set this property as:

<Property name="success.codes">1xx,2xx,3xx,400</Property>

But, if you only want HTTP code 400 to be treated as a success code, set the property as:

<Property name="success.codes">400</Property>

You can now define custom handlers for HTTP response codes 400 and 500 to return a customized response message to the requesting app. The following TargetEndpoint uses the policy named ReturnError to handle HTTP 400 and 500 response codes:

<TargetEndpoint name="default">
  <PreFlow name="PreFlow">
    <Request/>
    <Response>
      <Step>
        <Name>ReturnError</Name>
        <Condition>(response.status.code = 400) or (response.status.code = 500)</Condition>
      </Step>
    </Response>
  </PreFlow>

  <HTTPTargetConnection>
    <Properties>
      <Property name="success.codes">1xx,2xx,3xx,400,500</Property>
    </Properties>
    <URL>http://weather.yahooapis.com</URL>
  </HTTPTargetConnection>
</TargetEndpoint>

This TargetEndpoint configuration causes the policy called ReturnError to handle the response whenever the TargetEndpoint encounters an HTTP response code of 400 or 500.

Fault taxonomy

API Services organizes faults into the following categories and subcategories.

Category Subcategory Fault Name Description
Messaging Failures that occur during the message flow (not including policy failures)
Custom faults {fault_name} Any faults explicitly handled by the API proxy using the RaiseFault policy
Response codes InternalServerError, NotFound HTTP error codes 5xx, 4xx
Routing failures NoRoutesMatched Failure in selecting a named TargetEndpoint for a request
Classification failures NotFound Failures caused by a request URI that does not match any BasePath for any ProxyEndpoint configurations (that is, no API proxies match the URL in the client app's request)
Transport HTTP transport-level errors
Connectivity ConnectionRefused, ConnectionReset, ConnectionTimeout Failures occur while establishing network or transport-level connections
Request validations ContentLengthMissing, HostHeaderMissing Faults occur during semantics checks on every request
Response validations Faults occur during semantics checks on every response
IO errors SSLHandshakeError, ReadTimeout, ReadError, WriteTimeout, WriteError, ChunkError Read/write errors at client or target endpoints, timeouts, TLS/SSL errors, and chunked errors
System Undefined runtime errors
Memory OutOfMemory, GCOverLimit Memory-related failures
Thread RogueTaskTerminated Failures such as termination of run-away tasks
Policy Faults for each Policy type are defined in the Policy Reference.

An error is always accompanied by a text description of the reason for the failure. When the system raises a fault, a set of attributes are populated to assist in troubleshooting. A fault includes the following information:

  • Reason
  • User-defined custom attributes