503 Service Unavailable

Symptom

The client application receives an HTTP 503 status code with the message Service Unavailable in response to an API proxy call.

Error messages

You might see one of the following error messages:

HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable: Back-end server is at capacity

You might also see this error message in the HTTP response:

{
   "fault": {
      "faultstring": "The Service is temporarily unavailable", 
      "detail": {
           "errorcode": "messaging.adaptors.http.flow.ServiceUnavailable"
       }
    }
}
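
A client can tell this fault apart from other 503 responses by checking the errorcode field. The following is a minimal sketch, assuming the response body has the JSON layout shown above. The helper name is_service_unavailable_fault is hypothetical and is not part of any Apigee API.

```python
import json

# Error code copied from the fault payload shown above.
SERVICE_UNAVAILABLE_CODE = "messaging.adaptors.http.flow.ServiceUnavailable"


def is_service_unavailable_fault(status_code, body):
    """Return True if a response looks like the ServiceUnavailable fault.

    Hypothetical helper: assumes `body` is the raw response text and that
    the fault follows the JSON layout shown above.
    """
    if status_code != 503:
        return False
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Non-JSON body, for example a plain "Service Unavailable" message.
        return False
    if not isinstance(payload, dict):
        return False
    fault = payload.get("fault")
    if not isinstance(fault, dict):
        return False
    detail = fault.get("detail")
    if not isinstance(detail, dict):
        return False
    return detail.get("errorcode") == SERVICE_UNAVAILABLE_CODE
```

A plain-text 503 body makes the helper return False, which is a hint that the response may not have been generated as this particular fault.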

Possible causes

The HTTP status code 503 means that the server is currently unavailable. On Apigee Edge, this problem can occur on either the incoming (northbound) or the outgoing (southbound) connection. Most often, the error occurs because a server is too busy or is down for some reason, such as temporary maintenance. It can also occur if the TLS/SSL handshake fails between the client and the server.

Possible causes for the 503 Service Unavailable response are:

Cause | Description | Who can perform the troubleshooting steps
Overloaded Server | The server is overloaded and cannot handle any new incoming client requests. | Private and Public Cloud users
Connection Errors | Network or connectivity issues prevent the client from connecting to the server. | Private Cloud users only
SSL Handshake Failures | The TLS/SSL handshake failed between the client and server. (Troubleshooting for this class of problem is covered in a separate topic.) | Private and Public Cloud users

Overloaded server

The following error can occur when the server is overloaded or cannot handle any more requests:

HTTP/1.1 503 Service Unavailable: Back-end server is at capacity

Diagnosis

To diagnose this issue, first determine whether the error occurs on the incoming (northbound) or outgoing (southbound) connection. To learn how to make this determination, see Determining the source of the problem.

If the error is on the incoming (northbound) connection:

  • Private Cloud users: Check if the Average Load/CPU/Memory usage is high on the Edge Router.
  • Public Cloud users: You do not have access to the Edge Router. Contact Apigee Support for assistance.

If the error is on the outgoing (southbound) connection:

  • All users: Check if the Average Load/CPU/Memory usage is high on the backend server.
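
On a Linux host, one quick way to check the load figure mentioned above is to compare the load average with the number of CPU cores. The following is a minimal sketch, assuming you can run Python on the Router or backend host. The function names and the threshold of 1.0 load per core are illustrative assumptions, not Apigee guidance.

```python
import os


def load_is_high(load_1min, cpu_count, per_core_threshold=1.0):
    """Return True if the 1-minute load per core exceeds the threshold.

    The 1.0 default is an illustrative assumption; tune it for your hosts.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return (load_1min / cpu_count) > per_core_threshold


def local_host_load_is_high():
    """Check the host this script runs on (Unix-like systems only)."""
    one_minute_load = os.getloadavg()[0]
    cores = os.cpu_count() or 1
    return load_is_high(one_minute_load, cores)
```

This covers the load average only. CPU and memory usage still need to be checked with your usual monitoring tools.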

Resolution

If the Edge Router is overloaded:

  • Private Cloud users: Restart the Edge Router and then monitor its usage to see if the problem is resolved. If the problem persists, contact Apigee Support for assistance.
  • Public Cloud users: You do not have access to the Edge Router. Contact Apigee Support for assistance.

If the backend service is overloaded:

  • All users: Restart the appropriate backend server and then monitor it to see if the problem is resolved.
  • All users: If the problem persists, check if you need to increase the capacity of the appropriate backend server(s) and/or fix any issue with the backend server(s).

Connection errors

A connection error happens when an Apigee Edge Message Processor attempts to connect to a backend server and one of these problems occurs:

  • The Message Processor is unable to connect within the preset connection timeout period (default: 3 seconds).
  • The backend server refuses the connection.
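
The two failure modes above can be told apart with a plain TCP connection attempt. The following is a minimal sketch using Python's standard socket module. The function name classify_connect is hypothetical, and the 3-second default only mirrors the timeout quoted above; it does not read the Message Processor's actual setting.

```python
import socket


def classify_connect(host, port, timeout_seconds=3.0):
    """Attempt a TCP connection and report how it ended.

    Returns "connected", "timeout", "refused", or "error: <reason>".
    Hypothetical helper; the 3-second default mirrors the value above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return "connected"
    except socket.timeout:
        # No answer within the timeout period.
        return "timeout"
    except ConnectionRefusedError:
        # The remote host actively rejected the connection.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"
```

To approximate what the Message Processor experiences, run the check from a Message Processor host against the backend host and port.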

Diagnosis

  1. Check the Message Processor log (/opt/apigee/var/log/edge-message-processor/logs/system.log) for any of the following errors:
    1. An onConnectTimeout error indicates that the Message Processor was unable to connect to the backend server within the preset connection timeout period.
      2016-06-23 09:11:49,314 org:myorg env:prod api:Employees rev:1 messageid:mo-96cf6757a-9401-21-1 NIOThread@2 ERROR HTTP.CLIENT - HTTPClient$Context.onTimeout() : ClientChannel[C:]@10 useCount=1 bytesRead=0 bytesWritten=0 age=3001ms lastIO=3001ms .onConnectTimeout connectAddress=www.abc.com/11.11.11.11:80 resolvedAddress=www.abc.com/11.11.11.11 
      2016-06-23 09:11:49,333 org:myorg env:prod api:Employees rev:1 messageid:mo-96cf6757a-9401-21-1 NIOThread@2 ERROR ADAPTORS.HTTP.FLOW - RequestWriteListener.onTimeout() : RequestWriteListener.onTimeout(HTTPRequest@6b393600)
      
    2. A java.net.ConnectException: Connection refused error indicates the connection was refused by the backend server.
      14:40:16.531 +0530      
      2016-06-17 09:10:16,531 org:myorg env:prod api:www.abc.com rev:1 rrt07eadn-22739-40983870-15 NIOThread@2 ERROR HTTP.CLIENT - HTTPClient$Context.onConnectFailure() : connect to www.abc.com:11.11.11.11:443 failed with exception {} 
      java.net.ConnectException: Connection refused 
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_75] 
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) ~[na:1.7.0_75] 
      at com.apigee.nio.ClientChannel.finishConnect(ClientChannel.java:121) ~[nio-1.0.0.jar:na] 
      at com.apigee.nio.handlers.NIOThread.run(NIOThread.java:108) ~[nio-1.0.0.jar:na]
      
  2. Check if you are able to connect to the specific backend server directly from each of the Message Processors using the telnet command:
    1. If the backend server resolves to a single IP address, then use the following command:
      telnet BackendServer-IPaddress 443
      
    2. If the backend server resolves to multiple IP addresses, then use the hostname of the backend server in the telnet command as shown below:
      telnet BackendServer-HostName 443
      
  3. If you are able to connect to the backend server, then you might see a message like Connected to backend-server. If you are unable to connect to the backend server, this might be because the Message Processors' IP addresses are not whitelisted on the specific backend server.
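
The log check in step 1 can be scripted when the log is large. The following is a minimal sketch that searches for the two markers quoted above (onConnectTimeout and java.net.ConnectException: Connection refused). The function names are hypothetical, and the log path passed to scan_log_file is whatever path applies on your host.

```python
def find_connection_errors(log_lines):
    """Group log lines by the two connection error markers from step 1.

    Returns a dict with "connect_timeout" and "connection_refused" lists.
    """
    markers = {
        "connect_timeout": "onConnectTimeout",
        "connection_refused": "java.net.ConnectException: Connection refused",
    }
    found = {name: [] for name in markers}
    for line in log_lines:
        for name, marker in markers.items():
            if marker in line:
                found[name].append(line.rstrip("\n"))
    return found


def scan_log_file(path):
    """Run the search over a log file; undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_connection_errors(handle)
```

For example, scan_log_file("/opt/apigee/var/log/edge-message-processor/logs/system.log") returns the matching lines grouped by error type, assuming the script can read that file.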

Resolution

Whitelist the Message Processors' IP addresses on the specific backend server to allow traffic from the Edge Message Processors to your backend server. For example, on Linux, you could use iptables on the backend server to allow traffic from the Message Processors' IP addresses.

If the problem persists, work with your network administrator to identify and fix the issue. If you need further assistance, contact Apigee Support.

SSL Handshake Failures

An entire troubleshooting playbook is devoted to TLS/SSL handshake errors. See SSL Handshake Failures.

Determining the source of the problem

Certain types of errors can occur either on the incoming (northbound) or outgoing (southbound) connection. An incoming (northbound) error occurs between the client application and Edge. An outgoing (southbound) error occurs between Edge and the backend target server. To diagnose these kinds of problems, your first job is to figure out whether the error occurs on the northbound or southbound connection.

Understanding northbound and southbound connections

In Edge, you can encounter a 503 Service Unavailable error on either the incoming or outgoing connection:

  • Incoming (or northbound) connection - The connection between the client application and the Edge Router. The Router is the component of Apigee Edge that handles incoming requests made to the system.
  • Outgoing (or southbound) connection - The connection between the Edge Message Processor and the backend server. The Message Processor is a component of Apigee Edge that proxies API requests to backend target servers.

If you are an Edge Public Cloud user, you are probably unaware of internal components such as the Router or the Message Processor. These internal components are not visible or accessible to Public Cloud users. Where possible, we provide alternative ways to investigate the problem that do not require direct access to these components.

The following figure illustrates northbound and southbound connections for Apigee Edge.

Determining where the 503 Service Unavailable error occurred

Use one of the following procedures to determine if the 503 Service Unavailable error occurred at the northbound or southbound connection.

Procedure 1: Using UI Trace (For all users)

This procedure can be performed by Public or Private Cloud users:

  1. If the issue is still active, enable the UI trace for the affected API.
  2. If the UI trace for the failing API request shows that the 503 Service Unavailable error occurs during the target request flow or is sent by the backend server, then the issue is southbound (that is, between the Message Processor and the backend server).
  3. If you don't get the trace for the specific API call, then the issue is northbound, between the client application and the Router.

Procedure 2: Using API Monitoring (For Apigee Cloud users only)

If you are a Private Cloud user, skip this procedure.

API Monitoring enables you to isolate problem areas quickly to diagnose error, performance, and latency issues and their source, such as developer apps, API proxies, backend targets, or the API platform.

Step through a sample scenario that demonstrates how to troubleshoot 5xx issues with your APIs using API Monitoring. For example, you may want to set up an alert to be notified when the number of messaging.adaptors.http.flow.ServiceUnavailable faults exceeds a particular threshold.

Procedure 3: Using Nginx Access Logs (For Apigee Private Cloud users only)

If you are a Public Cloud user, skip this procedure.

If the issue has happened in the past or if the issue is intermittent and you are unable to capture the trace, then perform the following steps:

  1. Check the Nginx access logs (/opt/apigee/var/log/edge-router/nginx/org-env.port_access_log).
  2. Search for any 503 errors for the specific API proxy.
  3. If you can identify any 503 errors for the specific API at the specific time, then the issue occurred at the southbound connection (between the Message Processor and the backend server).
  4. If not, then the issue occurred at the northbound connection (between the client application and the Router).
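
Steps 2 through 4 can be sketched as a small script. The field layout of the access log is not shown in this document, so the sketch below makes three assumptions: fields are separated by whitespace, the status code appears as a standalone 503 field, and the proxy can be recognized by a substring such as its base path. The function name locate_503_source is hypothetical, and the sketch does not filter by time, so pass in only the lines from the time window you are investigating.

```python
def locate_503_source(access_log_lines, proxy_marker):
    """Apply steps 2-4 above to an iterable of access log lines.

    Assumptions (not confirmed by this document): fields are separated by
    whitespace, the status code appears as a standalone "503" field, and
    `proxy_marker` (for example the proxy base path) appears in the line.
    """
    for line in access_log_lines:
        if proxy_marker in line and "503" in line.split():
            # A 503 was logged for this proxy: southbound issue.
            return "southbound"
    # No 503 logged for this proxy: northbound issue.
    return "northbound"
```

If your access log uses a different layout, adjust the matching condition before relying on the result.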