503 Service Unavailable - Premature closure by backend server

Symptom

The client application gets an HTTP response status 503 with the message Service Unavailable following an API proxy call.

Error message

The client application gets the following response code:

HTTP/1.1 503 Service Unavailable

In addition, you may observe the following error message:

{
   "fault": {
      "faultstring": "The Service is temporarily unavailable",
      "detail": {
           "errorcode": "messaging.adaptors.http.flow.ServiceUnavailable"
       }
    }
}
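
As an illustration only, a client or monitoring script could recognize this specific fault by checking both the status code and the errorcode field in the response body. The following is a minimal Python sketch, not part of Apigee; the function name is_service_unavailable_fault is hypothetical, and it assumes the body has the JSON shape shown above.

```python
import json

# Fault code taken from the sample error message above.
FAULT_CODE = "messaging.adaptors.http.flow.ServiceUnavailable"


def is_service_unavailable_fault(status_code, body):
    """Return True if the response is a 503 carrying the fault code above.

    Hypothetical helper: assumes `body` is the response body as text.
    """
    if status_code != 503:
        return False
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # Body is missing or not valid JSON, so the fault cannot be confirmed.
        return False
    if not isinstance(payload, dict):
        return False
    fault = payload.get("fault")
    detail = fault.get("detail") if isinstance(fault, dict) else None
    return isinstance(detail, dict) and detail.get("errorcode") == FAULT_CODE
```

A 503 that does not carry this errorcode may have a different cause and would need a different playbook.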

Possible Causes

Cause | Description | Troubleshooting instructions applicable for
Target server prematurely closes connection | The target server prematurely ends the connection while the Message Processor is still sending the request payload. | Edge Public and Private Cloud users

Common diagnosis steps

Determine the Message ID of the failing request

Trace tool

To determine the message ID of the failing request using the Trace tool:

  1. If the issue is still active, enable the trace session for the affected API.
  2. Make the API call and reproduce the issue (503 Service Unavailable with error code messaging.adaptors.http.flow.ServiceUnavailable).
  3. Select one of the failing requests.
  4. Navigate to the AX phase, and determine the message ID (X-Apigee.Message-ID) of the request by scrolling down in the Phase Details section as shown in the following figure.

    Figure: Message ID in Phase Details section

NGINX access logs

You can also use the NGINX access logs to determine the message ID for the 503 errors. This is particularly useful if the issue occurred in the past, or if it is intermittent and you are unable to capture a trace in the UI. To determine the message ID of the failing request using the NGINX access logs:

  1. Check the NGINX access logs (/opt/apigee/var/log/edge-router/nginx/ORG~ENV.PORT#_access_log).
  2. Search for 503 errors for the specific API proxy during the relevant time period (if the problem happened in the past), or check whether any requests are still failing with 503.
  3. If there are any 503 errors with X-Apigee-fault-code messaging.adaptors.http.flow.ServiceUnavailable, note the message ID of one or more such requests, as shown in the following example:

    Figure: Sample entry for the 503 error, showing status code, message ID, fault source, and fault code
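
The search in steps 2 and 3 can be scripted. The following Python sketch is illustrative only: the function name matching_entries is hypothetical, and it assumes only that each access log entry is a single line in which the status code 503 appears as its own whitespace- or tab-delimited token and the fault code appears as text. It does not assume any particular field order, so the message ID still has to be read from the matching lines.

```python
FAULT_CODE = "messaging.adaptors.http.flow.ServiceUnavailable"


def matching_entries(lines):
    """Yield access log lines that mention status 503 and the fault code."""
    for line in lines:
        # Assumption: the status code is a separate whitespace-delimited
        # token. Adjust this check if your log format differs.
        if FAULT_CODE in line and "503" in line.split():
            yield line.rstrip("\n")
```

For example, the function can be given an open file object for the access log, and each yielded line inspected for the message ID.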

Cause: Target server prematurely closes connection

Diagnosis

  1. If you are a Public Cloud or Private Cloud user:
    1. Use the Trace tool (as explained in Common diagnosis steps) and verify that you have both of the following set in the Analytics Data Recorded pane:
      • X-Apigee.fault-code: messaging.adaptors.http.flow.ServiceUnavailable
      • X-Apigee.fault-source: target

    2. Use the Trace tool (as explained in Common diagnosis steps) and verify that you have both of the following set in the Error pane immediately after the TARGET_REQ_FLOW state property:
      • error.class: com.apigee.errors.http.server.ServiceUnavailableException
      • error.cause: Broken pipe

    3. Go to Using tcpdump for further investigation.
  2. If you are a Private Cloud user:
    • Determine the message ID of the failing request.
    • Search for the message ID in the Message Processor log (/opt/apigee/var/log/edge-message-processor/logs/system.log).
    • You will see one of the following exceptions:

      Exception #1: java.io.IOException: Broken pipe occurred while writing to channel ClientOutputChannel

      2021-01-30 15:31:14,693 org:anotherorg env:prod api:myproxy
      rev:1 messageid:myorg-opdk-test-1-30312-13747-1  NIOThread@1
      INFO  HTTP.SERVICE - ExceptionHandler.handleException() :
      Exception java.io.IOException: Broken pipe occurred while writing to channel
      ClientOutputChannel(ClientChannel[Connected:
      Remote:IP:PORT Local:0.0.0.0:42828]@8380 useCount=1
      bytesRead=0 bytesWritten=76295 age=2012ms  lastIO=2ms  isOpen=false)

      or

      Exception #2: onExceptionWrite exception: {}
      java.io.IOException: Broken pipe

      2021-01-31 15:29:37,438 org:anotherorg env:prod api:503-test
      rev:1 messageid:leonyoung-opdk-test-1-18604-13978-1
      NIOThread@0 ERROR HTTP.CLIENT - HTTPClient$Context$2.onException() :
      ClientChannel[Connected: Remote:IP:PORT
      Local:0.0.0.0:57880]@8569 useCount=1 bytesRead=0 bytesWritten=76295 age=3180ms  lastIO=2ms
      isOpen=false.onExceptionWrite exception: {}
      java.io.IOException: Broken pipe
    • Both of these exceptions indicate that the backend server closed the connection prematurely while the Message Processor was still writing the request payload to it. As a result, the Message Processor throws the exception java.io.IOException: Broken pipe.
    • The Remote:IP:PORT indicates the resolved backend server IP address and port number.
    • The attribute bytesWritten=76295 in the above error message indicates that the Message Processor had sent a payload of 76295 bytes to the backend server when the connection was closed prematurely.
    • The attribute bytesRead=0 indicates that the Message Processor had not received any data (response) from the backend server.
    • To investigate this issue further, gather a tcpdump on either the backend server or the Message Processor and analyze it as explained in Using tcpdump.
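
The checks described above (Broken pipe present, bytesRead=0, bytesWritten greater than zero) can be automated for a log entry. The following Python sketch is an illustration under stated assumptions, not an Apigee tool: the function name summarize_mp_entry and the keys of the returned dictionary are hypothetical, and the patterns are based only on the sample entries shown above.

```python
import re


def summarize_mp_entry(entry):
    """Extract the fields discussed above from one Message Processor log entry.

    `entry` is the full text of a single entry; it may span several lines.
    """
    message_id = re.search(r"messageid:(\S+)", entry)
    bytes_read = re.search(r"bytesRead=(\d+)", entry)
    bytes_written = re.search(r"bytesWritten=(\d+)", entry)
    summary = {
        "message_id": message_id.group(1) if message_id else None,
        "bytes_read": int(bytes_read.group(1)) if bytes_read else None,
        "bytes_written": int(bytes_written.group(1)) if bytes_written else None,
        "broken_pipe": "Broken pipe" in entry,
    }
    # Pattern from this playbook: a write failed with Broken pipe after some
    # request bytes were sent and before any response bytes were received.
    summary["premature_close_suspected"] = (
        summary["broken_pipe"]
        and summary["bytes_read"] == 0
        and (summary["bytes_written"] or 0) > 0
    )
    return summary
```

A positive result only suggests the pattern; the tcpdump analysis is still needed to confirm which side closed the connection.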

Using tcpdump

  1. Capture a tcpdump on either the backend server or the Message Processor with the following commands:

    Command to gather tcpdump on the backend server:

    tcpdump -i any -s 0 host MP_IP_ADDRESS -w FILE_NAME
    

    Command to gather tcpdump on the Message Processor:

    tcpdump -i any -s 0 host BACKEND_HOSTNAME -w FILE_NAME
    
  2. Analyze the tcpdump captured:

    Sample tcpdump output (gathered on the Message Processor):

    In the above tcpdump, you can see the following:

    1. In packet 4, the Message Processor sent a POST request to the backend server.
    2. In packets 5, 8, 9, 10, and 11, the Message Processor continued to send the request payload to the backend server.
    3. In packets 6 and 7, the backend server responded with an ACK for a part of the request payload received from the Message Processor.
    4. However, in packet 12, instead of acknowledging the received application data packets and then sending the response payload, the backend server responds with a FIN, ACK, initiating the closure of the connection.
    5. This clearly shows that the backend server closed the connection prematurely while the Message Processor was still sending the request payload.
    6. This causes the Message Processor to record an IOException: Broken pipe error and return a 503 to the client.
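
The reasoning in the steps above can be expressed as a small check over an already-decoded packet list. The following Python sketch is illustrative only and does not read capture files: the Packet type, its field names, and the function find_premature_close are all hypothetical, and it assumes you have already extracted the sender, TCP flags, and payload length for each packet (for example, by reading them off the capture in a packet analyzer).

```python
from dataclasses import dataclass


@dataclass
class Packet:
    number: int          # packet number as shown in the capture
    sender: str          # "mp" (Message Processor) or "backend"
    flags: frozenset     # TCP flags, for example frozenset({"FIN", "ACK"})
    payload_len: int     # TCP payload size in bytes


def find_premature_close(packets):
    """Return the number of the packet in which the backend closes early.

    "Early" here means the backend sends FIN or RST after the Message
    Processor has started sending payload and before the backend has sent
    any payload of its own. Returns None if that pattern is not found.
    """
    mp_sent_payload = False
    for packet in packets:
        if packet.sender == "mp" and packet.payload_len > 0:
            mp_sent_payload = True
        elif packet.sender == "backend":
            if packet.payload_len > 0:
                # The backend started responding, so this is a different case.
                return None
            if mp_sent_payload and packet.flags & {"FIN", "RST"}:
                return packet.number
    return None
```

RST is included alongside FIN because the resolution steps also mention the backend resetting the connection; drop it from the check if you only want to match the FIN, ACK case described above.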

Resolution

  1. Work with your application team, your networking team, or both to analyze and fix the premature disconnections on the backend server side.
  2. Ensure that the backend server application is not timing out or resetting the connection before receiving the entire request payload.
  3. If you have any intermediary networking device or layer between Apigee and the backend server, ensure that it is not timing out before the entire request payload is received.

If the problem still persists, go to Must gather diagnostic information.

Must gather diagnostic information

If the problem persists even after following the above instructions, gather the following diagnostic information and then contact Apigee Edge Support:

If you are a Public Cloud user, provide the following information:

  • Organization name
  • Environment name
  • API Proxy name
  • Complete curl command to reproduce the 503 error
  • Trace file containing the request with the 503 Service Unavailable error
  • If the 503 errors are not occurring currently, provide the time period with the timezone information when 503 errors occurred in the past.

If you are a Private Cloud user, provide the following information:

  • Complete error message observed for the failing requests
  • Organization name, environment name, and API proxy name for which you are observing 503 errors
  • API Proxy bundle
  • Trace file containing the requests with 503 Service Unavailable error
  • NGINX access logs
    /opt/apigee/var/log/edge-router/nginx/ORG~ENV.PORT#_access_log
  • Message Processor logs
    /opt/apigee/var/log/edge-message-processor/logs/system.log
  • The time period with the timezone information when the 503 errors occurred
  • Tcpdumps gathered on the Message Processors and backend server when the error occurred