Enabling/disabling Message Processor/Router reachability

It is a good practice to disable reachability on a server during maintenance, such as for a server restart or upgrade. When reachability is disabled, no traffic is directed to the server. For example, when reachability is disabled on a Message Processor, Routers will not direct any traffic to that Message Processor.

For example, to upgrade a Message Processor, you can use the following procedure:

  1. Disable reachability on the Message Processor.
  2. Upgrade the Message Processor.
  3. Enable reachability on the Message Processor.
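
The procedure above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not an official tool: the function names, the DRY_RUN variable, and the upgrade_message_processor placeholder are hypothetical and exist only for illustration. Only the apigee-service path and its stop, start, and wait_for_ready actions come from this page.

```shell
#!/usr/bin/env bash
# Sketch: take a Message Processor out of rotation, upgrade it, bring it back.
# The apigee-service path and the stop/start/wait_for_ready actions come from
# this page; the function names and DRY_RUN are hypothetical.

APIGEE_SERVICE="${APIGEE_SERVICE:-/opt/apigee/apigee-service/bin/apigee-service}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Placeholder (hypothetical): replace with the upgrade steps for your release.
upgrade_message_processor() {
  echo "upgrade step goes here"
}

# Stop the Message Processor, upgrade it, then start it and wait until ready.
maintain_message_processor() {
  run "$APIGEE_SERVICE" edge-message-processor stop || return 1
  upgrade_message_processor || return 1
  run "$APIGEE_SERVICE" edge-message-processor start || return 1
  run "$APIGEE_SERVICE" edge-message-processor wait_for_ready
}
```

Calling maintain_message_processor with DRY_RUN=1 prints the apigee-service commands instead of running them, which can be useful for reviewing the sequence before a maintenance window.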

Disabling/enabling reachability on a Message Processor

To disable reachability on a Message Processor, stop the Message Processor:

/opt/apigee/apigee-service/bin/apigee-service edge-message-processor stop

The Message Processor first processes any pending messages before it shuts down. Any new requests are routed to other available Message Processors.

To restart the Message Processor, use the following commands:

/opt/apigee/apigee-service/bin/apigee-service edge-message-processor start
/opt/apigee/apigee-service/bin/apigee-service edge-message-processor wait_for_ready

The wait_for_ready command returns the following message when the Message Processor is ready to process messages:

Checking if message-processor is up: message-processor is up.

Disabling/enabling reachability on a Router

In a production environment, you typically have a load balancer in front of the Edge Routers. Load balancers monitor port 15999 on the Routers to ensure that the Router is available.

Configure the load balancer to perform an HTTP or TCP health check on the Router using the following URL:

http://router_IP:15999/v1/servers/self/reachable

This URL returns an HTTP 200 response code if the Router is reachable.
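
To try the same check by hand, a small curl wrapper along the following lines may help. It is only a sketch: the function name router_reachable and the 5-second timeout are assumptions made for illustration, while the URL and the HTTP 200 expectation come from the text above.

```shell
# Sketch: succeed if the Router reports itself reachable on port 15999.
# The function name and the 5-second timeout are illustrative assumptions.
router_reachable() {
  router_ip="$1"
  # -s: silent, -o /dev/null: discard the body, -m 5: give up after 5 seconds,
  # -w '%{http_code}': print only the HTTP status code.
  status="$(curl -s -o /dev/null -m 5 -w '%{http_code}' \
    "http://${router_ip}:15999/v1/servers/self/reachable")"
  [ "$status" = "200" ]
}
```

For example, `router_reachable router_IP && echo reachable` prints "reachable" only when the Router answers with HTTP 200.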

To make a Router unreachable, you can block port 15999 on the Router. If the load balancer cannot access the Router on port 15999, it no longer forwards requests to the Router. For example, you can block the port by using the following iptables command on the Router node:

sudo iptables -A INPUT -i eth0 -p tcp --dport 15999 -j REJECT

To later make the Router available, flush iptables:

sudo iptables -F

Flushing removes all rules in the default (filter) table, not only the rule for port 15999. If you use iptables to manage other ports on the node, take that into account before you flush iptables or block port 15999. In that case, use the -D option to delete only the rule that you added:

sudo iptables -D INPUT -i eth0 -p tcp --dport 15999 -j REJECT
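
The block and unblock commands above can be wrapped so that they are safe to run more than once. The following is a sketch only: the function names and the IFACE and PORT variables are assumptions made for illustration, and it assumes an iptables version that supports the -C (check) option.

```shell
# Sketch: block or unblock the Router health-check port without touching
# other rules. Function names and the IFACE/PORT variables are illustrative
# assumptions; eth0 and 15999 are taken from the commands above.
IFACE="${IFACE:-eth0}"
PORT="${PORT:-15999}"

# Add the REJECT rule only if it is not already present (-C checks for a rule).
block_router_port() {
  sudo iptables -C INPUT -i "$IFACE" -p tcp --dport "$PORT" -j REJECT 2>/dev/null ||
    sudo iptables -A INPUT -i "$IFACE" -p tcp --dport "$PORT" -j REJECT
}

# Delete every copy of the REJECT rule, leaving all other rules in place.
unblock_router_port() {
  while sudo iptables -C INPUT -i "$IFACE" -p tcp --dport "$PORT" -j REJECT 2>/dev/null; do
    sudo iptables -D INPUT -i "$IFACE" -p tcp --dport "$PORT" -j REJECT || return 1
  done
}
```

Checking with -C before appending avoids stacking duplicate rules, so a single unblock_router_port call restores the previous state.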

Perform Router health checks

You can perform the following types of health checks on Routers:

  • Liveness: A signal to the monitoring subsystem that it can restart a component. For example:

    To check a Router's liveness:
    http://router_IP:8081/v1/servers/self/up

    To check liveness from a load balancer:
    http://router_IP:15999/v1/servers/self/reachable
  • Readiness: Determines whether a Router can process customer requests for a particular environment. For example:

    To check the availability of both a Router and its Message Processor pool:
    http://router_IP:15999/{org}__{env}

    Do not mix the two types of check. A component can be up before it is ready, and requests sent to it during that interval can be lost. For example, after a Message Processor starts, it takes time to instantiate all proxy definitions. Until the Message Processor is ready, do not add it to the Router's load balancing pool.

    A true readiness check for a Router should correspond to the state in which the Router is ready to serve requests. The most important precondition is that at least one Message Processor is ready, so the check cascades to the Message Processor liveness and readiness probes. Such a check is currently not available, which is why legitimate requests can be lost when, for example, all Message Processors in one data center are down.

    To get the status of a Router, make a request to port 8081 on the Router:

    curl -v http://router_IP:8081/v1/servers/self/up

    If the Router is up, the request returns "true" in the response and HTTP 200. Note that this call only checks if the Router is up and running. Control of the Router's reachability from a load balancer is determined by port 15999.

    To get the status of a Message Processor:

    curl http://Message_Processor_IP:8082/v1/servers/self/up
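
When scripting maintenance, the status calls above can be polled until the component answers. The sketch below makes several assumptions: the function name wait_until_up, the default of 30 attempts, and the SLEEP_SECONDS variable are hypothetical. It checks liveness only (the /v1/servers/self/up endpoint), not readiness.

```shell
# Sketch: poll a component's /v1/servers/self/up endpoint until it answers
# with a successful HTTP status or the attempts run out.
# Ports 8081 (Router) and 8082 (Message Processor) come from this page; the
# function name, attempt count, and SLEEP_SECONDS are illustrative assumptions.
wait_until_up() {
  host="$1"
  port="$2"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors as well as connection failures.
    if curl -s -f -o /dev/null "http://${host}:${port}/v1/servers/self/up"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${SLEEP_SECONDS:-2}"
  done
  return 1
}
```

For example, `wait_until_up Message_Processor_IP 8082 10` returns success as soon as the Message Processor responds, and returns failure after ten unsuccessful attempts.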