ZooKeeper Data Issues

Symptom

Data-related issues, commonly referred to as wiring issues, can manifest as one or more of the following symptoms:

  • Failures during startup of Management servers
  • Deployment failures
  • Datastore errors on the UI
  • Cross data center connectivity issues among Message Processors and Management servers
  • Analytics showing no data

These issues are not caused by the ZooKeeper infrastructure itself, but by invalid data stored in the ZooKeeper tree.

Possible causes

The typical causes for this issue are:

  1. Nodes were wired to the wrong region or pod name during installation because of mistakes in the silent installation file.
  2. A failed installation of a component created duplicate registrations when the component was reinstalled multiple times. In this case, cleanup is required to remove the registrations that have the wrong UUIDs.

Diagnosis

To diagnose, gather the following data:

  1. Topology diagram, with the hostname and IP address of each node and the Apigee components that exist on the node. A mapping like the following, based on the profile used for the Apigee install, is most helpful:
    DC-1
    DS: ip1 hostname
    DS: ip2 hostname
    DS: ip3 hostname
    MS: ip4 hostname
    RMP: ip5 hostname
    RMP: ip6 hostname
    SAX: ip7 hostname
    
    DC-2
    DS: ip8 hostname
    DS: ip9 hostname
    DS: ip10 hostname
    MS: ip11 hostname
    RMP: ip12 hostname
    RMP: ip13 hostname
    SAX: ip14 hostname
    
  2. Generate ZooKeeper tree output to check the wiring:
    /opt/apigee/apigee-zookeeper/contrib/zk-tree.sh > zk-tree-output.txt
    
  3. To make the data in the ZooKeeper tree easier to verify, run the following management API calls to get the list of server UUIDs in each of the data centers:

    Gateway Servers

    curl -u sysadmin@email.com "http://management-server-host:8080/v1/servers?pod=gateway&region=region-name"
    

    Central Servers

    curl -u sysadmin@email.com "http://management-server-host:8080/v1/servers?pod=central&region=region-name"
    

    Analytics Servers

    curl -u sysadmin@email.com "http://management-server-host:8080/v1/servers?pod=analytics&region=region-name"
    
  4. Check the UUIDs on each component and make sure they match what you see in the ZooKeeper tree:

    Router

    curl 0:8081/v1/servers/self/uuid
    

    Message Processor

    curl 0:8082/v1/servers/self/uuid
    

    Qpid Agent

    curl 0:8083/v1/servers/self/uuid
    

    Postgres Agent

    curl 0:8084/v1/servers/self/uuid
    
  5. Use the UUID data to search the ZooKeeper tree output generated in step 2 to validate the component wiring and to identify and remove any duplicate registrations for the component that have the wrong UUIDs.
  6. Use the management API calls listed here for correcting datastore registration. Components such as Routers, Message Processors, Postgres, and Qpid self-register with ZooKeeper during startup.
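
The UUID cross-check in steps 4 and 5 can be sketched as a few shell helpers. This is a minimal sketch, not an Apigee-provided tool: the function names are hypothetical, and it assumes that UUIDs appear in the standard 8-4-4-4-12 hexadecimal form in both the API responses and zk-tree-output.txt. Any UUID it reports is only a candidate for closer inspection, not proof of bad wiring.

```shell
#!/usr/bin/env bash
# Sketch: cross-check component UUIDs against the saved ZooKeeper tree output.
# Assumption: UUIDs use the standard 8-4-4-4-12 hex form everywhere.

UUID_RE='[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}'

# Print each distinct UUID found on stdin, one per line.
extract_uuids() {
  grep -Eo "$UUID_RE" | sort -u
}

# Print the number of lines in tree file $1 that mention UUID $2.
count_in_tree() {
  grep -c -F -- "$2" "$1" || true
}

# Read UUIDs from stdin and report those that never appear in tree file $1.
report_missing() {
  local tree="$1" uuid
  while read -r uuid; do
    if [ "$(count_in_tree "$tree" "$uuid")" -eq 0 ]; then
      echo "NOT IN TREE: $uuid"
    fi
  done
}

# Print UUIDs listed in file $1 (registered) that are absent from file $2
# (UUIDs reported by running components). These are candidates for stale
# or duplicate registrations.
unmatched_registrations() {
  grep -v -x -F -f "$2" "$1" || true
}

# Example, run on a Message Processor node (uncomment to use):
# curl -s 0:8082/v1/servers/self/uuid | extract_uuids | report_missing zk-tree-output.txt
```

Extracting UUIDs by pattern rather than by field name keeps the sketch independent of the exact response format of the management API.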

Resolution

ZooKeeper data-related problems need to be addressed on a case-by-case basis. Data in ZooKeeper is based on the Apigee Edge topology and varies by use case. If you are experiencing any of the symptoms listed above, collect the data as explained in the previous section and contact Apigee Support.
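
Collecting the per-pod server lists before contacting support can be scripted. The following is a rough sketch that reuses the management API calls from the Diagnosis section. The function names, the MGMT_HOST and ADMIN_USER variables, and the output file names are hypothetical placeholders, and the region name must match your own installation.

```shell
#!/usr/bin/env bash
# Sketch: save the server list of every pod in one region to files.
# MGMT_HOST and ADMIN_USER are placeholders; override them in the environment.

MGMT_HOST="${MGMT_HOST:-management-server-host}"
ADMIN_USER="${ADMIN_USER:-sysadmin@email.com}"

# Build the servers URL for pod $1 in region $2.
servers_url() {
  printf 'http://%s:8080/v1/servers?pod=%s&region=%s\n' "$MGMT_HOST" "$1" "$2"
}

# Fetch the server list for each pod in region $1 into directory $2.
collect_region() {
  local region="$1" outdir="$2" pod
  mkdir -p "$outdir"
  for pod in gateway central analytics; do
    # curl prompts for the password because -u is given only a user name.
    curl -s -u "$ADMIN_USER" "$(servers_url "$pod" "$region")" \
      > "$outdir/servers-$region-$pod.txt"
  done
}

# Example (uncomment to use; region-name is a placeholder):
# collect_region region-name ./zk-diagnostics
```

Run collect_region once per data center so that the saved files line up with the DC-1 and DC-2 entries in the topology mapping.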