Minimum cluster configurations

This topic describes minimum cluster configurations for Apigee hybrid. These minimum configurations apply to all of the supported Kubernetes platforms. The recommendations in this topic apply to non-production installations, such as trial or testing scenarios. Keep these recommendations in mind when performing the Apigee hybrid installation steps.

About node pools

A node pool is a group of nodes within a cluster that all have the same configuration. By default, hybrid assigns all pods to the default node pool; however, you can create dedicated node pools and assign hybrid components to them as a way of distributing resources.

Typically, you define dedicated node pools when you have pods with differing resource requirements. For example, the apigee-cassandra pods require persistent storage, while the other Apigee hybrid pods do not. For this reason, we recommend that you create a stateful node pool for Cassandra and a stateless node pool for the rest of the hybrid runtime services. See Configure dedicated node pools for details.

The following section lists configurations for both stateful and stateless node pools.

Minimum configurations

Use these minimum configurations when setting up your cluster:

Configuration | Stateful node pool | Stateless node pool
------------- | ------------------ | -------------------
Purpose | A stateful node pool used for the Cassandra database. | A stateless node pool used by the runtime message processor.
Label name | apigee-data | apigee-runtime
Number of nodes | 1 per zone (3 per region) | 1 per zone (3 per region)
CPU | 4 | 4
RAM | 15 GB | 15 GB
Storage | dynamic | Managed with the ApigeeDeployment CRD
Minimum disk IOPS | 2000 IOPS with SAN or directly attached storage. NFS is not recommended even if it can support the required IOPS. | 2000 IOPS with SAN or directly attached storage. NFS is not recommended even if it can support the required IOPS.
Network bandwidth for each machine instance type | 1 Gbps | 1 Gbps
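
As a quick way to sanity-check a planned cluster against these minimums, the following Python sketch compares a node pool description with the values in the table above. The NodePool class and check_minimums function are hypothetical helpers written for this illustration and are not part of Apigee hybrid. The sketch also assumes that the RAM figure is in GB.

```python
from dataclasses import dataclass

# Minimums copied from the table above; they are the same for both pools.
MIN_NODES_PER_REGION = 3
MIN_CPU = 4
MIN_RAM_GB = 15          # assumption: the table's RAM value is in GB
MIN_DISK_IOPS = 2000
MIN_BANDWIDTH_GBPS = 1.0
KNOWN_LABELS = ("apigee-data", "apigee-runtime")


@dataclass
class NodePool:
    """Hypothetical description of one node pool (not an Apigee type)."""
    label: str
    nodes_per_region: int
    cpu: int
    ram_gb: float
    disk_iops: int
    bandwidth_gbps: float


def check_minimums(pool: NodePool) -> list[str]:
    """Return a list of human-readable problems; empty if the pool passes."""
    problems = []
    if pool.label not in KNOWN_LABELS:
        problems.append(f"unexpected label {pool.label!r}")
    if pool.nodes_per_region < MIN_NODES_PER_REGION:
        problems.append("fewer than 3 nodes per region (1 per zone)")
    if pool.cpu < MIN_CPU:
        problems.append(f"CPU below {MIN_CPU}")
    if pool.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM below {MIN_RAM_GB} GB")
    if pool.disk_iops < MIN_DISK_IOPS:
        problems.append(f"disk IOPS below {MIN_DISK_IOPS}")
    if pool.bandwidth_gbps < MIN_BANDWIDTH_GBPS:
        problems.append(f"network bandwidth below {MIN_BANDWIDTH_GBPS} Gbps")
    return problems
```

For example, check_minimums(NodePool("apigee-data", 3, 4, 15, 2000, 1.0)) returns an empty list, while a pool with only two nodes per region returns one problem describing the shortfall.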

Cassandra network requirements

This section discusses network requirements and recommendations to follow when setting up Apigee hybrid.

Network bandwidth

Cassandra uses the Gossip protocol to exchange information with other nodes about network topology. The use of Gossip plus the distributed nature of Cassandra—which involves talking to multiple nodes for read and write operations—results in a lot of data transfer through the network.

Cassandra requires a minimum of 1 Gbps of network bandwidth for each machine instance. For example, on GKE, the minimum recommended machine type, e2-standard-4, has a minimum bandwidth of 1 Gbps. For production installations, higher bandwidth is recommended.

The maximum or 99th percentile latency for Cassandra should be below 100 milliseconds.
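
To illustrate the latency target, the following Python sketch computes a 99th percentile from a list of latency samples and compares it with the 100 millisecond limit. How you collect the samples is outside the scope of this sketch. The function names are hypothetical, and the nearest-rank percentile method is one common choice assumed here, not something Apigee hybrid prescribes.

```python
LATENCY_LIMIT_MS = 100.0  # limit stated in the paragraph above


def p99(samples_ms: list[float]) -> float:
    """99th percentile using the nearest-rank method (an assumed choice)."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    # rank = ceil(0.99 * n), computed with integers to avoid float rounding
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def latency_within_limit(samples_ms: list[float],
                         limit_ms: float = LATENCY_LIMIT_MS) -> bool:
    """True if the 99th percentile latency is below the limit."""
    return p99(samples_ms) < limit_ms
```

With 100 samples, the nearest-rank method returns the 99th smallest value, so a single outlier above 100 milliseconds does not fail the check but two outliers do.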

Secure network connectivity between regions

When installing hybrid in multiple regions, ensure that the connections between regions are secure:

  • Use a virtual private network solution, such as Google Virtual Private Cloud (VPC), to secure connectivity between regions.
  • Open a firewall to ensure that Cassandra nodes can connect between regions in non-overlapping subnets and can resolve those network IPs.
  • Always use port 7001 for Cassandra. All other ports are local to the region. See also Secure ports usage.
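
After opening the firewall, you might want to confirm that a Cassandra node in another region accepts TCP connections on port 7001. The following Python sketch is one way to do that with the standard library. The function names are hypothetical, and the host names you pass in are your own. A successful TCP connection shows only that the port is reachable; it does not verify that the connection is secure.

```python
import socket

CASSANDRA_INTER_REGION_PORT = 7001  # port named in the list above


def can_reach(host: str,
              port: int = CASSANDRA_INTER_REGION_PORT,
              timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def unreachable_peers(hosts: list[str],
                      port: int = CASSANDRA_INTER_REGION_PORT,
                      timeout_s: float = 3.0) -> list[str]:
    """Return the subset of hosts that did not accept a connection."""
    return [h for h in hosts if not can_reach(h, port, timeout_s)]
```

Run the check from a machine or pod in one region against the Cassandra node addresses in the other region. An empty result from unreachable_peers means every listed peer accepted a connection.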

Cassandra NTP requirements

Cassandra synchronizes data based on system timestamps. Ensure that the time is synchronized across all pods and all regions within the Cassandra cluster. Time differences between nodes or regions cause data inconsistencies.

Scaling the configuration

If you need to scale your initial configuration based on additional capacity or throughput needs, see the following topics: