About Cassandra replication factor and consistency level

About the Cassandra replication factor

Cassandra stores data replicas on multiple nodes to ensure reliability and fault tolerance. The replication strategy for each Edge keyspace determines the nodes where replicas are placed.

The total number of replicas for a keyspace across a Cassandra cluster is referred to as the keyspace's replication factor. A replication factor of one means that there is only one copy of each row in the Cassandra cluster. A replication factor of two means there are two copies of each row, where each copy is on a different node. All replicas are equally important; there is no primary or master replica.

In a production system with three or more Cassandra nodes in each data center, the default replication factor for an Edge keyspace is three. As a general rule, the replication factor should not exceed the number of Cassandra nodes in the cluster.
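
For illustration only, the following CQL shows how a keyspace with a replication factor of three in a single data center named dc-1 could be defined (the keyspace name is hypothetical; the Edge installer creates the actual Edge keyspaces for you):

  -- NetworkTopologyStrategy sets the replication factor per data center:
  -- each row in this keyspace is stored on three nodes in dc-1.
  CREATE KEYSPACE example_keyspace
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc-1': '3'};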

Use the following procedure to view the Cassandra schema, which shows the replication factor for each Edge keyspace:

  1. Log in to a Cassandra node.
  2. Run the following command:
    /opt/apigee/apigee-cassandra/bin/cqlsh $(hostname -i) [-u cassuser -p casspass] -e "select keyspace_name, replication from system_schema.keyspaces;"

    Where $(hostname -i) resolves to the IP address of the Cassandra node. Alternatively, you can replace $(hostname -i) with the node's IP address.

    cassuser: The Cassandra username. Required only if you have enabled Cassandra authentication; otherwise, omit it.

    casspass: The Cassandra password. Required only if you have enabled Cassandra authentication; otherwise, omit it.

You will see output similar to the following, where each row represents one keyspace:

  keyspace_name       | replication
  --------------------+---------------------------------------------------------------------------------
  kms                 | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3'}
  system_distributed  | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
  apprepo             | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3'}

You can see that for data center 1 (dc-1), the default replication factor for the kms keyspace is three in an installation with three Cassandra nodes. For certain keyspaces internal to Cassandra (such as system and system_schema), the replication strategy and replication factor may be different. This is intentional system behavior.

If you add additional Cassandra nodes to the cluster, the default replication factor is not affected.
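
For illustration only, a keyspace's replication factor is a property of the keyspace itself and can be changed with a CQL ALTER KEYSPACE statement (the keyspace name below is hypothetical; do not alter Edge keyspaces unless directed to by Apigee documentation or support):

  -- Sets the replication factor for data center dc-1 to three;
  -- existing data is not redistributed until you run nodetool repair.
  ALTER KEYSPACE example_keyspace
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc-1': '3'};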

About the Cassandra consistency level

The Cassandra consistency level is defined as the minimum number of Cassandra nodes that must acknowledge a read or write operation before the operation can be considered successful. Different consistency levels can be assigned to different Edge keyspaces.

When connecting to Cassandra for read and write operations, Message Processor and Management Server nodes typically use the Cassandra consistency level LOCAL_QUORUM for a keyspace. However, some keyspaces are defined to use a consistency level of one.
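
For illustration, you can experiment with consistency levels directly in a cqlsh session; this affects only that cqlsh session and does not change the consistency level used by the Edge components. The first command below reports the session's current consistency level, and the second sets the session to LOCAL_QUORUM:

  cqlsh> CONSISTENCY
  cqlsh> CONSISTENCY LOCAL_QUORUM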

The calculation of the value of LOCAL_QUORUM for a data center is:

LOCAL_QUORUM = (replication_factor/2) + 1

As described above, the default replication factor for an Edge production environment with three Cassandra nodes is three. Therefore, the default value of LOCAL_QUORUM = (3/2) + 1 = 2 (the result of the division is rounded down to an integer before adding 1).

With LOCAL_QUORUM = 2, at least two of the three Cassandra nodes in the data center must respond to a read/write operation for the operation to succeed. For a three-node Cassandra cluster, the cluster can therefore tolerate one node being down per data center.
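
As a quick sanity check of the formula, the following shell loop (shown only as an illustration) computes LOCAL_QUORUM for a range of replication factors; the shell's integer division rounds down, just as the formula requires:

  for rf in 1 2 3 4 5; do
    echo "replication_factor=$rf  LOCAL_QUORUM=$(( rf / 2 + 1 ))"
  done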

By specifying the consistency level as LOCAL_QUORUM, Edge avoids the latency of validating operations across multiple data centers. If a keyspace instead used the Cassandra QUORUM consistency level, read/write operations would have to be validated across all data centers.

To see the consistency level used by the Edge Message Processor or Management Server nodes:

  1. Log in to a Message Processor node.
  2. Change to the /opt/apigee/edge-message-processor/conf directory:
    cd /opt/apigee/edge-message-processor/conf
  3. For read and write consistency:
    grep -ri "consistency.level" *
  4. Log in to the Management Server node.
  5. Change to the /opt/apigee/edge-management-server/conf directory:
    cd /opt/apigee/edge-management-server/conf
  6. Repeat step 3.

If you add additional Cassandra nodes to the cluster, the consistency level is not affected.