Apache Cassandra maintenance tasks

This section describes periodic maintenance tasks for Cassandra.

Anti-entropy maintenance

The Apache Cassandra ring nodes require periodic maintenance to ensure consistency across all nodes. To perform this maintenance, use the following command:

apigee-service apigee-cassandra apigee_repair -pr

Apigee recommends the following when running this command:

  • Run on every Cassandra node (across all regions or data centers).
  • Run on one node at a time, to ensure consistency across all nodes in the ring. Running repair jobs on multiple nodes at the same time can impair the health of Cassandra.

    To check whether a repair job on a node has completed successfully, look in the node's system.log file for an entry with the latest repair session's UUID and the phrase "session completed successfully." Here is a sample log entry:

    INFO [AntiEntropySessions:1] 2015-03-01 10:02:56,245 RepairSession.java (line 282) [repair #2e7009b0-c03d-11e4-9012-99a64119c9d8] session completed successfully
    Ref: https://support.datastax.com/hc/en-us/articles/204226329-How-to-check-if-a-scheduled-nodetool-repair-ran-successfully
  • Run during periods of relatively low workload (the tool imposes a significant load on the system).
  • Run at least every seven days in order to eliminate problems related to Cassandra's "forgotten deletes".
  • Run on different nodes on different days, or schedule it so that there are several hours between running it on each node.
  • Use the -pr option (partitioner range) so that the command repairs only the node's primary partitioner range.
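
The log check described in the list above can be scripted. The following is a minimal sketch, not part of apigee-service: repair_succeeded is a hypothetical helper name, and the log path in the example assumes that system.log lives in the default Cassandra log directory. The match ignores case as a precaution, because the exact wording of the entry may differ between Cassandra versions.

```shell
#!/bin/sh
# Sketch: report whether a given repair session finished successfully.
# repair_succeeded is a hypothetical helper, not part of apigee-service.

# Usage: repair_succeeded LOGFILE SESSION_UUID
# Exit status is 0 if the log contains the success entry for that session.
repair_succeeded() {
    # -F treats the pattern as a fixed string, so the brackets are literal.
    # -i ignores case; -q suppresses output and only sets the exit status.
    grep -qiF "[repair #$2] session completed successfully" "$1"
}

# Example call (log path is an assumption based on the default log directory):
# repair_succeeded /opt/apigee/var/log/cassandra/system.log \
#     2e7009b0-c03d-11e4-9012-99a64119c9d8 && echo "repair OK"
```

Because the helper only sets an exit status, it can gate the start of the repair on the next node in a wrapper script.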

If you enabled JMX authentication for Cassandra, you must include the username and password when you run the repair. For example:

apigee-service apigee-cassandra apigee_repair -u username -pw password -pr

You can also run the following command to check the supported options of apigee_repair:

apigee-service apigee-cassandra apigee_repair -h

Note: apigee_repair is a wrapper around Cassandra's nodetool repair. The wrapper performs additional checks before it runs the repair.
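
One way to follow the scheduling recommendations (weekly runs, one node at a time, different nodes on different days) is a cron entry per node. The sketch below only prints candidate crontab lines. It is an illustration under stated assumptions: repair_cron_line and the node index are hypothetical, the start times are arbitrary and should be moved to your own low-workload window, and a six-hour offset between nodes that share a weekday is a guess that does not guarantee the earlier repair has finished.

```shell
#!/bin/sh
# Sketch: print a crontab line that runs the repair weekly on one node.
# repair_cron_line and the node index are hypothetical, for illustration.

# Usage: repair_cron_line NODE_INDEX   (0 for the first node, 1 for the next, ...)
repair_cron_line() {
    weekday=$(( $1 % 7 ))            # cron weekday: 0 = Sunday ... 6 = Saturday
    hour=$(( ($1 / 7) * 6 % 24 ))    # nodes sharing a weekday start 6 hours apart
    echo "0 $hour * * $weekday /opt/apigee/apigee-service/bin/apigee-service apigee-cassandra apigee_repair -pr"
}

# Example: print the entries for a three-node ring.
for i in 0 1 2; do
    repair_cron_line "$i"
done
```

With this scheme the slots are unique for up to 28 nodes. Each printed line belongs in the crontab of the corresponding node only, and checking the log for the success entry before the next node starts is still advisable.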

Log file maintenance

Cassandra logs are stored in the /opt/apigee/var/log/cassandra directory on each node. By default, Cassandra keeps a maximum of 50 log files, each with a maximum size of 20 MB. Once this limit is reached, the oldest logs are deleted as newer logs are created.
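
With the default limits, the worst-case log footprint is the file count multiplied by the file size. The sketch below computes that figure and, optionally, compares it with what is on disk. The helper names max_log_mb and current_log_mb are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: compare the worst-case log footprint with what is on disk now.
# max_log_mb and current_log_mb are hypothetical helper names.

# Usage: max_log_mb FILE_COUNT FILE_SIZE_MB
max_log_mb() {
    echo $(( $1 * $2 ))
}

# Usage: current_log_mb DIRECTORY   (prints the size in MB, rounded down)
current_log_mb() {
    kb=$(du -sk "$1" | awk '{ print $1 }')
    echo $(( kb / 1024 ))
}

max_log_mb 50 20    # default limits from the text; prints 1000
# current_log_mb /opt/apigee/var/log/cassandra
```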

If you find that Cassandra log files are taking up excessive space, you can change the amount of space allocated for log files by editing the logging settings.

  1. Edit /opt/apigee/customer/application/cassandra.properties to set the following properties. If that file does not exist, create it:
    # max file size
    conf_logback_maxbackupindex=50 # max open files
  2. Restart Cassandra by using the following command:
    /opt/apigee/apigee-service/bin/apigee-service apigee-cassandra restart

Disk space maintenance

You should monitor Cassandra disk utilization regularly to ensure at least 50 percent of each disk is free. If disk utilization climbs above 50 percent, we recommend that you add more disk space to reduce the percentage that is in use.
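
A periodic check against the 50 percent threshold might look like the following. This is a sketch, assuming a POSIX df that supports the -P option. The helper names used_pct and check_disk are hypothetical, and the path in the example is a placeholder for your Cassandra data directory.

```shell
#!/bin/sh
# Sketch: warn when a filesystem is more than 50 percent full.
# used_pct and check_disk are hypothetical helpers for illustration.

# Reads `df -P` output on stdin and prints the use percentage as a bare number.
# Assumes the filesystem name contains no spaces.
used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage: check_disk PATH [THRESHOLD]   (threshold defaults to 50)
check_disk() {
    pct=$(df -P "$1" | used_pct)
    if [ "$pct" -gt "${2:-50}" ]; then
        echo "WARNING: $1 is ${pct}% full"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}

# Example (replace the placeholder with your Cassandra data directory):
# check_disk /path/to/cassandra/data 50
```

The nonzero exit status on a warning makes the function easy to call from cron or a monitoring agent.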

Cassandra automatically performs the following operations to reduce its own disk utilization:

  • Authentication token deletion when tokens expire. However, it may take a couple of weeks to free up the disk space the tokens were using, depending on your configuration. If automatic deletion is not adequate to maintain sufficient disk space, contact support to learn about manually deleting tokens to recover space.
  • Data compaction. We recommend changing the compaction strategy on keyspaces to LeveledCompactionStrategy, which offers better disk utilization than the default SizeTieredCompactionStrategy. See Leveled Compaction Strategy.

Note: Data compaction can consume a considerable amount of CPU and memory, but resource utilization should return to normal once compactions are complete. To check whether compaction is running, run the nodetool compactionstats command on each node. Its output shows whether compactions are pending and the estimated time to completion.
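
To use the compaction check in a script, you can extract the pending count from the command output. The following sketch assumes that the output contains a line of the form "pending tasks: N" and that nodetool is on the PATH of the current user. The helper name pending_compactions is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the pending compaction count from compactionstats output.
# pending_compactions is a hypothetical helper; it assumes the output contains
# a line of the form "pending tasks: N".

# Reads compactionstats output on stdin and prints N from the first match.
pending_compactions() {
    awk -F': *' '/^pending tasks:/ { print $2; exit }'
}

# Example (assumes nodetool is on the PATH of the current user):
# nodetool compactionstats | pending_compactions
```

If the output format on your Cassandra version differs, the function prints nothing, which a calling script should treat as "unknown" rather than as zero.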