If you encounter an error during an update to Edge 4.53.00, you can roll back the component that caused the error and then try the update again.
You can roll back Edge 4.53.00 to the following minor release version:
- Version 4.52.02
Rolling back a version involves rolling back every component that you upgraded. Additionally, special considerations apply when rolling back Cassandra to the version shipped with 4.52.02.
There are two scenarios where you might want to perform a rollback:
- Roll back to a previous major or minor release. For example, from 4.53.00 to 4.52.02.
- Roll back to a previous patch release in the same release. For example, from 4.53.00.01 to 4.53.00.00.
For more information, see Apigee Edge release process.
Order of rollback
Roll back components in the reverse of the order in which they were upgraded, with the exception that Management Servers should be rolled back after Cassandra.
A typical order of rollback for Private Cloud 4.53.00 is as follows:
- Roll back Postgres, Qpid, and other analytics-related components
- Roll back Routers and Message Processors
- Roll back Cassandra and ZooKeeper
- Roll back the Management Server
For example, suppose you upgraded the entire Cassandra cluster, all your Management Servers, and a few RMPs (Routers and Message Processors) from version 4.52.02 to version 4.53.00 and want to roll back. In this case, you would:
- Roll back all RMPs one by one
- Roll back the entire Cassandra cluster using backups
- Roll back the Edge Management Server nodes one by one
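The ordering rule above can be sketched as a small shell helper that sorts a list of upgraded components into rollback order. This is only an illustration: the function names `rollback_rank` and `rollback_order` are hypothetical, and the `apigee-zookeeper` service name is assumed rather than taken from this page.

```shell
# Hypothetical helper: rank a component by its position in the rollback order.
# 1 = analytics, 2 = runtime, 3 = datastore, 4 = management, 9 = unknown.
rollback_rank() {
  case "$1" in
    edge-postgres-server|edge-qpid-server) echo 1 ;;
    edge-router|edge-message-processor)    echo 2 ;;
    apigee-cassandra|apigee-zookeeper)     echo 3 ;;  # apigee-zookeeper is an assumed name
    edge-management-server)                echo 4 ;;
    *)                                     echo 9 ;;
  esac
}

# Read component names (one per line) on stdin and print them in rollback order.
rollback_order() {
  while IFS= read -r comp; do
    [ -n "$comp" ] || continue
    printf '%s %s\n' "$(rollback_rank "$comp")" "$comp"
  done | sort -s -n -k1,1 | cut -d' ' -f2-
}
```

For the example above, `printf '%s\n' edge-management-server apigee-cassandra edge-router | rollback_order` lists the Router first, then Cassandra, then the Management Server.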
Who can perform a rollback
The user performing a rollback should be the same as the user who originally updated Edge, or a user running as root.
By default, Edge components run as the user "apigee". In some cases, you might be running Edge components as different users. For example, if the Router has to access privileged ports, such as those below 1000, then you have to run the Router as root or as a user with access to those ports. Or, you might run one component as one user, and another component as another user.
Components with common code
The following Edge components share common code. Therefore, to roll back any one of these components on a node, you must roll back all of these components that are on that node.
- edge-management-server (Management Server)
- edge-message-processor (Message Processor)
- edge-router (Router)
- edge-postgres-server (Postgres Server)
- edge-qpid-server (Qpid Server)
For example, if you have the Management Server, Router, and Message Processor installed on the node, to roll back any one of them you must roll back all three.
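As a rough sketch of this rule, the helper below takes the component you want to roll back plus the components installed on the node, and prints everything that has to be rolled back with it. The function names are hypothetical, and the list of installed components is supplied by the caller rather than detected.

```shell
# Components that share common code, as listed above.
GATEWAY_GROUP="edge-management-server edge-message-processor edge-router edge-postgres-server edge-qpid-server"

# Succeeds if the given component belongs to the common-code group.
is_gateway_component() {
  case " $GATEWAY_GROUP " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage: components_to_roll_back TARGET INSTALLED...
# Prints every installed component that must be rolled back along with TARGET.
components_to_roll_back() {
  target=$1
  shift
  if is_gateway_component "$target"; then
    for comp in "$@"; do
      if is_gateway_component "$comp"; then
        echo "$comp"
      fi
    done
  else
    echo "$target"
  fi
}
```

For a node that hosts the Management Server, Router, Message Processor, and Cassandra, `components_to_roll_back edge-router edge-management-server edge-router edge-message-processor apigee-cassandra` prints the three common-code components and leaves Cassandra out.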
Rollback of Cassandra
When a major upgrade of Cassandra is performed on a specific node, Cassandra modifies the schema of the data stored on that node. As a result, a direct in-place rollback is not feasible.
Rollback scenarios
Cassandra 4.0.X, available with Edge for Private Cloud 4.53.00, is compatible with other components of Private Cloud 4.52.02.
Please refer to the table below for a summary of the various rollback strategies you can use:
Scenario | Rollback strategy |
---|---|
Single DC, some Cassandra nodes upgraded | Use backups |
Single DC, all Cassandra nodes upgraded | Do not roll back Cassandra. Other components can be rolled back. |
Single DC, all nodes (Cassandra and others) upgraded | Do not roll back Cassandra. Other components can be rolled back. |
Multiple DC, some nodes in one DC upgraded | Rebuild from existing DC |
Multiple DC, all Cassandra nodes in some DCs upgraded | Rebuild from existing DC |
Multiple DC, Cassandra nodes of the last DC being upgraded | Try to finish the upgrade. If that is not feasible, roll back one DC using backups, then rebuild the remaining DCs from the rolled-back DC. |
Multiple DC, all Cassandra nodes upgraded | Do not roll back Cassandra. Other components can be rolled back. |
Multiple DC, all nodes (Cassandra and others) upgraded | Do not roll back Cassandra. Other components can be rolled back. |
General considerations
When considering a rollback, keep the following in mind:
- Rollback of runtime or management components: If you want to roll back components like edge-management-server, edge-message-processor, or any non-Cassandra component to Private Cloud version 4.52.02, it is recommended that you do NOT roll back Cassandra. Cassandra shipped with Private Cloud 4.53.00 is compatible with all non-Cassandra components of Edge for Private Cloud 4.52.02. You can roll back non-Cassandra components using the methodology listed here while Cassandra remains on version 4.0.13.
- Rollback after the entire Cassandra cluster is upgraded to 4.0.X: If your entire Cassandra cluster is upgraded to version 4.0.X as part of the upgrade to Private Cloud version 4.53.00, it is recommended that you continue with this cluster setup and NOT roll back Cassandra. Components like edge-management-server, edge-message-processor, edge-router, etc., of Private Cloud version 4.52.02 are compatible with Cassandra version 4.0.X.
- Rollback of Cassandra during the Cassandra upgrade: If you encounter issues during the Cassandra upgrade, you may want to consider a rollback. Follow the rollback strategy in this article that matches the state you are in during the upgrade process.
- Rollback using backups: Backups taken from Cassandra 4.0.X are not compatible with Cassandra 3.11.X. To roll back Cassandra by restoring a backup, you must take backups of Cassandra 3.11.X before attempting the upgrade.
Rollback Cassandra using rebuild
Prerequisites
- You are operating an Edge for Private Cloud 4.52.02 cluster across multiple data centers.
- You are in the process of upgrading Cassandra from 3.11.X to 4.0.X and have encountered issues during the upgrade.
- You have at least one fully functional data center in the cluster still running the older version of Cassandra (Cassandra 3.11.X).
This procedure relies on streaming data from an existing data center. It could take a significant amount of time, depending on how much data is stored in Cassandra. You should be prepared to divert your runtime traffic away from this data center while the rollback is ongoing.
High-level steps
- Select one data center (either partially or fully upgraded) that you’d like to roll back. Divert runtime traffic to a different functioning data center.
- Identify the seed nodes in the data center and start with one of them.
- Stop, uninstall, and clean up the Cassandra node.
- Install the older version of Cassandra on the node and configure it as needed.
- Remove the extra configurations that were added earlier.
- Repeat the above steps for all seed nodes in the data center, one by one.
- Repeat the above steps for all remaining Cassandra nodes in the data center, one by one.
- Rebuild the nodes from the existing functional data center, one by one.
- Restart all edge-* components in the data center that are connected to Cassandra.
- Test and divert traffic back to this data center.
- Repeat the steps for each data center, one by one.
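When choosing which data center to roll back, it can help to see which nodes already run Cassandra 4.0.X. The sketch below reads `nodetool describecluster` output on stdin and prints the nodes listed under a given version prefix. The function name `nodes_on_version` is hypothetical, and the parsing assumes the "Database versions" section is present in the output, so it is meant to be fed output from a node that has already been upgraded.

```shell
# Usage: nodetool describecluster | nodes_on_version PREFIX
# Prints one node (address:port as reported) per line for versions starting with PREFIX.
nodes_on_version() {
  awk -v prefix="$1" '
    /Database versions:/ { in_versions = 1; next }
    /Keyspaces:/         { in_versions = 0 }
    in_versions && NF > 0 {
      version = $1
      sub(/:$/, "", version)                 # "4.0.13:" -> "4.0.13"
      if (index(version, prefix) == 1) {
        start = index($0, "[")
        end = index($0, "]")
        list = substr($0, start + 1, end - start - 1)
        n = split(list, nodes, /, */)
        for (i = 1; i <= n; i++) print nodes[i]
      }
    }'
}
```

For example, `/opt/apigee/apigee-cassandra/bin/nodetool describecluster | nodes_on_version 4.` would list the nodes that are candidates for rollback.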
Detailed steps
- Pick one data center where all or some Cassandra nodes are upgraded. Divert all runtime proxy traffic and management traffic from this data center while the Cassandra nodes in this data center are being rolled back.
Ensure all Cassandra nodes are in the UN (Up/Normal) state when the nodetool ring command is executed on the nodes. If certain nodes are down, troubleshoot the issue and bring those nodes back up before continuing. See the example below:
/opt/apigee/apigee-cassandra/bin/nodetool status
Datacenter: dc-1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load        Tokens  Owns (effective)  Host ID                               Rack
UN  DC1-1IP1  456.41 KiB  1       100.0%            78fc4ddd-2ed9-4a8c-98a2-63a38c2f1920  ra-1
UN  DC1-1IP2  870.93 KiB  1       100.0%            160db01a-64ab-43a7-b9ea-3b7f8f66d52b  ra-1
UN  DC1-1IP3  824.08 KiB  1       100.0%            21d61543-d59e-403a-bf5d-bfe7f664baa6  ra-1
Datacenter: dc-2
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load        Tokens  Owns (effective)  Host ID                               Rack
UN  DC2-1IP1  802.08 KiB  1       100.0%            583e0576-336d-4ce7-9729-2ae74e0abde2  ra-1
UN  DC2-1IP2  844.4 KiB   1       100.0%            fef794d5-f4c2-4a4e-bb05-9adaeb4aea4b  ra-1
UN  DC2-1IP3  878.12 KiB  1       100.0%            3894b3d9-1f5a-444d-83db-7b1e338bbfc9  ra-1
You can run nodetool describecluster on the nodes to understand the current state of the entire cluster. For example, the following shows an instance of a 2-data-center cluster where all DC-1 nodes are on Cassandra version 4, whereas all DC-2 nodes are on Cassandra version 3:
# On nodes where Cassandra is upgraded
/opt/apigee/apigee-cassandra/bin/nodetool describecluster
Cluster Information:
    Name: Apigee
    Snitch: org.apache.cassandra.locator.PropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.RandomPartitioner
    Schema versions:
        2eadcd74-0245-309a-9992-3625afa70038: [DC-1-IP1, DC-1-IP2, DC-1-IP3]
        129dc15e-198e-3c11-b64c-701044a3a1ad: [DC-2-IP1, DC-2-IP2, DC-2-IP3]
Stats for all nodes:
    Live: 6
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    dc-1 #Nodes: 3 #Down: 0
    dc-2 #Nodes: 3 #Down: 0
Database versions:
    4.0.13: [DC-1-IP1:7000, DC-1-IP2:7000, DC-1-IP3:7000]
    3.11.16: [DC-2-IP1:7000, DC-2-IP2:7000, DC-2-IP3:7000]
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    auth -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    cache -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    devconnect -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    dek -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    user_settings -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    apprepo -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    kms -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    identityzone -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    audit -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    analytics -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    keyvaluemap -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    counter -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    apimodel_v2 -> Replication class: NetworkTopologyStrategy {dc-2=3, dc-1=3}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
# On nodes where Cassandra is not upgraded
/opt/apigee/apigee-cassandra/bin/nodetool describecluster
Cluster Information:
    Name: Apigee
    Snitch: org.apache.cassandra.locator.PropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.RandomPartitioner
    Schema versions:
        2eadcd74-0245-309a-9992-3625afa70038: [DC-1-IP1, DC-1-IP2, DC-1-IP3]
        129dc15e-198e-3c11-b64c-701044a3a1ad: [DC-2-IP1, DC-2-IP2, DC-2-IP3]
- Identify the seed nodes in the data center: Refer to the section How to identify seed nodes in the Appendix. Execute the steps below on one of the seed nodes:
- Stop, uninstall, and clean up data from the Cassandra node.
Pick the first seed node on Cassandra version 4 in this data center. Stop it.
# Stop Cassandra service on the node
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra stop
# Uninstall Cassandra software
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra uninstall
# Wipe out Cassandra data
rm -rf /opt/apigee/data/apigee-cassandra
- Install the older Cassandra software on the node and set some configurations. Execute the bootstrap file of Edge for Private Cloud 4.52.02.
# Download bootstrap of 4.52.02
curl https://software.apigee.com/bootstrap_4.52.02.sh -o /tmp/bootstrap_4.52.02.sh -u uName:pWord
# Execute bootstrap of 4.52.02
sudo bash /tmp/bootstrap_4.52.02.sh apigeeuser=uName apigeepassword=pWord
Set Cassandra configs
- Create or edit the file /opt/apigee/customer/application/cassandra.properties.
- Add the following contents to the file. ipOfNode is the IP address of the node that Cassandra uses to communicate with other Cassandra nodes:
conf_jvm_options_custom_settings=-Dcassandra.replace_address=ipOfNode -Dcassandra.allow_unsafe_replace=true
- Ensure the file is owned and readable by the apigee user:
chown apigee:apigee /opt/apigee/customer/application/cassandra.properties
- Install and set up Cassandra:
- Install Cassandra version 3.11.X:
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra install
- Set up Cassandra by passing the standard configuration file:
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra setup -f configFile
- Ensure that Cassandra 3.11.X is installed and the service is running:
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra version
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra status
- Verify that the node has started. Check the following command on this node and other nodes in the cluster. The node should report that it is in the "UN" (Up/Normal) state:
/opt/apigee/apigee-cassandra/bin/nodetool status
- Remove the extra configurations added earlier from the file /opt/apigee/customer/application/cassandra.properties.
- Repeat steps 3 to 6 on all Cassandra seed nodes in the data center, one by one.
- Repeat steps 3 to 6 on all remaining Cassandra nodes in the data center, one by one.
- Rebuild all the nodes in the data center from a data center running the older Cassandra version. Perform this step one node at a time:
/opt/apigee/apigee-cassandra/bin/nodetool rebuild -dc <name of working DC>
This procedure may take some time. You can adjust the streamingthroughput if necessary. Check the status using:
/opt/apigee/apigee-cassandra/bin/nodetool netstats
- Restart all edge-* components in the data center, one by one:
/opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
/opt/apigee/apigee-service/bin/apigee-service edge-router restart
/opt/apigee/apigee-service/bin/apigee-service edge-management-server restart
/opt/apigee/apigee-service/bin/apigee-service edge-qpid-server restart
/opt/apigee/apigee-service/bin/apigee-service edge-postgres-server restart
- Validate and divert traffic back to this data center. Run some validations for runtime traffic and management APIs in this data center, and start rerouting proxy and management API traffic back to it.
- Repeat the above steps for each data center you want to roll back.
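The UN (Up/Normal) checks in the steps above can be scripted instead of read by eye. The following is a minimal sketch that reads `nodetool status` output on stdin and prints any node whose status is not UN. The function name `non_un_nodes` is hypothetical, and the parsing assumes the column layout shown in the example output earlier in this section.

```shell
# Usage: nodetool status | non_un_nodes
# Prints "<status> <address>" for every node that is not Up/Normal.
# Empty output means every listed node reported UN.
non_un_nodes() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $1, $2 }'
}
```

For example, run `/opt/apigee/apigee-cassandra/bin/nodetool status | non_un_nodes` before moving on to the next node, and continue only when it prints nothing. Note that empty input also prints nothing, so confirm that the nodetool command itself succeeded.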
Rollback Cassandra using Backup
Prerequisites
- You are in the process of upgrading Cassandra from 3.11.X to 4.0.X and have encountered issues during the upgrade.
- You have backups for the node you are rolling back. The backup was taken before the upgrade from 3.11.X to 4.0.X was attempted.
Steps
Select one node you want to roll back. If you are rolling back all nodes in a data center using backups, start with the seed nodes first. Refer to the section "How to Identify Seed Nodes" in the Appendix.
Stop, uninstall, and clean up the Cassandra node:
# Stop Cassandra service on the node
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra stop
# Uninstall Cassandra software
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra uninstall
# Wipe Cassandra data
rm -rf /opt/apigee/data/apigee-cassandra
Install the older Cassandra software on the node and configure it:
- Execute the bootstrap file for Edge for Private Cloud 4.52.02:
# Download bootstrap for 4.52.02
curl https://software.apigee.com/bootstrap_4.52.02.sh -o /tmp/bootstrap_4.52.02.sh -u 'uName:pWord'
# Execute bootstrap for 4.52.02
sudo bash /tmp/bootstrap_4.52.02.sh apigeeuser=uName apigeepassword=pWord
- Create or edit the file /opt/apigee/customer/application/cassandra.properties and add the following line, where ipOfNode is the IP address of the node that Cassandra uses to communicate with other Cassandra nodes:
conf_jvm_options_custom_settings=-Dcassandra.replace_address=ipOfNode -Dcassandra.allow_unsafe_replace=true
- Ensure the file is owned by the apigee user and is readable:
chown apigee:apigee /opt/apigee/customer/application/cassandra.properties
- Install and set up Cassandra:
# Install Cassandra version 3.11.X
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra install
# Set up Cassandra with the standard configuration file
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra setup -f configFile
# Verify Cassandra version and check service status
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra version
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra status
Verify that the node has started. Check the following command on this node and other nodes in the cluster. Nodes should report that this node is in the "UN" state:
/opt/apigee/apigee-cassandra/bin/nodetool status
Stop the Cassandra service and restore the backup. Refer to the backup and restore documentation for more details:
# Stop Cassandra service on the node
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra stop
# Wipe the data directory in preparation for restore
rm -rf /opt/apigee/data/apigee-cassandra/data
# Restore the backup taken before the upgrade attempt
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra restore backupFile
Once the backup is restored, remove the configuration added earlier from the file /opt/apigee/customer/application/cassandra.properties.
Start the Cassandra service on the node:
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra start
Repeat the steps on each Cassandra node you wish to roll back using backups, one at a time.
Once all Cassandra nodes are restored, restart all edge-* components one by one:
/opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
/opt/apigee/apigee-service/bin/apigee-service edge-router restart
/opt/apigee/apigee-service/bin/apigee-service edge-management-server restart
/opt/apigee/apigee-service/bin/apigee-service edge-qpid-server restart
/opt/apigee/apigee-service/bin/apigee-service edge-postgres-server restart
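Restarting the components one by one can be wrapped in a loop that stops at the first failure, so that a failed restart is noticed before the next component is touched. This is a sketch, not a supported tool: the function name `restart_components` and the overridable `APIGEE_SERVICE` variable are assumptions introduced here, and you pass only the components that are actually installed on the node.

```shell
# Path to apigee-service; overridable so the loop can be dry-run against a stub.
APIGEE_SERVICE="${APIGEE_SERVICE:-/opt/apigee/apigee-service/bin/apigee-service}"

# Usage: restart_components COMPONENT...
# Restarts each component in the order given and stops at the first failure.
restart_components() {
  for comp in "$@"; do
    echo "Restarting $comp"
    if ! "$APIGEE_SERVICE" "$comp" restart; then
      echo "Restart of $comp failed; stopping" >&2
      return 1
    fi
  done
}
```

On a node that hosts a Message Processor and a Router, for example, you would call `restart_components edge-message-processor edge-router`.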
Backup optimizations (advanced option)
You can potentially minimize (or eliminate) data loss while restoring backups if you have replicas available that contain the latest data. If replicas are available, after restoring the backup, run a repair on the node that was restored.
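A repair after the restore could look like the sketch below, which runs `nodetool repair` for one keyspace at a time on the restored node and stops if a repair fails. The function name `repair_keyspaces` and the overridable `NODETOOL` variable are assumptions made for illustration; which keyspaces to repair, and with which repair options, is left to you.

```shell
# Path to nodetool; overridable so the loop can be dry-run against a stub.
NODETOOL="${NODETOOL:-/opt/apigee/apigee-cassandra/bin/nodetool}"

# Usage: repair_keyspaces KEYSPACE...
# Runs a repair for each keyspace in turn on the local (restored) node.
repair_keyspaces() {
  for ks in "$@"; do
    echo "Repairing keyspace $ks"
    if ! "$NODETOOL" repair "$ks"; then
      echo "Repair of $ks failed" >&2
      return 1
    fi
  done
}
```

For example, `repair_keyspaces kms keyvaluemap` repairs two of the keyspaces that appear in the describecluster output shown earlier.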
Appendix
How to identify seed nodes
On any Cassandra node in a data center, run the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-cassandra configure -search conf_cassandra_seeds
The command outputs multiple lines. Look for the last "Found key" line of the output. The IP addresses listed in that line are the seed nodes. In the example below, DC-1-IP1, DC-1-IP2, DC-2-IP1, and DC-2-IP2 are the seed node IPs:
Found key conf_cassandra_seeds, with value, "127.0.0.1", in /opt/apigee/apigee-cassandra/token/default.properties
Found key conf_cassandra_seeds, with value, 127.0.0.1, in /opt/apigee/apigee-cassandra/token/application/cassandra.properties
Found key conf_cassandra_seeds, with value, "DC-1-IP1, DC-1-IP2, DC-2-IP1, DC-2-IP2", in /opt/apigee/token/application/cassandra.properties
apigee-configutil: apigee-cassandra: # OK
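If you want the seed list in a form a script can loop over, the output above can be parsed. The sketch below assumes the output format matches the example (one "Found key conf_cassandra_seeds" line per properties file, with the effective value on the last such line); the function name `extract_seeds` is hypothetical.

```shell
# Usage: apigee-service apigee-cassandra configure -search conf_cassandra_seeds | extract_seeds
# Prints one seed address per line, taken from the last "Found key" line.
extract_seeds() {
  grep 'Found key conf_cassandra_seeds' \
    | tail -n 1 \
    | sed -e 's/^.*with value, //' -e 's/, in \/.*$//' -e 's/"//g' \
    | tr ',' '\n' \
    | sed -e 's/^ *//' -e 's/ *$//' \
    | grep -v '^$'
}
```

The list covers seed nodes in every data center, so filter it down to the data center you are working on before using it.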
Roll back to a previous major or minor release
To roll back to a previous major or minor release, do the following on each node that hosts the component:
- Download the bootstrap.sh file for the version to which you want to roll back:
- To roll back to 4.52.02, download bootstrap_4.52.02.sh:
curl https://software.apigee.com/bootstrap_4.52.02.sh -o /tmp/bootstrap_4.52.02.sh
- Stop the component to roll back:
- To roll back any of the components with common code on the node, you must stop them all, as the following example shows:
/opt/apigee/apigee-service/bin/apigee-service edge-management-server stop
/opt/apigee/apigee-service/bin/apigee-service edge-router stop
/opt/apigee/apigee-service/bin/apigee-service edge-message-processor stop
/opt/apigee/apigee-service/bin/apigee-service edge-qpid-server stop
/opt/apigee/apigee-service/bin/apigee-service edge-postgres-server stop
- To roll back any other component on the node, stop just that component:
/opt/apigee/apigee-service/bin/apigee-service component stop
- If you are rolling back Monetization, uninstall it from all Management Server and Message
Processor nodes:
/opt/apigee/apigee-service/bin/apigee-service edge-mint-gateway uninstall
- Uninstall the component to roll back on the node:
- To roll back any of the components with common code on the node, you must uninstall them all by uninstalling the edge-gateway component group, as the following example shows:
/opt/apigee/apigee-service/bin/apigee-service edge-gateway uninstall
- To roll back any other component on the node, uninstall just that component, as the following example shows:
/opt/apigee/apigee-service/bin/apigee-service component uninstall
Where component is the component name.
- To roll back the Edge Router, you must delete the contents of the /opt/nginx/conf.d directory in addition to uninstalling the edge-gateway component group:
cd /opt/nginx/conf.d
rm -rf *
- Uninstall the 4.53.00 version of apigee-setup:
/opt/apigee/apigee-service/bin/apigee-service apigee-setup uninstall
- Install the 4.52.02 version of the apigee-service utility and its dependencies. The following example installs the 4.52.02 version of apigee-service:
sudo bash /tmp/bootstrap_4.52.02.sh apigeeuser=uName apigeepassword=pWord
Where uName and pWord are the username and password you received from Apigee. If you omit pWord, you will be prompted to enter it.
If you get an error, be sure you downloaded the bootstrap.sh file in step 1.
- Install apigee-setup:
/opt/apigee/apigee-service/bin/apigee-service apigee-setup install
- Install the older version of the component:
/opt/apigee/apigee-setup/bin/setup.sh -p component -f configFile
Where component is the component to install and configFile is your configuration file for the older version.
- If you are rolling back Qpid, flush iptables:
sudo iptables -F
- Repeat this process for each node that hosts the component you are rolling back.
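For a component that is not part of the common-code group, the sequence above can be printed as a plan to review before running anything. The sketch below only prints commands and executes nothing. The function name `print_rollback_plan` is hypothetical, and `uName`, `pWord`, and the configuration file remain placeholders exactly as in the steps above.

```shell
APIGEE_SERVICE=/opt/apigee/apigee-service/bin/apigee-service

# Usage: print_rollback_plan SERVICE PROFILE CONFIG_FILE
#   SERVICE     component name passed to apigee-service (for example edge-ui)
#   PROFILE     value passed to setup.sh -p (for example ui)
#   CONFIG_FILE configuration file for the older version
print_rollback_plan() {
  service=$1
  profile=$2
  config=$3
  cat <<EOF
curl https://software.apigee.com/bootstrap_4.52.02.sh -o /tmp/bootstrap_4.52.02.sh
$APIGEE_SERVICE $service stop
$APIGEE_SERVICE $service uninstall
$APIGEE_SERVICE apigee-setup uninstall
sudo bash /tmp/bootstrap_4.52.02.sh apigeeuser=uName apigeepassword=pWord
$APIGEE_SERVICE apigee-setup install
/opt/apigee/apigee-setup/bin/setup.sh -p $profile -f $config
EOF
}
```

For example, `print_rollback_plan edge-ui ui configFile` prints the seven commands for the UI component. The plan deliberately omits the Monetization, Router, and Qpid special cases described in the steps.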
Roll back to a previous patch release
To roll back a component to a specific patch release, do the following on each node that hosts the component:
- Download the specific component version:
/opt/apigee/apigee-service/bin/apigee-service component_version install
Where component_version is the component and patch release to install. For example:
/opt/apigee/apigee-service/bin/apigee-service edge-ui-4.53.00-0.0.20254 install
If you are using the Apigee online repo, you can determine the available component versions by using the following command:
yum --showduplicates list comp
For example:
yum --showduplicates list edge-ui
- Use apigee-setup to install the component:
/opt/apigee/apigee-setup/bin/setup.sh -p comp -f configFile
For example:
/opt/apigee/apigee-setup/bin/setup.sh -p ui -f configFile
Note that you specify only the component name when you install it, not the version.
- Repeat this process for each node that hosts the component you are rolling back.
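The component_version string used in the first step appears to be the component name joined to the version-release shown by yum, as in edge-ui-4.53.00-0.0.20254. Under that assumption, and assuming yum prints the usual three columns (package.arch, version-release, repository), the sketch below turns `yum --showduplicates list` output into candidate component_version strings. The function name `component_versions` is hypothetical.

```shell
# Usage: yum --showduplicates list COMP | component_versions COMP
# Prints COMP-<version-release> for each listed package line.
component_versions() {
  comp=$1
  awk -v comp="$comp" 'index($1, comp ".") == 1 && NF >= 2 { print comp "-" $2 }'
}
```

For example, `yum --showduplicates list edge-ui | component_versions edge-ui`. If yum wraps a long package line onto two lines, that entry is skipped, so check the raw output when a version you expect is missing.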
Roll back mTLS
To roll back the mTLS update, do the following steps on all hosts:
- Stop Apigee:
apigee-all stop
- Uninstall mTLS:
apigee-service apigee-mtls uninstall
- Reinstall mTLS:
apigee-service apigee-mtls install
apigee-service apigee-mtls setup -f /opt/silent.conf