Apigee Edge for Private Cloud includes apigee-monit, a tool based on the open source monit utility. apigee-monit periodically polls Edge services; if a service is unavailable, then apigee-monit attempts to restart it.
To use apigee-monit, you must install it manually. It is not part of the standard installation.
By default, apigee-monit checks the status of Edge services every 60 seconds.
Quick start
This section shows you how to quickly get up and running with apigee-monit.
If you are using Amazon Linux or Oracle-Linux-7.X, first install monit via Fedora. Otherwise, skip this step.
sudo yum install -y https://kojipkgs.fedoraproject.org/packages/monit/5.25.1/1.el6/x86_64/monit-5.25.1-1.el6.x86_64.rpm
To install and start using apigee-monit, do the following steps:
- Install apigee-monit:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit install
- Stop monitoring components:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c component_name
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c all
- Start monitoring components:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor -c component_name
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor -c all
- Get summary status information:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit report
/opt/apigee/apigee-service/bin/apigee-service apigee-monit summary
- Look at the apigee-monit log files:
cat /opt/apigee/var/log/apigee-monit/apigee-monit.log
Each of these topics, along with others, is described in detail in the sections that follow.
About apigee-monit
apigee-monit helps ensure that all components on a node stay up and running. It does this by providing a variety of services, including:
- Restarting failed services
- Displaying summary information
- Logging monitoring status
- Sending notifications
- Monitoring non-Edge services
Apigee recommends that you monitor apigee-monit to ensure that it is running. For more information, see Monitor apigee-monit.
apigee-monit architecture
During your Apigee Edge for Private Cloud installation and configuration, you optionally install a separate instance of apigee-monit on each node in your cluster. These separate apigee-monit instances operate independently of one another: they do not communicate the status of their components to the other nodes, nor do they communicate failures of the monitoring utility itself to any central service.
The following image shows the apigee-monit architecture in a 5-node cluster:
Component configurations
apigee-monit uses component configurations to determine which components to monitor, which aspects of the component to check, and what action to take in the event of a failure.
By default, apigee-monit monitors all Edge components on a node using their pre-defined component configurations. To view the default settings, you can look at the apigee-monit component configuration files. You cannot change the default component configurations.
apigee-monit checks different aspects of a component, depending on which component it is checking. The following table lists what apigee-monit checks for each component and shows where each component's configuration is located. Note that some components are defined in a single shared configuration file, while others have their own configuration files.
Component | Configuration location | What is monitored |
---|---|---|
Management Server | /opt/apigee/edge-management-server/monit/default.conf | apigee-monit checks: In addition, for these components |
Message Processor | /opt/apigee/edge-message-processor/monit/default.conf | |
Postgres Server | /opt/apigee/edge-postgres-server/monit/default.conf | |
Qpid Server | /opt/apigee/edge-qpid-server/monit/default.conf | |
Router | /opt/apigee/edge-router/monit/default.conf | |
Cassandra, Edge UI, OpenLDAP, Postgres, Qpid, Zookeeper | /opt/apigee/data/apigee-monit/monit.conf | apigee-monit checks: |
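To see which of these configuration files are actually present on a node, you can walk the locations listed in the table. The following is a minimal sketch, not part of apigee-monit: the function name list_monit_configs and the APIGEE_ROOT variable are illustrative assumptions. It only lists files and does not modify them.

```shell
# Hypothetical helper: print the monit component configuration files found
# under an Apigee root directory (default: /opt/apigee).
list_monit_configs() {
    root="${1:-${APIGEE_ROOT:-/opt/apigee}}"
    # Per-component files, for example edge-router/monit/default.conf
    for f in "$root"/*/monit/default.conf; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    # Shared file used for components such as Cassandra and OpenLDAP
    shared="$root/data/apigee-monit/monit.conf"
    [ -f "$shared" ] && printf '%s\n' "$shared"
    return 0
}
```

Calling list_monit_configs with no arguments inspects /opt/apigee; pass another directory as the first argument to inspect a different root.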
The following example shows the default component configuration for the edge-router component:
check host edge-router with address localhost
    restart program = "/opt/apigee/apigee-service/bin/apigee-service edge-router monitrestart"
    if failed host 10.1.1.0 port 8081 and protocol http
        and request "/v1/servers/self/uuid"
        with timeout 15 seconds for 2 times within 3 cycles then restart
    if failed port 15999 and protocol http
        and request "/v1/servers/self" and status < 600
        with timeout 15 seconds for 2 times within 3 cycles then restart
The following example shows the default configuration for the Classic UI (edge-ui) component:
check process edge-ui
    with pidfile /opt/apigee/var/run/edge-ui/edge-ui.pid
    start program = "/opt/apigee/apigee-service/bin/apigee-service edge-ui start" with timeout 55 seconds
    stop program = "/opt/apigee/apigee-service/bin/apigee-service edge-ui stop"
This applies to the Classic UI, not the new Edge UI, whose component name is edge-management-ui.
You cannot change the default component configurations for any Apigee Edge for Private Cloud component. You can, however, add your own component configurations for external services, such as your target endpoint or the httpd service. For more information, see Non-Apigee component configurations.
By default, apigee-monit monitors all components on the node on which it is running. You can enable or disable it for all components or for individual components. For more information, see Stop and start monitoring components and Stop, start, and disable apigee-monit.
Install apigee-monit
apigee-monit is not installed by default; you can install it manually after upgrading or installing version 4.19.01 or later of Apigee Edge for Private Cloud.
This section describes how to install apigee-monit.
For information on uninstalling apigee-monit, see Uninstall apigee-monit.
To install apigee-monit:
- Install apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit install
- Configure apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit configure
- Start apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit start
- Repeat this procedure on each node in your cluster.
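If you script the installation, the three commands can be run in sequence and stopped at the first failure. The following is a sketch under assumptions: the function name install_apigee_monit and the APIGEE_SERVICE override variable are illustrative, and the sketch assumes that apigee-service exits with a nonzero status when an action fails.

```shell
# Path to apigee-service as used in this document; override for testing.
APIGEE_SERVICE="${APIGEE_SERVICE:-/opt/apigee/apigee-service/bin/apigee-service}"

# Hypothetical wrapper: run install, configure, and start in order and
# stop at the first action that fails.
install_apigee_monit() {
    for action in install configure start; do
        "$APIGEE_SERVICE" apigee-monit "$action" || {
            echo "apigee-monit $action failed" >&2
            return 1
        }
    done
}
```

Run install_apigee_monit on each node. Setting APIGEE_SERVICE=echo beforehand prints the commands instead of executing them, which is useful as a dry run.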
Stop and start monitoring components
When a service stops for any reason, apigee-monit attempts to restart the service.
This can cause a problem if you want to purposefully stop a component. For example, you might want to stop a component when you need to back it up or upgrade it.
If apigee-monit restarts the service during the backup or upgrade, your maintenance procedure can be disrupted, possibly causing it to fail.
The following sections show the options for stopping the monitoring of components.
Stop a component and unmonitor it
To stop a component and unmonitor it, execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit stop-component -c component_name
Where component_name can be one of the following:
- apigee-cassandra (Cassandra)
- apigee-openldap (OpenLDAP)
- apigee-postgresql (PostgreSQL database)
- apigee-qpidd (Qpidd)
- apigee-sso (Edge SSO)
- apigee-zookeeper (ZooKeeper)
- edge-management-server (Management Server)
- edge-management-ui (new Edge UI)
- edge-message-processor (Message Processor)
- edge-postgres-server (Postgres Server)
- edge-qpid-server (Qpid Server)
- edge-router (Edge Router)
- edge-ui (Classic UI)
Note that "all" is not a valid option for stop-component. You can stop and unmonitor only one component at a time with stop-component.
To restart the component and resume monitoring, execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit start-component -c component_name
Note that "all" is not a valid option for start-component.
For instructions on how to stop and unmonitor all components, see Stop all components and unmonitor them.
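For maintenance such as a backup, it can help to wrap the stop-component and start-component pair around the maintenance command so that the component is always restarted afterward. The following is a sketch only: with_component_stopped and the APIGEE_SERVICE override variable are hypothetical names, and the maintenance command is whatever you pass in.

```shell
APIGEE_SERVICE="${APIGEE_SERVICE:-/opt/apigee/apigee-service/bin/apigee-service}"

# Hypothetical wrapper: stop and unmonitor one component, run a maintenance
# command, then restart the component and resume monitoring.
# Usage: with_component_stopped component_name command [args...]
with_component_stopped() {
    component="$1"
    shift
    if [ "$component" = "all" ]; then
        echo "stop-component does not accept 'all'" >&2
        return 2
    fi
    "$APIGEE_SERVICE" apigee-monit stop-component -c "$component" || return 1
    status=0
    "$@" || status=$?
    # Restart even if the maintenance command failed, so the component is
    # not left stopped and unmonitored.
    "$APIGEE_SERVICE" apigee-monit start-component -c "$component" || return 1
    return "$status"
}
```

The function returns the exit status of the maintenance command, so a calling script can still detect a failed backup or upgrade after the component is running again.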
Unmonitor a component (but don't stop it)
To unmonitor a component (but don't stop it), execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c component_name
Where component_name can be one of the following:
- apigee-cassandra (Cassandra)
- apigee-openldap (OpenLDAP)
- apigee-postgresql (PostgreSQL database)
- apigee-qpidd (Qpidd)
- apigee-sso (Edge SSO)
- apigee-zookeeper (ZooKeeper)
- edge-management-server (Management Server)
- edge-management-ui (new Edge UI)
- edge-message-processor (Message Processor)
- edge-postgres-server (Postgres Server)
- edge-qpid-server (Qpid Server)
- edge-router (Edge Router)
- edge-ui (Classic UI)
To resume monitoring the component, execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor -c component_name
Unmonitor all components (but don't stop them)
To unmonitor all components (but don't stop them), execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c all
To resume monitoring all components, execute the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor -c all
Stop all components and unmonitor them
To stop all components and unmonitor them, execute the following commands:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c all
/opt/apigee/apigee-service/bin/apigee-all stop
To restart all components and resume monitoring, execute the following commands:
/opt/apigee/apigee-service/bin/apigee-all start
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor -c all
To stop monitoring all components, you can also disable apigee-monit, as described in Stop, start, and disable apigee-monit.
Stop, start, and disable apigee-monit
As with any service, you can stop and start apigee-monit using the apigee-service command. In addition, apigee-monit supports the unmonitor command, which lets you temporarily stop monitoring components.
Stop apigee-monit
To stop apigee-monit, use the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit stop
Start apigee-monit
To start apigee-monit, use the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit start
Disable apigee-monit
You can suspend monitoring all components on the node by using the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit unmonitor -c all
Alternatively, you can permanently disable apigee-monit by uninstalling it from the node, as described in Uninstall apigee-monit.
Uninstall apigee-monit
To uninstall apigee-monit:
- If you set up a cron job to monitor apigee-monit, remove the cron job before uninstalling apigee-monit:
sudo rm /etc/cron.d/apigee-monit.cron
- Stop apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit stop
- Uninstall apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit uninstall
- Repeat this procedure on each node in your cluster.
Monitor a newly installed component
If you install a new component on a node that is running apigee-monit, you can begin monitoring it by executing apigee-monit's restart command. This generates a new monit.conf file that includes the new component in its component configurations.
The following example restarts apigee-monit:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit restart
Customize apigee-monit
You can customize various apigee-monit settings, including:
- Default apigee-monit control settings
- Global configuration settings
- Non-Apigee component configurations
Default apigee-monit control settings
You can customize the default apigee-monit control settings, such as the frequency of status checks and the locations of the apigee-monit files. You do this by editing a properties file using the code with config technique. Properties files persist even after you upgrade Apigee Edge for Private Cloud.
The following table describes the default apigee-monit control settings that you can customize:
Property | Description |
---|---|
conf_monit_httpd_port | The httpd daemon's port. apigee-monit uses httpd for its dashboard app and to enable reports/summaries. The default value is 2812. |
conf_monit_httpd_allow | Constraints on requests to the httpd daemon. apigee-monit uses httpd to run its dashboard app and enable reports/summaries. This value must point to the localhost (the host that httpd is running on). To require that requests include a username and password, use the following syntax: conf_monit_httpd_allow=allow username:"password"\nallow 127.0.0.1 When adding a username and password, insert a "\n" between each constraint. Do not insert actual newlines or carriage returns in the value. |
conf_monit_monit_datadir | The directory in which event details are stored. |
conf_monit_monit_delay_time | The amount of time that apigee-monit waits after it is first loaded into memory before it runs. This affects only the first process check that apigee-monit performs. |
conf_monit_monit_logdir | The location of the apigee-monit log file. |
conf_monit_monit_retry_time | The frequency at which apigee-monit attempts to check each process; the default is 60 seconds. |
conf_monit_monit_rundir | The location of the PID and state files, which apigee-monit uses for checking processes. |
To customize the default apigee-monit control settings:
- Edit the following file:
/opt/apigee/customer/application/monit.properties
If the file does not exist, create it and set the owner to the "apigee" user:
chown apigee:apigee /opt/apigee/customer/application/monit.properties
Note that if the file already exists, it may define additional configuration properties beyond those listed in the table above. Do not modify properties other than those listed above.
- Set or replace property values with your new values.
For example, to change the location of the log file to /tmp, add or edit the following property:
conf_monit_monit_logdir=/tmp/apigee-monit.log
- Save your changes to the monit.properties file.
- Re-configure apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit configure
- Reload apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit reload
If you cannot restart apigee-monit, check the log file for errors as described in Access apigee-monit log files.
- Repeat this procedure for each node in your cluster.
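When you apply the same property change on many nodes, editing the file by hand is error-prone. The following sketch shows one way to add or replace a single property idempotently. The helper name set_property is an assumption for illustration; it is not an Apigee tool, and you still need to run the configure and reload commands afterward.

```shell
# Hypothetical helper: set KEY=VALUE in a properties file, replacing an
# existing definition of KEY or appending a new one.
set_property() {
    file="$1"
    key="$2"
    value="$3"
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line except an existing definition of this key.
        grep -v "^${key}=" "$file" > "$tmp" || true
    fi
    # %s writes the value verbatim, so a literal "\n" sequence (as used by
    # conf_monit_httpd_allow) is not turned into a real newline.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write through the original path so that an existing file keeps its
    # owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_property /opt/apigee/customer/application/monit.properties conf_monit_monit_retry_time 120 would change the check frequency, assuming the value is expressed in seconds as the documented default of 60 seconds suggests. If the helper creates the file, set its owner to the "apigee" user as described in the steps.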
Global configuration settings
You can define global configuration settings for apigee-monit; for example, you can add email notifications for alerts. You do this by creating a configuration file in the /opt/apigee/data/apigee-monit directory and then restarting apigee-monit.
To define global configuration settings for apigee-monit:
- Create a new component configuration file in the following location:
/opt/apigee/data/apigee-monit/filename.conf
Where filename can be any valid file name, except "monit".
- Change the owner of the new configuration file to the "apigee" user, as the following example shows:
chown apigee:apigee /opt/apigee/data/apigee-monit/my-mail-config.conf
- Add your global configuration settings to the new file. The following example configures a mail server and sets the alert recipients:
SET MAILSERVER smtp.gmail.com PORT 465
    USERNAME "example-admin@gmail.com" PASSWORD "PASSWORD"
    USING SSL, WITH TIMEOUT 15 SECONDS
SET MAIL-FORMAT {
    from: edge-alerts@example.com
    subject: Monit Alert -- Service: $SERVICE $EVENT on $HOST
}
SET ALERT fred@example.com
SET ALERT nancy@example.com
For a complete list of global configuration options, see the monit documentation.
- Save your changes to the component configuration file.
- Reload apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit reload
If apigee-monit does not restart, check the log file for errors as described in Access apigee-monit log files.
- Repeat this procedure for each node in your cluster.
Non-Apigee component configurations
You can add your own configurations to apigee-monit so that it checks services that are not part of Apigee Edge for Private Cloud. For example, you can use apigee-monit to check that your APIs are running by sending requests to your target endpoint.
To add a non-Apigee component configuration:
- Create a new component configuration file in the following location:
/opt/apigee/data/apigee-monit/filename.conf
Where filename can be any valid file name, except "monit".
You can create as many component configuration files as necessary. For example, you can create a separate configuration file for each non-Apigee component that you want to monitor on the node.
- Change the owner of the new configuration file to the "apigee" user, as the following example shows:
chown apigee:apigee /opt/apigee/data/apigee-monit/my-config.conf
- Add your custom configurations to the new file. The following example checks the target endpoint on the local server:
CHECK HOST localhost_validate_test WITH ADDRESS localhost
    IF FAILED
        PORT 15999
        PROTOCOL http
        REQUEST "/validate__test"
        CONTENT = "Server Ready"
        FOR 2 times WITHIN 3 cycles
    THEN alert
For a complete list of possible configuration settings, see the monit documentation.
- Save your changes to the configuration file.
- Reload apigee-monit with the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit reload
If apigee-monit does not restart, check the log file for errors as described in Access apigee-monit log files.
- Repeat this procedure for each node in your cluster.
Note that this is for non-Edge components only. You cannot customize the component configurations for Edge components.
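If you monitor several external HTTP endpoints, you might generate one check per endpoint rather than writing each by hand. The following sketch prints a check that follows the same pattern as the target endpoint example in this section. The function name print_http_check and its arguments are illustrative assumptions, and the thresholds (2 times within 3 cycles, then alert) are simply copied from that example.

```shell
# Hypothetical generator: print a monit HTTP check for a service on the
# local host. Arguments: check name, port, request path, expected content.
print_http_check() {
    name="$1"
    port="$2"
    path="$3"
    expected="$4"
    cat <<EOF
CHECK HOST $name WITH ADDRESS localhost
    IF FAILED
        PORT $port
        PROTOCOL http
        REQUEST "$path"
        CONTENT = "$expected"
        FOR 2 times WITHIN 3 cycles
    THEN alert
EOF
}
```

For example, print_http_check localhost_validate_test 15999 /validate__test "Server Ready" reproduces the check shown in the steps; redirect the output to a file in /opt/apigee/data/apigee-monit and then follow the remaining steps.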
Access apigee-monit log files
apigee-monit logs all activity, including events, restarts, configuration changes, and alerts, in a log file.
The default location of the log file is:
/opt/apigee/var/log/apigee-monit/apigee-monit.log
You can change the default location by customizing the apigee-monit control settings.
Log file entries have the following form:
'edge-message-processor' trying to restart
[UTC Dec 14 16:20:42] info : 'edge-message-processor' trying to restart
'edge-message-processor' restart: '/opt/apigee/apigee-service/bin/apigee-service edge-message-processor monitrestart'
You cannot customize the format of the apigee-monit log file entries.
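Because restart attempts are logged with the component name in single quotes, you can summarize how often each component has been restarted. The following is a rough sketch that assumes entries of the form shown in this section; count_restart_attempts is a hypothetical name, and the default path is the default log location.

```shell
# Hypothetical helper: count "trying to restart" entries per component in
# an apigee-monit log file.
count_restart_attempts() {
    log="${1:-/opt/apigee/var/log/apigee-monit/apigee-monit.log}"
    grep "trying to restart" "$log" |
        sed "s/.*'\([^']*\)' trying to restart.*/\1/" |
        sort | uniq -c
}
```

Each output line contains a count followed by a component name. A component that appears with a high count is restarting repeatedly and is worth investigating.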
View aggregated status with apigee-monit
apigee-monit includes the following commands that give you aggregated status information about the components on a node:
Command | Usage |
---|---|
report | /opt/apigee/apigee-service/bin/apigee-service apigee-monit report |
summary | /opt/apigee/apigee-service/bin/apigee-service apigee-monit summary |
Each of these commands is explained in more detail in the sections that follow.
report
The report command gives you a rolled-up summary of how many components are up, down, currently being initialized, or currently unmonitored on a node. The following example invokes the report command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit report
The following example shows report output on an AIO (all-in-one) configuration:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit report
up: 11 (100.0%)
down: 0 (0.0%)
initialising: 0 (0.0%)
unmonitored: 1 (8.3%)
total: 12 services
In this example, 11 of the 12 services are reported by apigee-monit as being up. One service is not currently being monitored.
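In a health-check script, you may want to act on the report output rather than read it. The following sketch succeeds only when the report shows zero services down. It assumes that each counter appears on its own line in the form "down: N (...)", and no_services_down is a hypothetical name.

```shell
# Hypothetical check: read "report" output on stdin and succeed only if a
# "down:" line is present and its count is zero.
no_services_down() {
    awk '$1 == "down:" { found = 1; down = $2 + 0 }
         END { exit (found && down == 0) ? 0 : 1 }'
}
```

For example, you could pipe the report command into no_services_down and send a notification when it fails. Output without a "down:" line, such as an error message, is treated as a failure.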
You may get a Connection refused error when you first execute the report command. In this case, wait for the duration of the conf_monit_monit_delay_time property, and then try again.
summary
The summary command lists each component and provides its status. The following example invokes the summary command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit summary
The following example shows summary output on an AIO (all-in-one) configuration:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit summary
Monit 5.25.1 uptime: 4h 20m
Service Name              Status    Type
host_name                 OK        System
apigee-zookeeper          OK        Process
apigee-cassandra          OK        Process
apigee-openldap           OK        Process
apigee-qpidd              OK        Process
apigee-postgresql         OK        Process
edge-ui                   OK        Process
edge-qpid-server          OK        Remote Host
edge-postgres-server      OK        Remote Host
edge-management-server    OK        Remote Host
edge-router               OK        Remote Host
edge-message-processor    OK        Remote Host
If you get a Connection refused error when you first execute the summary command, wait for the duration of the conf_monit_monit_delay_time property, and then try again.
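The wait-and-retry advice for Connection refused errors can be automated with a small retry loop. The following is a sketch only: retry_command is a hypothetical helper, and it assumes that the wrapped command exits with a nonzero status when the connection is refused.

```shell
# Hypothetical helper: run a command up to N times, sleeping between
# attempts, until it succeeds.
# Usage: retry_command attempts delay_seconds command [args...]
retry_command() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}
```

For example, retry_command 5 30 /opt/apigee/apigee-service/bin/apigee-service apigee-monit summary would try the summary command up to five times, 30 seconds apart. Choose a delay that matches your conf_monit_monit_delay_time setting.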
Monitor apigee-monit
It is best practice to regularly check that apigee-monit is running on each node.
To check that apigee-monit is running, use the following command:
/opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor_monit
Apigee recommends that you issue this command periodically on each node that is running apigee-monit.
One way to do this is with a utility such as cron that executes scheduled tasks at pre-defined intervals.
To use cron to monitor apigee-monit:
- Add cron support by copying the apigee-monit.cron file to the /etc/cron.d directory, as the following example shows:
cp /opt/apigee/apigee-monit/cron/apigee-monit.cron /etc/cron.d/
- Open the apigee-monit.cron file to edit it.
The apigee-monit.cron file defines the cron job to execute as well as the frequency at which to execute that job. The following example shows the default values:
# Cron entry to check if monit process is running. If not start it
*/2 * * * * root /opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor_monit
This file uses the following syntax, in which the first five fields define the time at which cron executes the job, followed by the user that runs the job and the task itself:
min hour day_of_month month day_of_week user task_to_execute
For example, the default execution time is */2 * * * *, which instructs cron to check the apigee-monit process every 2 minutes.
You cannot execute a cron job more frequently than once per minute.
For more information on using cron, see your server OS's documentation or man pages.
- Change the cron settings to match your organization's policies. For example, to change the execution frequency to every 5 minutes, set the job definition to the following:
*/5 * * * * root /opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor_monit
- Save the apigee-monit.cron file.
- Repeat this procedure for each node in your cluster.
If cron does not begin watching apigee-monit, check that:
- There is a blank line after the cron job definition.
- There is only one cron job defined in the file. (Commented lines do not count.)
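These two conditions can be checked with a short script before you copy the file to each node. The following sketch is an interpretation rather than an official check: check_cron_file is a hypothetical name, it treats "a blank line after the job definition" as "the file ends with a newline character", and it counts every non-blank, non-comment line as a job.

```shell
# Hypothetical check for a cron.d file: exactly one active job line and a
# newline at the end of the file.
check_cron_file() {
    file="$1"
    # Count lines that are neither comments nor blank.
    job_count=$(grep -v '^[[:space:]]*#' "$file" | grep -c '[^[:space:]]' || true)
    if [ "$job_count" -ne 1 ]; then
        echo "expected 1 job definition, found $job_count" >&2
        return 1
    fi
    # tail -c 1 prints the last byte; command substitution strips a
    # trailing newline, so a non-empty result means the newline is missing.
    if [ -n "$(tail -c 1 "$file")" ]; then
        echo "file does not end with a newline" >&2
        return 1
    fi
}
```

For example, check_cron_file /etc/cron.d/apigee-monit.cron prints a message and returns a nonzero status if either condition is not met.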
If you want to stop or temporarily disable apigee-monit, you must disable this cron job, too; otherwise, cron will restart apigee-monit.
To disable the cron job, do one of the following:
- Delete the /etc/cron.d/apigee-monit.cron file:
sudo rm /etc/cron.d/apigee-monit.cron
You will have to re-copy it if you later want to re-enable cron to watch apigee-monit.
OR
- Edit the /etc/cron.d/apigee-monit.cron file and comment out the job definition by adding a "#" to the beginning of the line; for example:
# 10 * * * * root /opt/apigee/apigee-service/bin/apigee-service apigee-monit monitor_monit