Monitor system health#
About monitoring system health#
System administrators can use tools like Celery, Redis, and systemd to monitor Intelligence Center tasks, to check day-to-day operations, and to investigate issues.
Monitor EclecticIQ Intelligence Center to ensure normal operation, to research and identify the root cause of an issue, and to inspect the status of key Intelligence Center processes such as incoming and outgoing feeds, enrichers, ingestion queues, and tasks.
In the current context, monitoring covers on/off status only: the tasks and the commands described here enable verifying whether a task, a process, or a component is running or not.
Metrics and other types of measurements are outside the scope of the topic.
System administrators and DevOps engineers can run quick checks to inspect Intelligence Center operation, to identify issues and review errors, so that they can address them in a timely manner.
About root-level access#
To successfully execute commands in the command line or in the terminal, you may require root-level access rights.
Obtain root-level access by running sudo -i:
# Root-access login shell
sudo -i
To access resources as a different user than the currently active one, append -u:
# Grant the currently logged in user root-level access
sudo -i
# Open a login shell as a different user
sudo -i -u ${user_name}
# Run a command as a different user
sudo -i -u ${user_name} ${command} ${options}
Tools#
Celery | The task runner. It manages task execution and scheduling. |
Redis | The message broker. It handles background task processing by managing message queues based on the pub-sub pattern. |
systemd | The initialization system to bootstrap processes and start services. The process manager to start and stop processes. |
Core components#
Component | Address | Port |
---|---|---|
Monitoring#
Intelligence Center monitoring covers two main areas:
Services | |
---|---|
| The web application implementing the EclecticIQ Intelligence Center API and the API endpoints. The endpoints expose services that can be consumed by making API calls and by passing arguments. |
| Web server. |
| Email server. |
| TAXII server responsible for STIX data transport. |
| PostgreSQL (main database). |
| Redis (message broker). |
| Elasticsearch search and indexing database. |
| Generates dashboard graphs. |
| Log and data aggregation, data pipeline and funneling. |
| Neo4j graph database. |
| Gathers stats such as counters, timers, discovered entities, and so on, and sends aggregates to Kibana through Elasticsearch. |
Processes | |
---|---|
| Data funnel to the Neo4j graph database. Handles data updates for Neo4j. |
| Intel ingestion through feeds and enrichers. Consumes incoming data and saves it to PostgreSQL, Neo4j, and Elasticsearch. EclecticIQ Intelligence Center executes one such process per CPU core, and running tasks are numbered sequentially. For example, an Intelligence Center instance running on a quad-core machine normally executes 4 such processes. |
| Neo4j graph database batch processing application. It lives on the same server hosting the Neo4j database. It prepares data for ingestion into the Neo4j database. |
| Search indexer. Handles Elasticsearch data updates. |
| Celery-managed tasks such as enrichers, feed integrations, incoming feed data providers, and utilities. |
Feeds |
Incoming and outgoing feeds. |
Enrichers |
Enricher tasks. |
Celery tasks |
Other/Misc. Celery tasks. |
Monitor components with systemd#
Tool: systemd helps you inspect Intelligence Center components to verify if their services are running normally.
Use it to check the following components:
Component | Description | If it is not running… |
---|---|---|
| Elasticsearch search and indexing database. | No data searching and indexing capabilities are available. |
| Generates dashboard graphs. | The dashboard does not load correctly. The /kibana/ API endpoint returns an HTTP 502 error. |
| Log and data aggregation, data pipeline and funneling. | No data aggregation, deduplication, and normalization. |
| Neo4j graph database. | Graph data queries stop working. It is not possible to poll the graph database. |
| Neo4j graph database batch processing application. It lives on the same server hosting the Neo4j database. It pre-processes and queues data for ingestion into the Neo4j database. | Graph ingestion may slow down, and it may hang until it stops. |
| Web server. | No network connectivity to EclecticIQ Intelligence Center. Requests to the web server return an HTTP 500 error. |
| TAXII server responsible for STIX data transport. | It is not possible to send or to receive data through the TAXII transport protocol. |
| Email server. | No automatic email notifications. |
| PostgreSQL (main database). | It is not possible to access Intelligence Center data. |
| Redis (message broker). | Tasks and processes may hang and/or behave unexpectedly. |
statsite | Gathers stats such as counters, timers, discovered entities, and so on. It sends aggregates to Kibana through Elasticsearch. | No metrics about discovered entities, feed updates, and so on. |
systemctl is systemd’s command line interface utility.
The commands can optionally take options:
systemctl ${options} ${command} ${component_name}
For a complete list of supported commands and options, see the systemd documentation.
To obtain a list of all running services, run the following command(s):
systemctl
The response is displayed in the following format:
UNIT LOAD ACTIVE SUB DESCRIPTION
${service_name} ${loaded_or_not} ${active_or_not} ${running_or_not} ${description}
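Because each row starts with the unit name followed by its LOAD, ACTIVE, and SUB states, the listing can be filtered with standard text tools. The following is a minimal sketch, not an official Intelligence Center utility: the helper name not_running_units is hypothetical, and it assumes the rows come from systemctl list-units with the --no-legend and --plain options, so that no header, footer, or status bullet precedes the unit name.

```shell
# Hypothetical helper: read `systemctl list-units` rows on stdin and print
# the service units whose SUB state (4th column) is not "running".
# Output columns: unit name, ACTIVE state, SUB state.
not_running_units() {
    awk '$1 ~ /\.service$/ && $4 != "running" { print $1, $3, $4 }'
}

# Example invocation (requires systemd):
# systemctl list-units --type=service --all --no-legend --plain | not_running_units
```

Some services legitimately report a SUB state other than running, for example one-off units that exit after completing their work, so treat the output as a list of candidates to review rather than a list of failures.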
To verify if Nginx is running, run the following command(s):
systemctl status -l nginx.service
To verify if PostgreSQL is running, run the following command(s):
systemctl status -l postgresql-11.service
To verify if Redis is running, run the following command(s):
systemctl status -l redis.service
To verify if Logstash is running, run the following command(s):
systemctl status -l logstash.service
To verify if Elasticsearch is running, run the following command(s):
systemctl status -l elasticsearch.service
To retrieve status information about all these services at once, run the following command(s):
systemctl status -l nginx.service postgresql-11.service redis.service logstash.service elasticsearch.service
To retrieve status information about all the systemd-managed services whose name contains a specific search string, run the following command(s):
systemctl | grep "${search_string}"
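For a quick on/off overview, the individual status checks can be wrapped in a loop. The sketch below is an illustration under stated assumptions: the function name check_services and the SYSTEMCTL override variable are hypothetical (the override only exists so that the function can be exercised on a machine without systemd), and the unit names are the ones used in the commands in this section. It relies on systemctl is-active --quiet, which exits with status 0 only when the unit is active.

```shell
# Hypothetical override: set SYSTEMCTL to another command name to test
# the function without systemd. Defaults to the real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Unit names taken from the status commands in this section.
SERVICES="nginx.service postgresql-11.service redis.service logstash.service elasticsearch.service"

# Print "OK <unit>" or "DOWN <unit>" for each unit passed as an argument.
# Return 1 if at least one unit is not active, 0 otherwise.
check_services() {
    local unit failed
    failed=0
    for unit in "$@"; do
        if "$SYSTEMCTL" is-active --quiet "$unit"; then
            echo "OK $unit"
        else
            echo "DOWN $unit"
            failed=1
        fi
    done
    return "$failed"
}

# Example invocation (requires systemd); $SERVICES is left unquoted on
# purpose so that it splits into one argument per unit:
# check_services $SERVICES
```

The non-zero return value makes the function usable in scripts, for example to trigger a notification when any of the listed services is down.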
Monitor processes#
Monitor ingestion queues with Redis#
Tool: Redis acts as a message broker for Celery-managed tasks.
redis-cli is Redis’s command line interface utility:
# Launch redis-cli
redis-cli
> ${command} ${item_name}
For a complete list of supported commands and options, see the redis-cli command reference.
Within EclecticIQ Intelligence Center, Redis manages task queues.
The only command you are likely to need is llen, which returns the length of a queue.
To inspect EclecticIQ Intelligence Center data ingestion queue length:
# Launch redis-cli
redis-cli
> llen "queue:ingestion:inbound"
To inspect the graph database queue length:
> llen "queue:graph:inbound"
To inspect the Elasticsearch data update queue length:
> llen "queue:search:inbound"
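The three queue checks above can also be run non-interactively, because redis-cli accepts a command as arguments and prints the bare reply when its output is not a terminal. The following sketch is an illustration only: the function name report_queue_lengths, the REDIS_CLI override variable, and the warning threshold are hypothetical, and a suitable threshold depends on your deployment. The queue names are the ones listed in this section.

```shell
# Hypothetical override: set REDIS_CLI to another command name to test
# the function without a Redis server. Defaults to the real redis-cli.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Queue names taken from the llen examples in this section.
QUEUES="queue:ingestion:inbound queue:graph:inbound queue:search:inbound"

# Usage: report_queue_lengths <threshold> <queue>...
# Prints "OK <queue> <length>", "WARN <queue> <length>" when the length
# exceeds the threshold, or "ERR <queue>" when the reply is not a number
# (for example, when Redis is unreachable).
report_queue_lengths() {
    local threshold queue length
    threshold="$1"
    shift
    for queue in "$@"; do
        length="$("$REDIS_CLI" llen "$queue" 2>/dev/null)" || length=""
        case "$length" in
            ''|*[!0-9]*)
                echo "ERR $queue"
                continue
                ;;
        esac
        if [ "$length" -gt "$threshold" ]; then
            echo "WARN $queue $length"
        else
            echo "OK $queue $length"
        fi
    done
}

# Example invocation (requires a reachable Redis instance; 1000 is an
# arbitrary illustrative threshold):
# report_queue_lengths 1000 $QUEUES
```

A single reading says little on its own. Running the check a few times in a row shows whether a queue is draining or growing.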
Monitor running tasks with Celery#
Tool: Celery is the task runner that manages task execution and scheduling.
To use Celery to request task information, first set the following environment variable(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
Append Celery commands after the environment variable(s).
Celery commands have the following format:
celery -A ${module_name} ${command}
Ping Celery to see which workers are up and listening.
This is the easiest way to check running status. All active workers reply with pong.
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect ping
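To turn the ping replies into a single number, the output can be piped through a small counter. This is a minimal sketch: the helper name count_pongs is hypothetical, and it assumes that each responding worker prints one line containing the word pong, which may differ between Celery versions.

```shell
# Hypothetical helper: count the lines containing "pong" in the output
# of `celery inspect ping`, read from stdin.
# grep -c exits with status 1 when the count is 0, so the exit status
# is normalized with `|| true`; the count is still printed.
count_pongs() {
    grep -c 'pong' || true
}

# Example invocation (requires a running Intelligence Center):
# export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
# /opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect ping | count_pongs
```

Comparing the count with the number of workers you expect gives a quick indication of whether any of them stopped responding.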
To inspect active tasks, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect active
To inspect active task queues, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect active_queues
To inspect scheduled tasks, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect scheduled
To inspect overall task status, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app status
To request task statistics (exhaustive, but it can be verbose), run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect stats
(For further details, see the documentation on Celery ping, Celery workers, Celery worker statistics, and Celery monitoring.)
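Since every command in this section repeats the same environment variable, binary path, and module name, they can be grouped in a small wrapper. The sketch below is a convenience illustration, not part of the product: the function name celery_inspect_all and the CELERY override variable are hypothetical, while the settings path, the binary path, and the module name are the ones used in the commands above.

```shell
# Settings file, Celery binary, and module name as used in this section.
export EIQ_PLATFORM_SETTINGS="${EIQ_PLATFORM_SETTINGS:-/etc/eclecticiq/platform_settings.py}"
# Hypothetical override: set CELERY to another command name to test the
# function without a Celery installation.
CELERY="${CELERY:-/opt/eclecticiq/platform/api/bin/celery}"
CELERY_APP="eiq.platform.taskrunner.app"

# Usage: celery_inspect_all <inspect-subcommand>...
# Runs `celery inspect <subcommand>` for each argument, printing a header
# before each result and a FAILED line when a call exits with an error.
celery_inspect_all() {
    local subcommand
    for subcommand in "$@"; do
        echo "== inspect $subcommand =="
        "$CELERY" -A "$CELERY_APP" inspect "$subcommand" \
            || echo "FAILED inspect $subcommand"
    done
}

# Example invocation (requires a running Intelligence Center):
# celery_inspect_all ping active active_queues scheduled
```

The stats subcommand is left out of the example invocation because its output is verbose. Pass it explicitly when you need it.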