Monitor system health
About monitoring system health
System administrators can use tools like Celery, Redis, and systemd to monitor Intelligence Center tasks, to check day-to-day operations, and to investigate in case of issues.
Monitor the Intelligence Center to ensure normal operation, to research and identify the root cause of an issue, and to inspect the status of key Intelligence Center processes such as incoming and outgoing feeds, enrichers, ingestion queues, and tasks.
In the current context, monitoring covers on/off status only: the tasks and the commands described here enable verifying whether a task, a process, or a component is running or not.
Metrics and other types of measurements are outside the scope of the topic.
System administrators and DevOps engineers can run quick checks to inspect Intelligence Center operation, to identify issues and review errors, so that they can address them in a timely manner.
About root-level access
To successfully execute commands in the command line or in the terminal, you may require root-level access rights.
Obtain root-level access by running sudo -i:
# Root-access login shell
sudo -i
To access resources as a different user than the currently active one, append -u:
# Open a root login shell for the currently logged-in user
sudo -i
# Open a login shell as a different user
sudo -i -u ${user_name}
# Run a command as a different user
sudo -i -u ${user_name} ${command} ${options}
Tools
Celery | The task runner. It manages task execution and scheduling.
Redis | The message broker. It handles background task processing by managing message queues based on the pub-sub pattern.
systemd | The initialization system to bootstrap processes and start services. The process manager to start and stop processes.
Core components
Component | Address | Port
elasticsearch | localhost | 9200
kibana | localhost | 5601
logstash | localhost | 6755
neo4j | localhost | 7474, 7473
eclecticiq-neo4jbatcher | 127.0.0.1 | 4008
nginx | ${web_server_name} | 80, 443
opentaxii | 127.0.0.1 | 9000
platform | localhost | 8008
postfix | ${postfix_host_name} | 25, 587
postgresql | localhost | 5432
redis | localhost | 6379
statsite | 127.0.0.1 | 8125
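As a quick reachability check, the addresses and ports in the table can be probed in a loop. The following is a minimal sketch, not an official tool: it assumes that `nc` (netcat) with the `-z` option is installed, and the function name `check_ports` and the `NC` override variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: report whether components accept TCP connections.
# Assumptions (not from the product documentation): `nc -z` is available;
# check_ports and the NC override variable are illustrative names only.

check_ports() {
    # Reads "name host port" triples from stdin, one per line.
    while read -r name host port; do
        [ -n "$name" ] || continue
        # </dev/null keeps the probe from consuming the loop's input.
        if ${NC:-nc} -z "$host" "$port" </dev/null >/dev/null 2>&1; then
            echo "$name $host:$port open"
        else
            echo "$name $host:$port closed"
        fi
    done
}

# Usage, with values copied from the table:
# printf '%s\n' 'elasticsearch localhost 9200' 'redis localhost 6379' | check_ports
```

An open port only shows that something is listening on it; it does not confirm that the component behind it is healthy.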
Monitoring
Intelligence Center monitoring covers two main areas:
Services
platform-api | The web application implementing the Intelligence Center API and the API endpoints. The endpoints expose services that can be consumed by making API calls and by passing arguments.
nginx | Web server.
postfix | Email server.
opentaxii | TAXII server responsible for STIX data transport.
postgresql-11 | PostgreSQL (main database).
redis | Redis (message broker).
elasticsearch | Elasticsearch search and indexing database.
kibana | Generates dashboard graphs.
logstash | Log and data aggregation, data pipeline and funneling.
neo4j | Neo4j graph database.
statsite | Gathers stats such as counters, timers, discovered entities and so on, and sends aggregates to Kibana through Elasticsearch.
Processes
graph-ingestion | Data funnel to the Neo4j graph database. Handles data updates for Neo4j.
intel-ingestion | Intel ingestion through feeds and enrichers. Consumes incoming data and saves it to PostgreSQL, Neo4j, and Elasticsearch. The Intelligence Center executes one intel-ingestion process per processor core. Running processes are sequentially numbered starting from 0. For example, an Intelligence Center instance running on a quad-core machine normally executes 4 such processes, numbered from intel-ingestion:0 to intel-ingestion:3.
eclecticiq-neo4jbatcher | Neo4j graph database batch processing application. It lives on the same server hosting the Neo4j database. It prepares data for ingestion into the Neo4j database.
search-ingestion | Search indexer. Handles Elasticsearch data updates.
task | Celery-managed tasks such as enrichers, feed integrations, incoming feed data providers, and utilities.
Feeds | Incoming and outgoing feeds.
Enrichers | Enricher tasks.
Celery tasks | Other miscellaneous Celery tasks.
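The intel-ingestion numbering rule (one process per core, counted from 0) can be illustrated with a short sketch. The function name `expected_intel_ingestion` is hypothetical; it only prints the names you would expect to see for a given core count, and it does not query the Intelligence Center.

```shell
#!/bin/sh
# Sketch: print the intel-ingestion process names expected for a given
# number of processor cores (one process per core, numbered from 0).
# The function name is illustrative, not part of the product.

expected_intel_ingestion() {
    cores=$1
    i=0
    while [ "$i" -lt "$cores" ]; do
        echo "intel-ingestion:$i"
        i=$((i + 1))
    done
}

# Example for the current host, using the online core count:
# expected_intel_ingestion "$(getconf _NPROCESSORS_ONLN)"
```

Comparing this list with the processes that are actually running is one way to spot a missing ingestion worker.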
Monitor components with systemd
Tool: systemd helps you inspect Intelligence Center components to verify if their services are running normally.
Use it to check the following components:
Component | Description | If it is not running…
elasticsearch | Elasticsearch search and indexing database. | No data searching and indexing capabilities are available.
kibana | Generates dashboard graphs. | The dashboard does not load correctly. The /kibana/ API endpoint returns an HTTP 502 error.
logstash | Log and data aggregation, data pipeline and funneling. | No data aggregation, deduplication, and normalization.
neo4j | Neo4j graph database. | Graph data queries stop working. It is not possible to poll the graph database.
eclecticiq-neo4jbatcher | Neo4j graph database batch processing application. It lives on the same server hosting the Neo4j database. It pre-processes and queues data for ingestion into the Neo4j database. | Graph ingestion may slow down, and it may hang until it stops.
nginx | Web server. | No network connectivity to the Intelligence Center. Requests to the web server return an HTTP 500 error.
opentaxii | TAXII server responsible for STIX data transport. | It is not possible to send or to receive data through the TAXII transport protocol.
postfix | Email server. | No automatic email notifications.
postgresql-11 | PostgreSQL (main database). | It is not possible to access Intelligence Center data.
redis | Redis (message broker). | Tasks and processes may hang and/or behave unexpectedly.
statsite | Gathers stats such as counters, timers, discovered entities and so on. It sends aggregates to Kibana through Elasticsearch. | No metrics about discovered entities, feed updates, and so on.
systemctl is systemd’s command line interface utility.
The commands can optionally take options:
systemctl ${options} ${command} ${component_name}
For a complete list of supported commands and options, see the systemd documentation.
To list all loaded and active units, including running services, run the following command(s):
systemctl
The response is displayed in the following format:
UNIT LOAD ACTIVE SUB DESCRIPTION
${service_name} ${loaded_or_not} ${active_or_not} ${running_or_not} ${description}
To verify if Nginx is running, run the following command(s):
systemctl status -l nginx.service
To verify if PostgreSQL is running, run the following command(s):
systemctl status -l postgresql-11.service
To verify if Redis is running, run the following command(s):
systemctl status -l redis.service
To verify if Logstash is running, run the following command(s):
systemctl status -l logstash.service
To verify if Elasticsearch is running, run the following command(s):
systemctl status -l elasticsearch.service
To retrieve status information about all these services at once, run the following command(s):
systemctl status -l nginx.service postgresql-11.service redis.service logstash.service elasticsearch.service
To retrieve status information about all the systemd-managed services whose name contains a specific search string, run the following command(s):
systemctl | grep "${search_string}"
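For a compact on/off overview instead of the full status output, the checks above can be wrapped in a loop. This is a minimal sketch under assumptions: the function name `check_services` is hypothetical, and it relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active.

```shell
#!/bin/sh
# Sketch: print one line per service with its on/off state.
# `systemctl is-active --quiet` prints nothing and exits 0 only when
# the unit is active. The function name is illustrative.

check_services() {
    rc=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc running"
        else
            echo "$svc NOT running"
            rc=1
        fi
    done
    # Non-zero exit status if at least one service is not running.
    return "$rc"
}

# Usage:
# check_services nginx.service postgresql-11.service redis.service \
#     logstash.service elasticsearch.service
```

The non-zero return value makes the function usable in scripts or scheduled jobs that should alert when any service is down.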
Monitor processes
Monitor ingestion queues with Redis
Tool: Redis acts as a message broker for Celery-managed tasks.
redis-cli is Redis’s command line interface utility:
# Launch redis-cli
redis-cli
> ${command} ${item_name}
For a complete list of supported commands and options, see the redis-cli command reference.
Within the Intelligence Center, Redis manages task queues.
In most cases, the only command you need is llen, which returns the length of a queue.
To inspect the Intelligence Center data ingestion queue length:
# Launch redis-cli
redis-cli
> llen "queue:ingestion:inbound"
To inspect the graph database queue length:
> llen "queue:graph:inbound"
To inspect the Elasticsearch data update queue length:
> llen "queue:search:inbound"
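The three queue checks can also be run in one pass without opening an interactive session, because redis-cli executes a single command when it is passed as arguments. The sketch below is illustrative only: the function name `queue_lengths` and the `REDIS_CLI` override variable are hypothetical, and it assumes redis-cli can reach the local Redis instance with default connection settings.

```shell
#!/bin/sh
# Sketch: print the length of each ingestion queue in one pass.
# redis-cli runs one command non-interactively when the command is
# given as arguments. REDIS_CLI is a hypothetical override variable.

queue_lengths() {
    for queue in queue:ingestion:inbound queue:graph:inbound queue:search:inbound; do
        len=$(${REDIS_CLI:-redis-cli} llen "$queue") || return 1
        echo "$queue $len"
    done
}

# Usage:
# queue_lengths
```

A queue length that keeps growing between runs suggests that the corresponding ingestion process is not consuming data.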
Monitor running tasks with Celery
Tool: Celery is the task runner that manages task execution and scheduling.
To use Celery to request task information, set the following environment variable(s) before running Celery commands:
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
Run Celery commands after setting the environment variable(s).
Celery commands have the following format:
celery -A ${module_name} ${command}
Ping Celery to see which workers are up and listening.
This is the easiest way to check whether task workers are running. All active workers reply with pong.
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect ping
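The ping can be reduced to a yes/no result for use in scripts. The following sketch rests on an assumption you should verify against the Celery version you run: that the celery command exits with a non-zero status when no worker replies. The function name `celery_ping_ok` and the `CELERY_BIN` override variable are hypothetical; the binary path, settings file, and application module are the ones used on this page.

```shell
#!/bin/sh
# Sketch: turn `celery inspect ping` into an on/off check.
# Assumption: celery exits non-zero when no worker replies; confirm
# this for your Celery version. CELERY_BIN is a hypothetical override.

celery_ping_ok() {
    if EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py \
        "${CELERY_BIN:-/opt/eclecticiq/platform/api/bin/celery}" \
        -A eiq.platform.taskrunner.app inspect ping >/dev/null 2>&1; then
        echo "celery: workers replied"
    else
        echo "celery: no reply"
        return 1
    fi
}

# Usage:
# celery_ping_ok || echo "investigate the task runner"
```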
To inspect active tasks, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect active
To inspect active task queues, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect active_queues
To inspect scheduled tasks, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect scheduled
To inspect overall task status, run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app status
To request worker statistics (exhaustive, but the output can be verbose), run the following command(s):
export EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py
/opt/eclecticiq/platform/api/bin/celery -A eiq.platform.taskrunner.app inspect stats
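Because every command in this section repeats the same environment variable, binary path, and application module, a small wrapper can reduce typing. This is a convenience sketch, not part of the product: the function name `eiq_celery` and the `CELERY_BIN` override variable are hypothetical, while the path, settings file, and module name are copied from the commands above.

```shell
#!/bin/sh
# Sketch: wrap the repeated environment variable and binary path.
# eiq_celery and CELERY_BIN are hypothetical names; the settings file,
# binary path, and app module are the ones used on this page.

eiq_celery() {
    # The variable is set only for this one celery invocation.
    EIQ_PLATFORM_SETTINGS=/etc/eclecticiq/platform_settings.py \
        "${CELERY_BIN:-/opt/eclecticiq/platform/api/bin/celery}" \
        -A eiq.platform.taskrunner.app "$@"
}

# Usage:
# eiq_celery inspect ping
# eiq_celery inspect active_queues
# eiq_celery status
```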
For further details, see the documentation on Celery ping, Celery workers, Celery worker statistics, and Celery monitoring.