Ingestion for sysadmins CentOS
A technical reference overview of the backend components of the platform responsible for ingesting incoming data, processing it, and storing it in the main data store, the indexing and search database, and the graph database.
Ingestion
Step | Service or process | Actions
1 | Incoming feed workers | Each incoming feed worker:
2 | Incoming feed workers | Each incoming feed worker:
3 | Ingestion workers | Each incoming feed worker:
4 | Ingestion workers | The default action to execute is synchronous. The failover action is asynchronous. Each ingestion worker:
5 | Ingestion workers | The default action to execute is synchronous. The failover action is asynchronous. Each ingestion worker:
6 | Graph ingestion worker |
7 | Neo4j | Executes the query and loads the CSV data for graph ingestion and indexing. Query example:
Core components
For more information about the eiq-platform command line tool, see eiq-platform command line.
eclecticiq-platform-backend-workers.service (sourced from EIQ platform-backend)
Component | Description
platform-api | platform-api is the main Python REST API that enables communication with the platform and its components. It is based on the Flask web application framework. There is only one running instance of the platform-api process. The instance uses the Gunicorn WSGI HTTP server to exchange data through port 8008 with the Nginx web server, which acts as a proxy. platform-api exchanges data with PostgreSQL, Neo4j, Elasticsearch, and Redis.
intel-ingestion | eclecticiq-platform-backend-ingestion drives the ingestion processing pipeline. The default configuration spawns 4 eclecticiq-platform-backend-ingestion workers, corresponding to the number of internal threads managing concurrent user requests. eclecticiq-platform-backend-ingestion exchanges data with PostgreSQL, Neo4j, Elasticsearch, and Redis. Data traffic depends on the number of incoming packages that are queued up for processing. You can increase or decrease the number of concurrent workers with systemctl commands.
    To decrease the active workers from the default 4 to 2:
        systemctl disable eclecticiq-platform-backend-ingestion@{3,4}
        systemctl stop eclecticiq-platform-backend-ingestion@{3,4}
    To increase the active workers from the default 4 to 6:
        systemctl enable eclecticiq-platform-backend-ingestion@{1..6}
        systemctl start eclecticiq-platform-backend-ingestion@{1..6}
    To restart the default workers:
        systemctl restart eclecticiq-platform-backend-ingestion
    To run the command manually:
        eiq-platform ingestion run
    Redis acts as a message broker:
graph-ingestion | eclecticiq-platform-backend-graphindex drives data ingestion and indexing to the graph database. There is one running instance of eclecticiq-platform-backend-graphindex. It exchanges data with Neo4j and Redis.
    To restart the worker:
        systemctl restart eclecticiq-platform-backend-graphindex
    To run the command manually:
        eiq-platform graph run-indexer
    Redis acts as a message broker:
search-ingestion | eclecticiq-platform-backend-searchindex drives data indexing to the indexing and search database. There is one running instance of eclecticiq-platform-backend-searchindex. It exchanges data with Elasticsearch and Redis.
    To restart the worker:
        systemctl restart eclecticiq-platform-backend-searchindex
    To run the command manually:
        eiq-platform search run-indexer
    Redis acts as a message broker:
Celery workers | There are several Celery workers running concurrently. They execute tasks related to processes such as ingestion, dissemination, and discovery. They also manage execution priority for rules, data retention policies, and enrichers. Celery queues manage workload and task distribution for the workers:
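The worker scaling commands described for intel-ingestion rely on shell brace expansion, which only accepts literal numbers. When the target count comes from a variable, a small helper can generate the same commands. The following is a minimal dry-run sketch, not part of the platform: the function name plan_ingestion_workers is hypothetical, and it assumes that worker instances are numbered from 1, as the @{1..6} example suggests. It only prints commands and runs nothing.

```shell
# Dry-run sketch: print the systemctl commands for a worker-count change.
# Assumption: instances are numbered from 1, as in the @{1..6} example.
plan_ingestion_workers() {
    local current target unit i
    current=$1
    target=$2
    unit=eclecticiq-platform-backend-ingestion
    if [ "$target" -gt "$current" ]; then
        # Scale up: enable and start instances 1..target, like @{1..6}.
        for i in $(seq 1 "$target"); do
            echo "systemctl enable ${unit}@${i}"
            echo "systemctl start ${unit}@${i}"
        done
    elif [ "$target" -lt "$current" ]; then
        # Scale down: disable and stop the instances above the target.
        for i in $(seq "$((target + 1))" "$current"); do
            echo "systemctl disable ${unit}@${i}"
            echo "systemctl stop ${unit}@${i}"
        done
    fi
}

# Example: plan the documented change from 4 workers to 2.
plan_ingestion_workers 4 2
```

Printing the plan first makes it possible to review the affected instances before applying anything. If the output looks right, the printed lines can be run as root, one by one or by piping them to a shell.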
Core dependencies
Nginx
Redis
PostgreSQL
Neo4j
Elasticsearch
Logstash
Kibana
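Because the core components exchange data with these dependencies, it can help to confirm that each one is running before troubleshooting ingestion. The following is a minimal sketch that uses systemctl is-active. The helper name report_units is hypothetical, and the unit names in the example call are assumptions: actual unit names depend on how each dependency was packaged and installed on the host.

```shell
# Report the systemd state of each unit name passed as an argument.
# `systemctl is-active` prints one word per unit, such as "active" or "inactive".
report_units() {
    local unit state
    for unit in "$@"; do
        # is-active exits non-zero for units that are not active, so the
        # exit status is ignored and only the printed state is kept.
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        # Empty output means systemctl was unavailable or printed nothing.
        printf '%-16s %s\n' "$unit" "${state:-unknown}"
    done
}

# Assumed unit names: adjust them to the names used on your host.
report_units nginx redis postgresql neo4j elasticsearch logstash kibana
```

The check is read-only, so it is safe to run on a production host.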