Update the settings
The /etc/eclecticiq/platform_settings.py file allows you to set advanced configuration parameters for your Intelligence Center instance.
Once you have installed the Intelligence Center, review the /etc/eclecticiq/platform_settings.py file to make sure the parameters suit your environment.
For changes to platform_settings.py to take effect, you must restart the Intelligence Center services. See Restart services for changes to take effect.
Restart services for changes to take effect
After making changes to platform_settings.py, you must restart the backend services for your changes to take effect.
To restart the services, run as root:
systemctl restart eclecticiq-platform-backend-services
Secrets
Set secret key and session token
When you install, update or reinstall the Intelligence Center, change the following default values in the platform_settings.py configuration file:
Attribute name | Default value | Description
SECRET_KEY | '' | Required. Must be set for the Intelligence Center to start. Set this to a random string at least 32 characters long, enclosed within quotes ("" or '').
You can use /dev/urandom to generate such a random string. Run:
cat /dev/urandom | tr -dc a-zA-Z0-9[:punct:] | fold -w 32 | head -n 1
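For example, a minimal sketch of what the resulting entry in platform_settings.py might look like (the key shown here is a placeholder, not a value to reuse):

# Placeholder value; replace with your own randomly generated string of at least 32 characters
SECRET_KEY = "Xk7#pQ2vL9zR4mW8nB3tY6uJ1sD5fG0h"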
Set reset password link expiration
Attribute name | Default value | Description
ONE_TIME_PASSWORD_EXPIRATION_MINUTES | 60 | Time, in minutes, before the password reset link expires.
To reset their password, users click Reset password on the sign-in page.
The Intelligence Center then sends them an automatic email message with a link to a password reset page, where they can complete the operation.
By default, the password reset link in the automatic email expires 60 minutes after sending the message.
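For example, to extend the expiration to two hours, you might add the following to platform_settings.py (the value is illustrative):

# Password reset links expire after 2 hours instead of the default 60 minutes
ONE_TIME_PASSWORD_EXPIRATION_MINUTES = 120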
Data retention and policies
Enable observable actions in policies
Data retention policies can include a Delete observables action that runs when the policy runs.
This action is disabled by default from 2.12.0. Policies skip any Delete observables actions that are set, while the rest of the policy runs as configured.
Attribute name | Default value | Description
DISABLE_OBSERVABLE_RETENTION_POLICIES | True | (Not recommended) Set to False to allow policies to run Delete observables actions. May cause high resource consumption.
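If you accept the resource cost and want policies to run Delete observables actions, the entry in platform_settings.py might look like this (not recommended; shown for illustration only):

# Allow Delete observables actions to run; may cause high resource consumption
DISABLE_OBSERVABLE_RETENTION_POLICIES = False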
Prune old records in database
You can configure the IC to remove old records from the blob and content_block tables in the PostgreSQL database after a given number of days.
The blob table stores raw data downloaded by incoming feeds that is subsequently processed.
The content_block table stores data that is distributed through outgoing feeds.
By default, the data in these tables is kept forever.
To set the IC to delete records from these tables after a given number of days, set the following parameters:
Attribute name | Default value | Description
CONTENT_BLOCK_RETENTION | None | Number of days to retain records in the content_block table.
BLOB_RETENTION | None | Number of days to retain records in the blob table.
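For example, to keep raw incoming data for 30 days and outgoing content blocks for 90 days, you might set the following (both values are illustrative):

# Days to keep records in the blob table (raw data from incoming feeds)
BLOB_RETENTION = 30
# Days to keep records in the content_block table (data for outgoing feeds)
CONTENT_BLOCK_RETENTION = 90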
Change data retention period of logs and metrics indices
These Elasticsearch indices are periodically pruned:
logstash*: Aggregates event log data.
statsite*: Aggregates metrics to monitor system health and performance.
Set how long data in these indices is kept by configuring these parameters:
Parameter | Type | Description
LOG_RETENTION | int | Number of days Elasticsearch stores Logstash log entries before deleting them from the corresponding log indices. Default value: 365 (1 year)
METRIC_RETENTION | int | Number of days Elasticsearch stores Statsite metrics data entries before deleting them from the corresponding metrics indices. Default value: 730 (2 years)
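For example, to keep logs for 90 days and metrics for one year, the settings might look like this (values are illustrative):

# Days to keep logstash* (event log) indices
LOG_RETENTION = 90
# Days to keep statsite* (metrics) indices
METRIC_RETENTION = 365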
If you find that the eiq.utilities.prune_indices task is failing with a timeout, you can extend the task’s time limit by changing the CELERY_X_EIQ_TIME_LIMIT_PER_TASK["eiq.utilities.prune_indices"] parameter:
# Time limits for specific tasks (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_TASK = {
    # ...
    "eiq.utilities.prune_indices": 4 * 60 * 60,  # 4 hours
    # ...
}
Audit trail logs
Audit trail log levels are configured with AUDIT_TRAIL_LOG_LEVEL. See Audit trail logs.
Limit the number of days audit trail logs are kept by setting AUDIT_TRAIL_RETENTION. By default, this is set to 360 days (AUDIT_TRAIL_RETENTION = 360).
AUDIT_TRAIL_ENABLED is deprecated from 2.13.0 onward. Use AUDIT_TRAIL_LOG_LEVEL instead.
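A minimal sketch of what these settings might look like in platform_settings.py (the log level value shown is an assumption; see Audit trail logs for the levels your version supports):

# Assumed example log level; see Audit trail logs for supported values
AUDIT_TRAIL_LOG_LEVEL = "INFO"
# Keep audit trail logs for 180 days instead of the default 360
AUDIT_TRAIL_RETENTION = 180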
Data directories
Set the data directory for eclecticiq-neo4jbatcher
The Intelligence Center settings must include information to locate the data that the eclecticiq-neo4jbatcher service processes and queues up for ingestion by the Neo4j database. eclecticiq-neo4jbatcher prepares the data and temporarily stores it as CSV files.
Set these parameters in platform_settings.py:
Attribute name | Default value | Description
NEO4J_BATCHING_URL | "" | URL that the eclecticiq-neo4jbatcher service can be accessed at.
GRAPH_INGESTION_SPOOL_DIRECTORY | "" | Location where the service can store temporary files. This must be set in two places.
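A minimal sketch, assuming the batcher runs locally and spools to a directory under /var/lib (both the address and the path are assumptions; use the locations that apply to your deployment):

# Assumed local address for the eclecticiq-neo4jbatcher service
NEO4J_BATCHING_URL = "http://127.0.0.1:5622"
# Assumed spool directory for the temporary CSV files the batcher prepares
GRAPH_INGESTION_SPOOL_DIRECTORY = "/var/lib/eclecticiq/spool"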
Allow list for feed mount points
To use directories on the Intelligence Center host for incoming and outgoing feeds (such as with the “Mount point download” and “Mount point upload” transport types), you must explicitly set the directories the transport type is allowed to use.
Set these parameters:
Attribute name | Default value | Description
MOUNT_POINT_POLL_ALLOWED_DIRECTORIES | [] | Python list of allowed paths on the Intelligence Center host that incoming feed transport types can use. Example value: ["/mnt/", "/media/", "/media/data/"]
MOUNT_POINT_PUSH_ALLOWED_DIRECTORIES | [] | Python list of allowed paths on the Intelligence Center host that outgoing feed transport types can use. Example value: ["/mnt/", "/media/", "/media/data/"]
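For example, to allow feeds to read from and write to directories under /mnt and /media, you might set (paths are illustrative):

# Paths that Mount point download (incoming) transports may read from
MOUNT_POINT_POLL_ALLOWED_DIRECTORIES = ["/mnt/", "/media/"]
# Paths that Mount point upload (outgoing) transports may write to
MOUNT_POINT_PUSH_ALLOWED_DIRECTORIES = ["/mnt/", "/media/"]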
Services
TAXII 2.1 API root service
For more information about the TAXII 2.1 API root service, see Outgoing feed - TAXII 2.1 poll.
Configure the TAXII 2.1 API root service (/taxii2/api_root/). As root:
Edit /etc/eclecticiq/platform_settings.py.
Add or change the TAXII2_API_ROOT attribute.
This table describes the possible keys and values:
Attribute name | Default | Description
TAXII2_API_ROOT | TAXII2_API_ROOT = {"title": "...", "description": "...", "is_public": True} | Attribute that configures the TAXII 2 API root. Keys in this dictionary are described in the rows below.
TAXII2_API_ROOT["title"] | "EIQ TAXII 2.1 api root" | Title assigned to the API root.
TAXII2_API_ROOT["description"] | "The EIQ TAXII 2.1 api root for passive outgoing feeds" | Description assigned to the API root.
TAXII2_API_ROOT["is_public"] | True | (Recommended) Set to False to restrict access to the /taxii2/api_root/ and /taxii2/api_root/collections/ endpoints. Users then need to authenticate by sending their API key as a Bearer token, or by using Basic authentication.
Save platform_settings.py.
Restart the OpenTaxii service:
systemctl restart eclecticiq-platform-backend-opentaxii
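Putting the steps together, a complete TAXII2_API_ROOT entry in platform_settings.py might look like this (the title and description are illustrative, and is_public is set to False here to follow the recommendation above):

TAXII2_API_ROOT = {
    "title": "EIQ TAXII 2.1 api root",
    "description": "The EIQ TAXII 2.1 api root for passive outgoing feeds",
    # Require authentication for the API root and collections endpoints
    "is_public": False,
}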
Elasticsearch: specify data nodes
The IC needs to know the network location of your Elasticsearch data nodes in order to query the Elasticsearch cluster.
Attribute name | Default value | Description
SEARCH_URLS | [] | List. One or more network addresses to reach Elasticsearch data nodes at. Example value: ["https://127.0.0.1:9200", "https://127.0.0.1:9201", "https://es-node-3.example.com:9200"]
DEPRECATED: IC 2.11 and older use the SEARCH_URL attribute instead, which only takes a single string.
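For example, a three-node cluster might be configured as follows (the addresses are illustrative):

# Elasticsearch data nodes the IC queries
SEARCH_URLS = [
    "https://127.0.0.1:9200",
    "https://127.0.0.1:9201",
    "https://es-node-3.example.com:9200",
]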
New graph API and Neo4J
Release 2.12.0 switches over to a new graph API, and deprecates use of the Neo4j graph database.
By default, the IC continues to require and write to Neo4j, allowing users to fall back to Neo4j should they need to.
You can enable/disable the new graph API with the following attributes:
Attribute name | Default value | Description
NEW_GRAPH_NEIGHBORHOOD_API | True | Set to True to enable the new graph API.
NEW_GRAPH_API | True | Set to True to enable the new graph API.
DISABLE_NEO4J | False | Set to False by default: the IC continues writing to Neo4j even when it is not using it as its graph database, which allows users to switch between the deprecated Neo4j graph API and the new graph API. Set to True to stop writing to the Neo4j database. Setting this to True is irreversible: once you set it to True and restart services for changes to take effect, the Neo4j database falls out of sync. Do not set it back to False without re-syncing the Neo4j graph database.
(Not recommended) If users want to switch to the new graph API completely and remove Neo4j from their systems, they must first do the following:
Set the following attributes in platform_settings.py:
NEW_GRAPH_NEIGHBORHOOD_API = True
NEW_GRAPH_API = True
DISABLE_NEO4J = True
They can then remove Neo4j from their systems.
Enable Statsite in the Intelligence Center
To enable the statsite service, set the following parameters:
Attribute name | Default value | Description
STATSD_ENABLED | True | Boolean. Set to False to disable the service.
STATSD_HOST | "127.0.0.1" | IP address or fully qualified domain name the statsite service can be accessed at.
STATSD_PORT | 8125 | Port the statsite service is bound to on STATSD_HOST.
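For example, to point the Intelligence Center at a statsite service running on another host, you might set (the hostname is illustrative):

STATSD_ENABLED = True
# Illustrative hostname for the statsite service
STATSD_HOST = "metrics.example.com"
STATSD_PORT = 8125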
Set expected Celery host count for health checks
CELERY_X_EIQ_HOSTS sets the number of Celery hosts that the Intelligence Center should expect.
Attribute name | Default value | Description
CELERY_X_EIQ_HOSTS | 1 | Number of hosts running the Celery service.
When pinging Celery to check its health status, the Intelligence Center expects to receive as many replies as the integer value assigned to CELERY_X_EIQ_HOSTS.
If you set the value to a number that is lower than the actual number of hosts running Celery queues, the health status check for Celery fails and is highlighted in red in the Intelligence Center system health pane in the GUI.
If you set the value to a number that is greater than the actual number of hosts running Celery queues, the health status check for Celery may take a long time, because the Intelligence Center keeps pinging and waiting for responses from non-existent hosts before it gives up.
Recommended values:
For environments with a predefined, static amount of Celery hosts, the default value of 1 is a good starting point.
It means that the Intelligence Center instance expects there to be one host running Celery queues.
For environments with a dynamic amount of Celery hosts such as Kubernetes deployments, set the value to 0 to fall back on standard timeouts.
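For example, a static deployment with three dedicated Celery hosts might use the following (the count is illustrative):

# Expect ping replies from three Celery hosts during health checks
CELERY_X_EIQ_HOSTS = 3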
To inspect how long it takes to run Celery checks, including health checks, from the command line run eiq-platform diagnose run.
For more information about diagnostic tests, see Run diagnostic tests.
For more information about the command line tool, see eiq-platform command line.
Ingestion and enrichment
Set package size limits
The platform_settings.py configuration file includes settings that limit the file sizes of packages handled by:
Incoming feeds
Outgoing feeds
Manual uploads
The hard limit for all file sizes is 100MB, or 100 * 1024 * 1024.
Changing these values may have a negative impact on performance.
Setting large size limits may lead to issues where tasks time out before they can finish processing a file, or to an ingestion pipeline that is slowed down by tasks that have to process large files.
Attribute name | Default value | Description
MAX_BLOB_SIZE | 20 * 1024 * 1024 | Maximum allowed file size for packages ingested through incoming feeds, or published through outgoing feeds. The value is expressed in bytes, and can be a number or an expression that evaluates to a number. For example, the default value is the expression 20 * 1024 * 1024, which evaluates to 20MB.
MAX_UPLOADED_BLOB_SIZE | MAX_BLOB_SIZE / 2 | Maximum allowed file size for files uploaded through Manual uploads. The value is expressed in bytes, and can be a number or an expression that evaluates to a number. We recommend leaving it at the default of MAX_BLOB_SIZE / 2, or half the maximum allowed file size set for MAX_BLOB_SIZE.
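For example, to raise the feed package limit to 50MB while keeping manual uploads at half that size, you might set the following (values are illustrative and must stay below the 100MB hard limit):

# 50MB limit for incoming and outgoing feed packages
MAX_BLOB_SIZE = 50 * 1024 * 1024
# 25MB limit for manual uploads (half of MAX_BLOB_SIZE, as recommended)
MAX_UPLOADED_BLOB_SIZE = MAX_BLOB_SIZE / 2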
Tune timeout limits for ingestion tasks
Avoid changing the default values for these parameters.
Changing these values may have a negative impact on performance.
If you are encountering timeouts when running certain tasks, you may want to adjust the default Celery task time limits in the CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY parameter:
# Time limits per task family (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY = {
    "eiq.enrichers": 2 * 60,  # 2 minutes
    "eiq.incoming-transports": 8 * 60 * 60,  # 8 hours
    "eiq.outgoing-transports": 2 * 60,  # 2 minutes
}
Parameter | Type | Description
eiq.enrichers | int | Time limit in seconds for each enrichment task run.
eiq.incoming-transports | int | Time limit in seconds for each incoming feed run.
eiq.outgoing-transports | int | Time limit in seconds for each outgoing feed run.
Tune the ingestion scheduler
Avoid changing the default values for these parameters.
Changing these values may have a negative impact on performance.
Intelligence Center ingestion attempts to optimize system resources to ingest data as quickly and as efficiently as possible. It features jobs to monitor feed activity, to prioritize feeds, to distribute and rotate feed priority, as well as a ranking system to periodically assess and reassign priority to incoming feeds. The score decays over time in a similar way to entity half-life. This way, the Intelligence Center distributes the ingestion workload to minimize idle time and bottlenecks.
Tune the ingestion scheduler by changing values in the INGESTION_QUUZ_SCHEDULER_OPTIONS parameter.
INGESTION_QUUZ_SCHEDULER_OPTIONS = {
    "prefer_full_batches": True,
    "score_half_time": 24 * 60 * 60,
    "sweep_same_depth_max_jobs": 500,
    "sweep_same_depth_max_enqueued_jobs": 1000,
    "sweep_same_depth_max_time_total": 60,
    "sweep_same_depth_max_time_between": 30,
}
Job depth: The “job depth” is the number of hops along a chain of dependencies: if Job A depends on Job B, and Job B depends on Job C, then Job C has a depth of 2.
Jobs with the “higher” depth are more likely, but not guaranteed, to be scheduled to run first: if Job A has a depth of 1, and Job B has a depth of 10, then Job B is more likely to run before Job A.
If one or more jobs depend on the same job, then they have the same “job depth”: if Job A depends on Job B and Job C, then both Job C and Job B have a depth of 1.
The depth of a job is ‘lifted’ when all its dependencies finish running: if Job B depends on Job A, and Job A finishes running, then Job B has a depth of 0.
Parameter | Type | Description
prefer_full_batches | boolean | True by default. Allows the scheduler to continue with lower-depth jobs after executing a partial batch, so it can distribute its time across jobs.
score_half_time | int | Time limit in seconds. Ingestion tasks are assigned a priority score that decays over time, so that ingestion workers can distribute their time fairly across all running tasks. This parameter sets the time taken for a task’s priority score to be reduced by half (a half-life).
sweep_same_depth_max_jobs | int | Maximum number of concurrent jobs at a given job depth before ingestion workers can move on to jobs with the highest depth value. If no jobs are available at the current depth level, the scheduler moves one level up and searches there for jobs that it can execute.
sweep_same_depth_max_enqueued_jobs | int | Similar to sweep_same_depth_max_jobs. Sets the maximum number of waiting jobs that can be added to a queue at a given depth level before an ingestion worker can move on to jobs with the highest depth value.
sweep_same_depth_max_time_total | int | Time limit in seconds. Sets the maximum amount of time workers can spend searching for jobs to execute at a given depth level before they move on to jobs with the highest depth value. In practice, this is how long ingestion workers have for transforming packages before moving on to process the corresponding entities.
sweep_same_depth_max_time_between | int | Time limit in seconds. Limits the time spent looking for the next job to run at a given depth. If the scheduler takes longer than the set time to find another job at this depth to run, it resets and tries again. Meanwhile, depths assigned to existing jobs may have changed, allowing the scheduler to sweep through a fresh list of jobs at that depth.
Tune task batch size in the ingestion pipeline
Avoid changing the default values for these parameters.
Changing these values may have a negative impact on performance.
Intelligence Center ingestion tasks run and are processed in batches to optimize system resources. The ingestion process works like a pipeline, where different tasks perform specific actions.
Based on system resources and the type of content being ingested, you may want to increase or decrease the batch size to improve overall ingestion performance.
Change the values in the INGESTION_TASK_BATCH_SIZES parameter to control task batch sizes:
# Batch sizes used for ingestion tasks.
INGESTION_TASK_BATCH_SIZES = {
    "ingest_blob_task": 100,
    "index_extracts_task": 1000,
    "graph_synchronize_enrichment_task": 1000,
    "search_synchronize_entity_task": 1000,
    "graph_synchronize_entity_task": 1000,
}
Parameter | Type | Description
ingest_blob_task | int | Sets the maximum number of concurrent tasks that initiate ingesting packages from incoming feeds and enrichers. Default limit: 100
index_extracts_task | int | Sets the maximum number of concurrent tasks that index ingested observables. Default limit: 1000
graph_synchronize_enrichment_task | int | Sets the maximum number of concurrent tasks that sync ingested and indexed enrichments between PostgreSQL and Neo4j. Default limit: 1000
search_synchronize_entity_task | int | Sets the maximum number of concurrent tasks that sync ingested entities between PostgreSQL and Elasticsearch. Default limit: 1000
graph_synchronize_entity_task | int | Sets the maximum number of concurrent tasks that sync ingested and indexed entities between PostgreSQL and Neo4j. Default limit: 1000
Automatically disable enricher after continuously failing
By default, enrichers can fail up to ten times in a row before they are automatically disabled.
Change this by setting these parameters:
Attribute name | Default value | Description
ENRICHER_FAILURES_TO_DISABLE | 10 | Number of times an enricher can fail in a row before it is automatically disabled.
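For example, to disable a misbehaving enricher sooner, you might lower the threshold (the value is illustrative):

# Disable an enricher after 5 consecutive failures instead of the default 10
ENRICHER_FAILURES_TO_DISABLE = 5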
Miscellaneous
Clean up Intelligence Center notifications
Changing default values for parameters listed in this section may cause the Intelligence Center to behave unexpectedly.
Set the maximum number of notifications retained by the Intelligence Center for each user.
Attribute name | Default value | Description
NOTIFICATIONS_CLEAN_UP_EVERY | 1000 | Maximum number of notifications retained for each user. Once this number is reached, older notifications are discarded.
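For example, to keep fewer notifications per user, you might set (the value is illustrative):

# Keep at most 500 notifications per user; older notifications are discarded
NOTIFICATIONS_CLEAN_UP_EVERY = 500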
Sample platform_settings.py file
The following example serves as a guideline: