Update the settings

The /etc/eclecticiq/platform_settings.py file allows you to set advanced configuration parameters for your Intelligence Center instance.

Once you’ve installed the Intelligence Center, review the /etc/eclecticiq/platform_settings.py file to make sure the parameters suit your environment.

For changes to platform_settings.py to take effect, you must restart Intelligence Center services. See Restart services for changes to take effect.

Restart services for changes to take effect

After making changes to platform_settings.py, you must restart the backend services for your changes to take effect.

To restart the services, run as root:

systemctl restart eclecticiq-platform-backend-services

Secrets

Set secret key and session token

When you install, update or reinstall the Intelligence Center, change the following default values in the platform_settings.py configuration file:

Attribute name

Default value

Description

SECRET_KEY

''

Required. Must be set for the Intelligence Center to start.

Set this to a random string at least 32 characters long, enclosed within quotes ("" or '').

You can use /dev/urandom to generate such a random string. Run:

cat /dev/urandom | tr -dc 'a-zA-Z0-9[:punct:]' | fold -w 32 | head -n 1
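As an alternative to the shell pipeline, a value of the same kind can be produced with Python's standard secrets module. This is a minimal sketch and not part of the Intelligence Center tooling: the function name generate_secret_key is hypothetical, and excluding quote and backslash characters (so the result can be pasted between quotes without escaping) is an assumption made for illustration.

```python
import secrets
import string

# Letters, digits and punctuation, minus characters that would need
# escaping inside a quoted Python string (an illustrative choice).
ALPHABET = "".join(
    ch
    for ch in string.ascii_letters + string.digits + string.punctuation
    if ch not in "\"'\\"
)


def generate_secret_key(length: int = 32) -> str:
    """Return a random string of `length` characters drawn from ALPHABET."""
    if length < 32:
        raise ValueError("SECRET_KEY must be at least 32 characters long")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Print a line that can be copied into platform_settings.py.
print(f'SECRET_KEY = "{generate_secret_key()}"')
```

The secrets module draws from the operating system's cryptographically secure random source, which on Linux is the same source that backs /dev/urandom.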

Set reset password link expiration

Attribute name

Default value

Description

ONE_TIME_PASSWORD_EXPIRATION_MINUTES

60

Time for password reset link to expire, in minutes.

To reset or change their password, users click Reset password on the sign-in page.

The Intelligence Center then sends them an automatic email message with a link to a password reset page, where they can complete the operation.

By default, the password reset link in the automatic email expires 60 minutes after sending the message.

Data retention and policies

Enable observable actions in policies

Data retention policies allow you to set policies that perform a Delete observables action when they run.

From release 2.12.0, this action is disabled by default. Policies skip any Delete observables actions that are set, while the rest of the policy runs as configured.

Attribute name

Default value

Description

DISABLE_OBSERVABLE_RETENTION_POLICIES

True

(Not recommended) Set to False to allow policies to run Delete observables actions.

May cause high resource consumption.

Prune old records in database

You can configure the IC to remove old records from the blob and content_block tables in the PostgreSQL database after a given number of days.

  • The blob table stores raw data downloaded by incoming feeds that is subsequently processed.

  • The content_block table stores data that is distributed through outgoing feeds.

By default, the data in these tables is kept forever.

To set the IC to delete records from these tables after a given number of days, set the following parameters:

Attribute name

Default value

Description

CONTENT_BLOCK_RETENTION

None

Number of days to retain records in the content_block table.

BLOB_RETENTION

None

Number of days to retain records in the blob table.
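The fragment below sketches what such a configuration could look like. The values 90 and 30 are hypothetical, and the helper retention_cutoff is not part of the Intelligence Center: it only illustrates how a retention period in days translates into a cutoff timestamp, assuming records older than the cutoff are the ones eligible for removal.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical values for platform_settings.py; None keeps records forever.
CONTENT_BLOCK_RETENTION = 90  # days
BLOB_RETENTION = 30  # days


def retention_cutoff(
    now: datetime, retention_days: Optional[int]
) -> Optional[datetime]:
    """Return the timestamp before which records count as old.

    Returns None when retention is disabled (records are kept forever).
    """
    if retention_days is None:
        return None
    return now - timedelta(days=retention_days)
```

For example, with BLOB_RETENTION set to 30 and a current date of 31 January 2024, the cutoff is 1 January 2024.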

Change data retention period of logs and metrics indices

These Elasticsearch indices are periodically pruned:

  • logstash*: Aggregates event log data.

  • statsite*: Aggregates metrics to monitor system health and performance.

Set how long data in these indices is kept by configuring these parameters:

Parameter

Type

Description

LOG_RETENTION

int

The value represents days.

Sets the total number of days Elasticsearch stores Logstash log entries, before deleting them from the corresponding log indices.

Default value: 365 (1 year)

METRIC_RETENTION

int

The value represents days.

Sets the total number of days Elasticsearch stores Statsite metrics data entries, before deleting them from the corresponding metrics indices.

Default value: 730 (2 years)

If you find that the eiq.utilities.prune_indices task is failing with a timeout, you can extend the task’s time limit by changing the CELERY_X_EIQ_TIME_LIMIT_PER_TASK["eiq.utilities.prune_indices"] parameter:

# Time limits for specific tasks (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_TASK = {
    # ...
    "eiq.utilities.prune_indices": 4 * 60 * 60,  # 4 hours
    # ...
}

Data directories

Set the data directory for eclecticiq-neo4jbatcher

The Intelligence Center settings must include information to locate the data that the eclecticiq-neo4jbatcher service processes and queues up for ingestion by the Neo4j database. eclecticiq-neo4jbatcher prepares the data and temporarily stores it as CSV files.

Set these parameters in platform_settings.py:

Attribute name

Default value

Description

NEO4J_BATCHING_URL

""

URL that the service can be accessed at.

The eclecticiq-neo4jbatcher service:

  • is usually bound to port 4008. Include the port when setting this parameter: "http://${n4jb_host_url}:4008"

  • may require Basic Authentication credentials. Include your user name and password when setting this parameter: "http://${n4jb_username}:${n4jb_password}@${n4jb_host_url}:4008"

    Example value:

    "http://user:[email protected]:4008"

GRAPH_INGESTION_SPOOL_DIRECTORY

""

Location where the service can store temporary files.

This must be set in two places:

  1. Export it as the NEO4JB_STORAGE_DIR environment variable:

    export NEO4JB_STORAGE_DIR="/var/lib/neo4j/import/tmp/neo4j-batcher"

  2. Set it in platform_settings.py:

    GRAPH_INGESTION_SPOOL_DIRECTORY="/var/lib/neo4j/import/tmp/neo4j-batcher"
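When the batcher credentials contain characters such as @ or :, embedding them directly in the URL makes it ambiguous. The sketch below builds a value for NEO4J_BATCHING_URL with percent-encoded credentials. The helper build_batcher_url and the host name are hypothetical, and it is an assumption that the service accepts percent-encoded credentials; verify this in your environment before relying on it.

```python
from urllib.parse import quote


def build_batcher_url(
    host: str, port: int = 4008, username: str = "", password: str = ""
) -> str:
    """Build a NEO4J_BATCHING_URL value, percent-encoding any credentials."""
    if username:
        # safe="" also encodes "/" so that no reserved character survives.
        credentials = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    else:
        credentials = ""
    return f"http://{credentials}{host}:{port}"


# Hypothetical host name, without credentials.
NEO4J_BATCHING_URL = build_batcher_url("batcher.example.com")
```

Calling build_batcher_url("batcher.example.com", username="user", password="p@ss:word") encodes the password as p%40ss%3Aword, so the only literal @ in the URL is the one separating the credentials from the host.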

Allow list for feed mount points

To use directories on the Intelligence Center host for incoming and outgoing feeds (such as with the “Mount point download” and “Mount point upload” transport types), you must explicitly set the directories the transport type is allowed to use.

Set these parameters:

Attribute name

Default value

Description

MOUNT_POINT_POLL_ALLOWED_DIRECTORIES

[]

Python list of allowed paths on the Intelligence Center host that incoming feed transport types can use.

Example value:

[ "/mnt/", "/media/", "/media/data/" ]

MOUNT_POINT_PUSH_ALLOWED_DIRECTORIES

[]

Python list of allowed paths on the Intelligence Center host that outgoing feed transport types can use.

Example value:

[ "/mnt/", "/media/", "/media/data/" ]
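To reason about which feed paths an allow list would cover, the sketch below checks whether a path lies inside one of the listed directories. It is an illustration only: the function is_path_allowed is hypothetical, the source does not describe how the Intelligence Center evaluates these lists, and this version neither resolves symbolic links nor touches the file system.

```python
import posixpath
from typing import Sequence

# Example allow list, as it could appear in platform_settings.py.
MOUNT_POINT_POLL_ALLOWED_DIRECTORIES = ["/mnt/", "/media/data/"]


def is_path_allowed(path: str, allowed_directories: Sequence[str]) -> bool:
    """Return True if `path` is inside one of `allowed_directories`."""
    if not posixpath.isabs(path):
        return False
    # normpath collapses ".." segments, so "/mnt/../etc" becomes "/etc".
    normalized = posixpath.normpath(path)
    for directory in allowed_directories:
        base = posixpath.normpath(directory)
        if not posixpath.isabs(base):
            continue
        # The path is inside `base` when their common prefix is `base`.
        if posixpath.commonpath([base, normalized]) == base:
            return True
    return False
```

Comparing whole path components instead of raw string prefixes avoids treating /mntx/file as if it were inside /mnt/.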

Services

Elasticsearch: specify data nodes

The IC needs to know the network location of your Elasticsearch data nodes in order to query the Elasticsearch cluster.

Attribute name

Default value

Description

SEARCH_URLS

[]

List. One or more network addresses to reach Elasticsearch data nodes at.

Example value:

[ "https://127.0.0.1:9200", "https://127.0.0.1:9201", "https://es-node-3.example.com:9200" ]

IC 2.11 and older use the SEARCH_URL attribute instead, which only takes a single string.
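When carrying a settings file over from 2.11 or older, the single SEARCH_URL string has to become a list. A minimal sketch of that conversion follows; the helper as_search_urls is hypothetical and the address is only an example.

```python
from typing import List, Sequence, Union


def as_search_urls(value: Union[str, Sequence[str]]) -> List[str]:
    """Turn a legacy SEARCH_URL string, or a list, into a SEARCH_URLS list."""
    if isinstance(value, str):
        return [value]
    return list(value)


# Legacy single-node value carried over to the list form.
SEARCH_URLS = as_search_urls("https://127.0.0.1:9200")
```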

New graph API and Neo4j

Release 2.12.0 switches over to a new graph API, and deprecates use of the Neo4j graph database.

By default, the IC continues to require and write to Neo4j, allowing users to fall back to Neo4j should they need to.

You can enable/disable the new graph API with the following attributes:

Attribute name

Default value

Description

NEW_GRAPH_NEIGHBORHOOD_API

True

Set to True to enable the new graph neighborhood API.

NEW_GRAPH_API

True

Set to True to enable the new graph API.

DISABLE_NEO4J

False

Set to False by default.

By default, the IC continues writing to Neo4j even if it’s not using it as its graph database. Keeping this set to False allows users to switch between the deprecated Neo4j graph API and the new graph API.

Set to True to stop writing to the Neo4j database.

Setting to True is irreversible.

When you set this to True and restart services (see Restart services for changes to take effect), the Neo4j database falls out of sync. Do not set this back to False without re-syncing the Neo4j graph database.

(Not recommended) If users want to switch to the new graph API completely and remove Neo4j from their systems, they must first do the following:

  1. Set the following attributes in platform_settings.py:

    NEW_GRAPH_NEIGHBORHOOD_API=True
    NEW_GRAPH_API=True
    DISABLE_NEO4J=True
  2. Restart services for changes to take effect

After this, they can remove Neo4j from their systems.
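Because DISABLE_NEO4J only makes sense together with the new graph API, a small pre-flight check can catch an inconsistent combination before services are restarted. The sketch below is hypothetical and not an Intelligence Center feature. It encodes only the rule that the new graph API attributes are set to True when Neo4j is disabled, as in the steps above.

```python
from typing import List, Mapping


def check_graph_settings(settings: Mapping[str, bool]) -> List[str]:
    """Return a list of problems found in the graph-related settings."""
    problems = []
    if settings.get("DISABLE_NEO4J", False):
        # Both new graph API attributes default to True.
        for name in ("NEW_GRAPH_NEIGHBORHOOD_API", "NEW_GRAPH_API"):
            if not settings.get(name, True):
                problems.append(
                    f"{name} must be True when DISABLE_NEO4J is True"
                )
    return problems
```

An empty list means the combination is consistent; any returned message names the attribute to correct.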

Enable Statsite in the Intelligence Center

To enable the statsite service, set the following parameters:

Attribute name

Default value

Description

STATSD_ENABLED

True

Boolean. Set to False to disable the service.

STATSD_HOST

"127.0.0.1"

IP address or fully qualified domain name the statsite service can be accessed at.

STATSD_PORT

8125

Port the statsite service is bound to on STATSD_HOST.
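To check that the configured host and port are reachable, you can send a test counter yourself. The sketch below assumes that statsite accepts statsd-style messages over UDP on the configured port; the metric name eiq.test.ping and both helper functions are hypothetical.

```python
import socket

STATSD_HOST = "127.0.0.1"
STATSD_PORT = 8125


def format_counter(name: str, value: int = 1) -> bytes:
    """Encode a counter in the statsd line format: <name>:<value>|c"""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(
    name: str, value: int = 1, host: str = STATSD_HOST, port: int = STATSD_PORT
) -> None:
    """Send one counter over UDP. UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, value), (host, port))
```

Calling send_counter("eiq.test.ping") sends a single datagram. Because UDP is connectionless, a successful call does not prove the service received it, so confirm on the statsite side that the metric arrived.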

Set expected Celery host count for health checks

CELERY_X_EIQ_HOSTS sets the number of Celery hosts that the Intelligence Center should expect.

Attribute name

Default value

Description

CELERY_X_EIQ_HOSTS

1

Number of hosts running the Celery service.

When pinging Celery to check its health status, the Intelligence Center expects to receive as many replies as the integer value assigned to CELERY_X_EIQ_HOSTS.

  • If you set the value to a number that is lower than the actual number of hosts that run Celery queues, the health status check for Celery fails, and it is highlighted in red in the Intelligence Center system health pane in the GUI.

  • If you set the value to a number that is greater than the actual number of hosts that run Celery queues, the health status check for Celery may take a long time, since the Intelligence Center keeps pinging and waiting for responses from non-existent hosts before it gives up.

Recommended values:

  • For environments with a predefined, static number of Celery hosts, the default value of 1 is a good starting point.

    It means that the Intelligence Center instance expects there to be one host running Celery queues.

  • For environments with a dynamic number of Celery hosts such as Kubernetes deployments, set the value to 0 to fall back on standard timeouts.

To inspect how long Celery checks take, including health checks, run eiq-platform diagnose run from the command line.

Ingestion and enrichment

Set package size limits

The platform_settings.py configuration file includes settings that limit the file sizes of packages handled by:

  • Incoming feeds

  • Outgoing feeds

  • Manual uploads

The hard limit for all file sizes is 100MB, or 100 * 1024 * 1024 bytes.

Changing these values may have a negative impact on performance.

Setting large size limits may cause tasks to time out before they can finish processing a file, or slow down the ingestion pipeline with tasks that have to process large files.

Attribute name

Default value

Description

MAX_BLOB_SIZE

20 * 1024 * 1024

This sets the maximum allowed file size for packages ingested through incoming feeds, or published through outgoing feeds.

The value here is expressed in bytes, and can be a number or an expression that evaluates to a number.

For example, the default value is the expression 20 * 1024 * 1024 which evaluates to 20MB.

MAX_UPLOADED_BLOB_SIZE

MAX_BLOB_SIZE / 2

This sets the maximum file size allowed for files uploaded through Manual uploads.

The value here is expressed in bytes, and can be a number or an expression that evaluates to a number.

We recommend leaving it at the default of MAX_BLOB_SIZE / 2, or half the maximum allowed file size set for MAX_BLOB_SIZE.
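The sketch below shows one way to sanity-check custom size limits against the 100MB hard limit before restarting services. The function check_size_limits and the 50MB value are hypothetical. Note that in Python 3 the expression MAX_BLOB_SIZE / 2 evaluates to a float; whether the Intelligence Center requires an integer here is not stated in the source.

```python
HARD_LIMIT = 100 * 1024 * 1024  # 100MB hard limit, in bytes


def check_size_limits(max_blob_size, max_uploaded_blob_size):
    """Return a list of problems found in the two size limits."""
    problems = []
    limits = (
        ("MAX_BLOB_SIZE", max_blob_size),
        ("MAX_UPLOADED_BLOB_SIZE", max_uploaded_blob_size),
    )
    for name, value in limits:
        if value <= 0:
            problems.append(f"{name} must be a positive number of bytes")
        elif value > HARD_LIMIT:
            problems.append(f"{name} exceeds the 100MB hard limit")
    return problems


# Hypothetical custom values, expressed as in platform_settings.py.
MAX_BLOB_SIZE = 50 * 1024 * 1024
MAX_UPLOADED_BLOB_SIZE = MAX_BLOB_SIZE / 2  # a float in Python 3
```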

Tune timeout limits for ingestion tasks

Avoid changing the default values for these parameters.

Changing these values may have a negative impact on performance.

If you encounter timeouts when running certain tasks, you may want to adjust the default Celery task time limits in the CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY parameter.

# Time limits per task families (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY = {
    "eiq.enrichers": 2 * 60,  # 2 minutes
    "eiq.incoming-transports": 8 * 60 * 60,  # 8 hours
    "eiq.outgoing-transports": 2 * 60,  # 2 minutes
}

Parameter

Type

Description

eiq.enrichers

int

Set a time limit in seconds.

Sets time limit for each enrichment task run.

eiq.incoming-transports

int

Set a time limit in seconds.

Sets time limit for each incoming feed run.

eiq.outgoing-transports

int

Set a time limit in seconds.

Sets time limit for each outgoing feed run.
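Since every limit is a plain number of seconds, a small helper can make overrides easier to read and review. The sketch below is illustrative: the helper seconds and the 5-minute enricher limit are hypothetical. The source does not describe whether families omitted from an override keep their defaults, so the cautious option is to list every family you rely on.

```python
from datetime import timedelta


def seconds(hours: int = 0, minutes: int = 0) -> int:
    """Convert hours and minutes to a whole number of seconds."""
    return int(timedelta(hours=hours, minutes=minutes).total_seconds())


CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY = {
    "eiq.enrichers": seconds(minutes=5),  # hypothetical: raised from 2 minutes
    "eiq.incoming-transports": seconds(hours=8),
    "eiq.outgoing-transports": seconds(minutes=2),
}

# Print each limit in a human-readable form to review it.
for family, limit in CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY.items():
    print(f"{family}: {timedelta(seconds=limit)}")
```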

Tune the ingestion scheduler

Avoid changing the default values for these parameters.

Changing these values may have a negative impact on performance.

Intelligence Center ingestion attempts to optimize system resources to ingest data as quickly and as efficiently as possible. It features jobs to monitor feed activity, to prioritize feeds, to distribute and rotate feed priority, as well as a ranking system to periodically assess and reassign priority to incoming feeds. The score decays over time in a similar way to entity half-life. This way, the Intelligence Center distributes the ingestion workload to minimize idle time and bottlenecks.

Tune the ingestion scheduler by changing values in the INGESTION_QUUZ_SCHEDULER_OPTIONS parameter.

INGESTION_QUUZ_SCHEDULER_OPTIONS = {
    "prefer_full_batches": True,
    "score_half_time": 24 * 60 * 60,
    "sweep_same_depth_max_jobs": 500,
    "sweep_same_depth_max_enqueued_jobs": 1000,
    "sweep_same_depth_max_time_total": 60,
    "sweep_same_depth_max_time_between": 30,
}

Job depth: Jobs can depend on other jobs, forming chains of dependencies. The “job depth” is the number of hops in such a chain: if Job A depends on Job B, and Job B depends on Job C, then Job C has a depth of 2.

Jobs with a higher depth are more likely, but not guaranteed, to be scheduled to run first: if Job A has a depth of 1, and Job B has a depth of 10, then Job B is more likely to run before Job A.

If one or more jobs depend on the same job, then they have the same “job depth”: if Job A depends on Job B and Job C, then both Job C and Job B have a depth of 1.

The depth of a job is ‘lifted’ when all its dependencies finish running: if Job B depends on Job A, and Job A finishes running, then Job B has a depth of 0.
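The depth examples above can be reproduced with a short calculation. This is a sketch of the definition as described here, not the scheduler's actual code. The function job_depths is hypothetical, it assumes the dependency graph has no cycles, and where a job sits in several chains it takes the longest one, which is also an assumption.

```python
from functools import lru_cache
from typing import Dict, Iterable, Mapping, Set


def job_depths(depends_on: Mapping[str, Iterable[str]]) -> Dict[str, int]:
    """Compute the depth of every job from a 'job -> dependencies' mapping."""
    jobs: Set[str] = set(depends_on)
    dependents: Dict[str, Set[str]] = {}
    # Invert the mapping: for each job, collect the jobs that depend on it.
    for job, dependencies in depends_on.items():
        for dependency in dependencies:
            jobs.add(dependency)
            dependents.setdefault(dependency, set()).add(job)

    @lru_cache(maxsize=None)
    def depth(job: str) -> int:
        waiting = dependents.get(job, ())
        # A job nothing depends on has depth 0; otherwise it sits one hop
        # below the deepest job that depends on it.
        return 0 if not waiting else 1 + max(depth(j) for j in waiting)

    return {job: depth(job) for job in jobs}
```

With {"A": ["B"], "B": ["C"]} as input, Job C gets a depth of 2. With {"A": ["B", "C"]}, both Job B and Job C get a depth of 1.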

Parameter

Type

Description

prefer_full_batches

boolean

True by default.

Allows the scheduler to continue with lower depth jobs after executing a partial batch, allowing the scheduler to distribute its time across jobs.

score_half_time

int

Set a time limit in seconds.

Ingestion tasks are assigned a priority score that decays over time, so that ingestion workers can distribute their time fairly across all running tasks.

This parameter sets the time taken for a task’s priority score to be reduced by half (a half-life).

sweep_same_depth_max_jobs

int

Sets the maximum number of concurrent jobs at a given job depth, before ingestion workers can move on to jobs with the highest depth value.

If no jobs are available at the current depth level, the scheduler moves one level up, and it searches there for jobs that it can execute.

sweep_same_depth_max_enqueued_jobs

int

Similar to sweep_same_depth_max_jobs.

It sets the maximum number of waiting jobs that can be added to a queue at a given depth level, before an ingestion worker can move on to jobs with the highest depth value.

sweep_same_depth_max_time_total

int

Set a time limit in seconds.

Sets the maximum amount of time workers can spend searching for jobs to execute at a given depth level, before they move on to jobs with the highest depth value.

In practice, this is how long ingestion workers have for transforming packages, before moving on to process the corresponding entities.

sweep_same_depth_max_time_between

int

Set a time limit in seconds.

This limits the time spent looking for the next job to run at a given depth. If the scheduler takes longer than the set time to find another job at this depth to run, it resets and tries again.

Meanwhile, depths assigned to existing jobs may have changed, allowing the scheduler to sweep through a fresh list of jobs at that depth.
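The effect of score_half_time can be illustrated with the standard half-life formula. This sketch assumes plain exponential decay, which the term half-life suggests but the source does not spell out; the function decayed_score is hypothetical.

```python
SCORE_HALF_TIME = 24 * 60 * 60  # default score_half_time, in seconds


def decayed_score(
    initial_score: float,
    elapsed_seconds: float,
    half_time: float = SCORE_HALF_TIME,
) -> float:
    """Return the score after exponential decay with the given half-life."""
    return initial_score * 0.5 ** (elapsed_seconds / half_time)
```

With the default of 24 hours, a score of 100 drops to 50 after one day and to 25 after two days. A shorter half time makes the scheduler forget time spent on a feed more quickly.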

Tune task batch size in the ingestion pipeline

Avoid changing the default values for these parameters.

Changing these values may have a negative impact on performance.

Intelligence Center ingestion tasks run and are processed in batches to optimize system resources. The ingestion process works like a pipeline, where different tasks perform specific actions.

Based on system resources and the type of content being ingested, you may want to increase or decrease the batch size to improve overall ingestion performance.

Change the values in the INGESTION_TASK_BATCH_SIZES parameter to control task batch sizes:

# Batch sizes used for ingestion tasks.
INGESTION_TASK_BATCH_SIZES = {
    "ingest_blob_task": 100,
    "index_extracts_task": 1000,
    "graph_synchronize_enrichment_task": 1000,
    "search_synchronize_entity_task": 1000,
    "graph_synchronize_entity_task": 1000,
}

Parameter

Type

Description

ingest_blob_task

int

Sets a limit to the maximum number of concurrent tasks that initiate ingesting packages from incoming feeds and enrichers.

Default limit: 100

index_extracts_task

int

Sets a limit to the maximum number of concurrent tasks that index ingested observables.

Default limit: 1000

graph_synchronize_enrichment_task

int

Sets a limit to the maximum number of concurrent tasks that sync ingested and indexed enrichments between PostgreSQL and Neo4j.

Default limit: 1000

search_synchronize_entity_task

int

Sets a limit to the maximum number of concurrent tasks that sync ingested entities between PostgreSQL and Elasticsearch.

Default limit: 1000

graph_synchronize_entity_task

int

Sets a limit to the maximum number of concurrent tasks that sync ingested and indexed entities between PostgreSQL and Neo4j.

Default limit: 1000
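To get a feel for what a batch size means in practice, the sketch below splits a stream of work items into batches of a fixed size. It is a generic illustration of batching and not the Intelligence Center's ingestion code; the function batches is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With a batch size of 100, 250 pending items are processed as batches of 100, 100 and 50. A larger batch size means fewer, heavier batches, and a smaller one means more, lighter batches.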

Automatically disable enricher after continuously failing

By default, enrichers can fail up to ten times in a row before they are automatically disabled.

Change this by setting these parameters:

Attribute name

Default value

Description

ENRICHER_FAILURES_TO_DISABLE

10

Number of times an enricher can fail in a row before it is automatically disabled.
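The idea of counting failures in a row can be sketched as a counter that resets on success. The class EnricherState is hypothetical and only models the behavior described here. Whether the enricher is disabled on the tenth failure or on the one after it is not stated in the source, so the sketch assumes the former.

```python
ENRICHER_FAILURES_TO_DISABLE = 10


class EnricherState:
    """Track consecutive failures and disable the enricher at the threshold."""

    def __init__(self, failures_to_disable: int = ENRICHER_FAILURES_TO_DISABLE):
        self.failures_to_disable = failures_to_disable
        self.consecutive_failures = 0
        self.enabled = True

    def record_run(self, succeeded: bool) -> None:
        """Update the state after one enrichment run."""
        if not self.enabled:
            return
        if succeeded:
            # A success breaks the streak, so counting starts over.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failures_to_disable:
            self.enabled = False
```

Under this model, nine failures followed by one success leave the enricher enabled, while ten failures in a row disable it.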

Miscellaneous

Clean up Intelligence Center notifications

Changing default values for parameters listed in this section may cause the Intelligence Center to behave unexpectedly.

Set the maximum number of notifications retained by the Intelligence Center for each user.

Attribute name

Default value

Description

NOTIFICATIONS_CLEAN_UP_EVERY

1000

Set the maximum number of notifications retained for each user.

Once this number is reached, older notifications are discarded.

Sample platform_settings.py file

The following example serves as a guideline:

settings.py (Sourced from EIQ platform-backend)

"""
Default settings module.
 
These are just some application defaults and will be overridden by
a custom settings file.
"""
 
import os
from typing import Mapping, Optional, Sequence
 
 
# Note: ensure that all settings used by the applications are listed here with
# a default value. This makes it easier to see which settings are available.
 
 
#
# Flask web app
#
 
# Application setup (do not override)
APPLICATION_PREFIX = "/private"
APPLICATION_PREFIX_PUBLIC = "/api"
GENERIC_ERROR_HANDLING = True
 
# Static content
PLATFORM_STATIC_FOLDER = "/opt/eclecticiq-platform-docs"
PLATFORM_SWAGGER_INDEX = os.path.join(PLATFORM_STATIC_FOLDER, "swagger/index.html")
PLATFORM_DOCUMENTATION_INDEX = os.path.join(
PLATFORM_STATIC_FOLDER, "documentation/index.html"
)
 
 
#
# Logging
#
 
LOG_FORMAT = "json"
LOG_LEVEL = "warning,eiq:info,quuz:warning"
 
SENTRY_DSN = None
 
# Whether to store audit trails for all user requests in the database.
AUDIT_TRAIL_ENABLED = True
 
# default tags for statsd metrics
METRIC_TAGS: Mapping[str, str] = {}
 
# TP47152. Maximum value for gauge metric of pending blobs.
BLOB_QUEUE_METRIC_MAX_VALUE = 100
 
#
# PostgreSQL / SQLAlchemy
#
 
# URI to the database.
SQLALCHEMY_DATABASE_URI = ""
 
# Flask-SQLAlchemy configuration (do not override)
SQLALCHEMY_TRACK_MODIFICATIONS = True
 
# it is possible to use any of the libpq supported connection parameters.
# https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine
# https://www.postgresql.org/docs/10/libpq-connect.html#LIBPQ-PARAMKEYWORDS
SQLALCHEMY_ENGINE_OPTIONS = {
"pool_recycle": 120,
"connect_args": {"connect_timeout": 25},
}
 
#
# Elasticsearch
#
 
# Format: ["host[:port]","host[:port]"]
SEARCH_URLS: Sequence[str] = []
 
ELASTICSEARCH_QUERY_TIMEOUT = "20s"
ELASTICSEARCH_SHARDS_NUMBER = 1
ELASTICSEARCH_REPLICAS_NUMBER = 1
 
# enable new graph neighborhood api implementation (beta feature)
NEW_GRAPH_NEIGHBORHOOD_API = True
# enable new graph api implementation (beta feature)
NEW_GRAPH_API = True
# Stop writing to Neo4j - Non-reversible operation as DB will be unsync
# NEW_GRAPH_API should be set as True
DISABLE_NEO4J = False
 
# Periodic Celery tasks configuration:
# How many days we store `logstash` indices in ES:
LOG_RETENTION = 365
# How many days we store `statsite` indices in ES:
METRIC_RETENTION = 730
# How many days we should keep `content_blocks` in PG. `None` means keep it forever.
# This is disabled by default
CONTENT_BLOCK_RETENTION = None
# How many days we should keep `blobs` in the database. `None` means keep it forever.
# This is disabled by default
BLOB_RETENTION = None
 
#
# Neo4j and neo4j-batcher
#
 
NEO4J_URL = ""
NEO4J_SSL_VERIFY = False
NEO4J_USER = ""
NEO4J_PASSWORD = ""
NEO4J_DEBUG = False
 
# now neo4j-batcher lives in a separate repo,
# but we still want to know its URL to make health checks.
NEO4J_BATCHING_URL = ""
 
GRAPH_INGESTION_SPOOL_DIRECTORY = ""
# Whether to send items for graph ingestion directly or through Redis
SYNCHRONOUS_GRAPH_INGESTION = True
 
 
#
# Redis
#
 
REDIS_URL = "redis://localhost/0"
 
# Backup (Redis) queue names
SEARCH_QUEUE_NAME = "queue:search:inbound"
GRAPH_QUEUE_NAME = "queue:graph:inbound"
 
 
#
# Statsd
#
 
STATSD_ENABLED = True
STATSD_HOST = "127.0.0.1"
STATSD_PORT = 8125
 
#
# Kibana
#
 
KIBANA_URL = ""
 
 
#
# OpenTAXII
#
 
OPENTAXII_INTERNAL_URL = "http://localhost:8009/"
 
 
#
# Email
#
 
SMTP_HOST = "localhost"
SMTP_PORT = 25
SMTP_USERNAME: Optional[str] = None
SMTP_PASSWORD: Optional[str] = None
SMTP_ENCRYPTION: Optional[str] = None # None, 'tls', 'ssl'
SMTP_TIMEOUT = 30
 
 
#
# Celery (background tasks)
#
 
CELERY_BROKER_URL = REDIS_URL
CELERY_BROKER_TRANSPORT_OPTIONS = {
# Big `socket_timeout` makes celery diagnostics slow.
# Small `socket_timeout` can fail too soon during high load on broker.
"socket_timeout": 10,
"socket_connect_timeout": 30,
}
 
CELERY_ACCEPT_CONTENT = ["json"]
CELERY_ENABLE_UTC = True
CELERY_WORKER_REDIRECT_STDOUTS_LEVEL = "DEBUG"
CELERY_RESULT_BACKEND = REDIS_URL
CELERY_RESULT_SERIALIZER = "json"
CELERY_TASK_STORE_ERRORS_EVEN_IF_IGNORED = True
CELERY_TASK_PROTOCOL = 2
CELERY_TASK_SERIALIZER = "json"
CELERY_TASK_TRACK_STARTED = True
CELERY_SEND_SENT_EVENT = True
CELERY_SEND_EVENTS = True
 
CELERY_TASK_QUEUES = [
{"name": "enrichers", "routing_key": "eiq.enrichers.#"},
{"name": "enrichers-priority", "routing_key": "priority.enrichers.#"},
{"name": "incoming-transports", "routing_key": "eiq.incoming-transports.#"},
{
"name": "incoming-transports-priority",
"routing_key": "priority.incoming-transports.#",
},
{"name": "outgoing-transports", "routing_key": "eiq.outgoing-transports.#"},
{
"name": "outgoing-transports-priority",
"routing_key": "priority.outgoing-transports.#",
},
{"name": "outgoing-feeds", "routing_key": "eiq.outgoing-feeds.#"},
{"name": "outgoing-feeds-priority", "routing_key": "priority.outgoing-feeds.#"},
{"name": "utilities", "routing_key": "eiq.utilities.#"},
{"name": "utilities-priority", "routing_key": "priority.utilities.#"},
{"name": "discovery", "routing_key": "eiq.discovery.#"},
{"name": "discovery-priority", "routing_key": "priority.discovery.#"},
{"name": "entity-rules-priority", "routing_key": "priority.entity-rules.#"},
{"name": "extract-rules-priority", "routing_key": "priority.extract-rules.#"},
{"name": "reindexing", "routing_key": "eiq.reindexing.#"},
{"name": "retention-policies", "routing_key": "eiq.retention-policies.#"},
{"name": "synchronization", "routing_key": "eiq.synchronization.#"},
{
"name": "retention-policies-priority",
"routing_key": "priority.retention-policies.#",
},
]
 
CELERY_TASK_DEFAULT_QUEUE = "default"
CELERY_TASK_DEFAULT_EXCHANGE_TYPE = "direct"
CELERY_TASK_DEFAULT_ROUTING_KEY = "default"
CELERY_TASK_ROUTES = ("eiq.platform.taskrunner.routing.TaskRouter",)
 
CELERY_TASK_ACKS_LATE = True
CELERY_TASK_ACKS_ON_FAILURE_OR_TIMEOUT = True
CELERY_TASK_REJECT_ON_WORKER_LOST = False
 
 
# Count of hosts that run celery queues.
# Specifies how many replies health check will expect.
# If the number is smaller than actual value, celery health check will fail.
# If the number is bigger, celery health check will be noticeably longer.
# So, if you're unsure, leave it `1` and run `eiq-platform diagnose run`.
# For env with dynamic hosts count like k8s set this to `0`.
CELERY_X_EIQ_HOSTS = 1
 
# Time limits per task families (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_FAMILY = {
"eiq.enrichers": 2 * 60, # 2 minutes
"eiq.incoming-transports": 8 * 60 * 60, # 8 hours
"eiq.outgoing-transports": 2 * 60, # 2 minutes
"eiq.discovery": 10 * 60, # 10 minutes
"eiq.outgoing-feeds": 85 * 60 * 60, # 85 hours
"eiq.reindexing": 2 * 60 * 60, # 2 hours
"eiq.retention-policies": 120 * 60 * 60, # 120 hours
"eiq.utilities": 2 * 60 * 60, # 2 hours
}
 
# Time limits for specific tasks (in seconds)
CELERY_X_EIQ_TIME_LIMIT_PER_TASK = {
"eiq.utilities.update_taxonomy_in_search": 4 * 60 * 60, # 4 hours
"eiq.utilities.delete_taxonomy_in_search": 4 * 60 * 60, # 4 hours
"eiq.utilities.send_email": 30,
"eiq.utilities.taxii_discovery": 60,
"eiq.utilities.postponed_entity_signals": 60,
"eiq.utilities.create_notifications": 60,
"eiq.utilities.prune_indices": 4 * 60 * 60, # 4 hours
"eiq.discovery.search_discovery": 10 * 60, # 10 minutes
"eiq.discovery.delete_discovery": 60,
"eiq.entity-rules.entity_rule_task": 48 * 60 * 60, # 48 hours
"eiq.extract-rules.extract_rule_task": 4 * 60 * 60, # 4 hours
"eiq.synchronization.reindex_batch": 2 * 60 * 60, # 2 hours
}
 
CELERY_WORKER_HIJACK_ROOT_LOGGER = False
CELERY_WORKER_LOG_FORMAT = "%(message)s"
CELERY_WORKER_TASK_LOG_FORMAT = "%(message)s"
 
CELERY_BEAT_SCHEDULER = "eiq.platform.taskrunner.scheduler.DynamicScheduler"
 
# How long to store results of volatile tasks (in seconds)
CELERY_X_EIQ_VOLATILE_TASK_RESULT_DEFAULT_TTL = 180
 
# Indicates if potentially lost tasks (with persistent envelope) should
# be marked as failed only after their soft time limit has been exceeded.
CELERY_X_EIQ_UPDATE_PERSISTENT_TASK_WITHIN_TIME_LIMIT = False
 
#
# Application security and network access policies
#
 
SECRET_KEY = ""
JWT_AUTH_ENDPOINT = "auth"
JWT_EXPIRATION_DELTA = 60 * 30 # 30 minutes
ONE_TIME_PASSWORD_EXPIRATION_MINUTES = 60
LOGIN_ATTEMPT_TOO_FAST_SECONDS = 1
MAX_RESET_PASSWORD_PER_DAY = 3
 
# Multi-factor authentication master password;
# must be produced by Fernet.generate_key().
MFA_MASTER_PASSWORD = ""
 
# Whether to expose the OpenAPI specs.
EXPOSE_OPENAPI = False
 
# Custom CA bundle to use for outgoing HTTP requests (optional)
REQUESTS_CA_BUNDLE: Optional[str] = None
 
# Path to the file containing the proxy url
PROXY_URL_FILE_PATH = "/etc/eclecticiq/proxy_url"
 
# Network address block filtering. This is a list of address blocks that the
# user may not configure the platform to make requests to.
#
# The full list of private networks, consider adding them all:
# [
# '0.0.0.0/8', # broadcast network
# '10.0.0.0/8',
# '127.0.0.0/8',
# '169.254.0.0/16', # link-local
# '172.16.0.0/12',
# '192.0.0.0/29',
# '192.0.0.170/31',
# '192.0.2.0/24',
# '192.168.0.0/16',
# '198.18.0.0/15',
# '198.51.100.0/24',
# '203.0.113.0/24',
# '240.0.0.0/4',
# '255.255.255.255/32',
#
# '::1/128',
# '::/128',
# '::ffff:0:0/96',
# '100::/64',
# '2001::/23',
# '2001:2::/48',
# '2001:db8::/32',
# '2001:10::/28',
# 'fc00::/7', # unique local addresses
# 'fe80::/10', # link-local addresses
# ]
#
# More details: https://en.wikipedia.org/wiki/Private_network
USER_CIDR_BLACKLIST = ["0.0.0.0/8", "127.0.0.0/8", "::1/128"]
 
 
#
# STIX
#
 
STIX_DEFAULT_NAMESPACE_PREFIX = "not-yet-configured"
STIX_DEFAULT_NAMESPACE_URI = "http://not-yet-configured.example.org/"
 
 
#
# Half-life
#
 
# Default values
HALF_LIFE = {
"campaign": 1000,
"course-of-action": 182,
"eclecticiq-sighting": 182,
"exploit-target": 182,
"incident": 182,
"indicator": 30,
"report": 182,
"threat-actor": 1000,
"ttp": 720,
}
 
 
#
# Discovery
#
 
DISCOVERY_SEARCH_PAGE_SIZE = 2500
DISCOVERY_RESULTS_LIMIT = 2500
DISCOVERY_WORKSPACE_SIZE_LIMIT = 2500
 
 
#
# Ingestion
#
 
# Package size limits (in bytes)
MAX_BLOB_SIZE = 20 * 1024 * 1024
MAX_UPLOADED_BLOB_SIZE = MAX_BLOB_SIZE / 2
 
INGESTION_FEED_PRIORITY = 100
INGESTION_UPLOAD_PRIORITY = 1000
INGESTION_BLACKLIST_IDREFS: Sequence[str] = []
INGESTION_FINALIZER_DEPTH = 100
 
# retry settings for only DB-dependent tasks
# this results in roughly 6 hours worth of retrying a task in sum:
# for i in range(max_attempts):
#     s += min(max_retry_delay, retry_delay * 2 ** i)
 
INGESTION_TASK_RETRY_SETTINGS = {
"max_attempts": 10,
"retry_delay": 60, # one minute
"max_retry_delay": 60 * 90, # 90 minutes
}
 
# retry settings for external storage dependent tasks
# this results in roughly a week worth of retrying a task in sum
INGESTION_EXTERNAL_TASK_RETRY_SETTINGS = {
"max_attempts": 25,
"retry_delay": 60 * 5, # 5 minutes
"max_retry_delay": 60 * 60 * 12, # 12 hours
}
 
# Batch sizes used for ingestion tasks.
INGESTION_TASK_BATCH_SIZES = {
"ingest_blob_task": 100,
"index_extracts_task": 1000,
"graph_synchronize_enrichment_task": 1000,
"search_synchronize_entity_task": 1000,
"graph_synchronize_entity_task": 1000,
}
 
# Quuz scheduler options for the ingestion task runner.
INGESTION_QUUZ_SCHEDULER_OPTIONS = {
# Continue with lower depth jobs after executing a partial batch,
# which should increase the effectiveness of batch tasks.
"prefer_full_batches": True,
# The ‘score half time’ influences for how long a long-running worker
# process should remember how much time was spent on a certain feed.
# Workers divide their time fairly over all feeds, but a small amount of
# (slow) outliers in a feed, e.g. due to timeouts, should not result in
# that feed getting ‘punished’ forever. The value is the number of seconds
# after which the score assigned to a feed loses half its value. This
# should be up to a few orders of magnitude higher than the run time of the
# slowest jobs.
"score_half_time": 24 * 60 * 60, # 24 hours
# How many similar jobs to execute in a row before switching to jobs with
# the highest depth, e.g. how many blobs to transform before starting
# to process their entities.
"sweep_same_depth_max_jobs": 500,
# Similar to ‘sweep_same_depth_max_jobs’, but defines how many jobs can be
# enqueued (not run) before a worker moves to higher depth jobs again.
"sweep_same_depth_max_enqueued_jobs": 1000,
# How many seconds to execute similar jobs in a row before switching to
# jobs with the highest depth, e.g. for how long to keep transforming blobs
# before processing their entities.
"sweep_same_depth_max_time_total": 60,
# If a feed didn't get a chance to run for this amount of time (seconds),
# interrupt what it was doing and pick the highest depth jobs instead.
"sweep_same_depth_max_time_between": 30,
}
 
# Observable source ACL caching. This can improve performance of some ingestion
# scenarios, at the cost of seeing potentially outdated data.
SOURCES_ACL_REDIS_CACHE_ENABLED = False
 
 
#
# Incoming/outgoing feeds
#
 
# Enable caching of package counts (pending, failed, ingested) used in the
# incoming feed view. '0' means caching is disabled. A good value when
# enabling caching is 120 seconds or higher.
INCOMING_FEED_BLOB_STATS_CACHE_IN_SECONDS = 0
 
# Directories that can be accessed from mount point feeds. POLL is for incoming
# feeds, PUSH is for outgoing feeds. Example: ["/mnt/", "/media/"]
MOUNT_POINT_POLL_ALLOWED_DIRECTORIES: Sequence[str] = []
MOUNT_POINT_PUSH_ALLOWED_DIRECTORIES: Sequence[str] = []
 
# Number of entities per package for outgoing feeds
PACKING_BATCH_SIZE_ENTITIES = 25
PACKING_BATCH_SIZE_RELATIONS = 125
 
#
# Retention policies
#
 
RETENTION_ENTITY_FETCH_CHUNK_SIZE = 1024
RETENTION_ENTITY_DEL_CHUNK_SIZE = 1024
 
#
# Platform extensions (feeds, enrichers)
#
# Number of failures before disabling an enricher
ENRICHER_FAILURES_TO_DISABLE = 10
 
# List of names of extensions that should not be loaded.
DISABLED_EXTENSIONS: Sequence[str] = []
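# Example with hypothetical extension names (these are placeholders;
# substitute the names of extensions installed on your instance):
# DISABLED_EXTENSIONS = ["example-enricher", "example-incoming-feed"]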
 
# Retry behaviour
FILE_DOWNLOAD_RETRIES_LIMIT = 3
FILE_DOWNLOAD_RETRY_TIMEOUT = 10 # in seconds
 
 
#
# LDAP
#
 
LDAP_AUTH_ENABLED = False
LDAP_URI = "ldaps://toolbox.iw"
LDAP_IGNORE_TLS_ERRORS = True
LDAP_BIND_DN = "cn=Manager,dc=ldap,dc=eclecticiq"
LDAP_BIND_PASSWORD = "adminpassword"
 
# These are 2-tuples of the form (base, filter_template)
LDAP_USERS_FILTER = (
"ou=Users,dc=ldap,dc=eclecticiq", # Base
"(uid={username})", # Filter template
)
LDAP_GROUPS_FILTER = (
"ou=EclecticIQGroups,dc=ldap,dc=eclecticiq",
"(&(memberUid={username})(objectClass=posixGroup))",
)
LDAP_ROLES_FILTER = (
"ou=EclecticIQRoles,dc=ldap,dc=eclecticiq",
"(&(memberUid={username})(objectClass=posixGroup))",
)
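# Illustration, assuming {username} is replaced with the name entered at
# sign-in: for a hypothetical user "jdoe", the users filter above would
# search under ou=Users,dc=ldap,dc=eclecticiq with the filter (uid=jdoe).
# A sketch of a users filter for a directory that stores the login name in
# sAMAccountName (an assumption about your directory, not a default):
# LDAP_USERS_FILTER = (
# "ou=Users,dc=example,dc=com", # Base (placeholder)
# "(sAMAccountName={username})", # Filter template
# )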
 
LDAP_USER_FIRSTNAME_ATTR = "cn"
LDAP_USER_LASTNAME_ATTR = "sn"
LDAP_USER_EMAIL_ATTR = "mail"
 
LDAP_ROLE_NAME_ATTR = "cn"
LDAP_GROUP_NAME_ATTR = "cn"
LDAP_CASE_SENSITIVE_MATCHING = True
 
LDAP_USER_IS_ADMIN_ATTR = "isEclecticIQAdmin"
LDAP_ADMIN_ROLE_GROUP_NAME = "EclecticIQAdminsGroup"
 
 
#
# SAML
#
 
SAML_AUTH_ENABLED = False
 
# SAML can be configured using a live configuration page.
# When this is enabled, go to /private/saml/configure to set up SAML.
SAML_CONFIGURE_MODE = False
SAML_TEST_CONFIG_FILE = "/tmp/eiq_platform_saml_test_config.json" # nosec
 
# The IDP metadata file should be generated by the IDP and placed on
# the platform instance:
SAML_IDP_METADATA = {
"url": "https://samltest.id/saml/idp",
# "file": "/idp-metadata.xml",
}
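# Sketch of the file-based alternative, assuming the IDP metadata XML was
# copied to the path below (the path is a placeholder):
# SAML_IDP_METADATA = {
# "file": "/idp-metadata.xml",
# }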
 
# IDP EntityID must match the "EntityID" attribute in the IDP metadata.
# Keycloak uses a URL like 'http://localhost:8080/auth/realms/master',
# but it can be any string:
SAML_IDP_ENTITYID = "https://samltest.id/saml/idp"
 
# Optionally require users to have a minimum authorization level in order to
# log in. If set to None, there is no minimum level and any user is accepted.
SAML_IDP_MINIMUM_LEVEL: Optional[str] = None
 
SAML_REQUEST_USE_POST_BINDING = False
 
SAML_CASE_SENSITIVE_MATCHING = True
 
# Required attributes
SAML_USER_USERID_ATTR = "uid"
SAML_USER_EMAIL_ATTR = "email"
SAML_USER_GROUPS_ATTR = "eclecticiqGroups"
SAML_USER_ROLES_ATTR = "eclecticiqRoles"
 
# Optional attributes
SAML_USER_FIRSTNAME_ATTR = "givenName"
SAML_USER_LASTNAME_ATTR = "sn"
SAML_USER_IS_ADMIN_ATTR = "isEclecticIQAdmin"
SAML_ADMIN_ROLE_GROUP_NAME = "EclecticIQAdminsGroup"
 
 
SAML_SIGN_AUTHN_REQ = False
SAML_SIGN_LOGOUT_REQ = False
SAML_WANT_ASSERT_SIGNED = True
SAML_WANT_RESPONSE_SIGNED = False
 
# Key and certificates; these should point to filenames.
SAML_ENC_KEY: Optional[str] = None
SAML_ENC_CERT: Optional[str] = None
 
# Path to the system's xmlsec1 binary.
SAML_XMLSEC_BIN = "/usr/bin/xmlsec1"
 
SAML_METADATA_ORG = {
"name": "Default organization name",
"display_name": [("Organization display name in English", "en")],
"url": "http://example.localhost",
}
SAML_METADATA_CONTACT_PERSON = {
"given_name": "John",
"sur_name": "Smith",
"email_address": ["[email protected]"],
"contact_type": "technical",
}
 
# In case you want users from SAML to be added to certain groups/roles
# regardless of what groups/roles the SAML server tells us.
# Deprecated. Use `EXTERNAL_` settings instead.
SAML_USER_DEFAULT_GROUPS: Sequence[str] = []
SAML_USER_DEFAULT_ROLES: Sequence[str] = []
 
# External user settings
# In case you want users from an external identity provider to be added to
# certain groups/roles regardless of what groups/roles the identity provider
# sends.
EXTERNAL_USER_DEFAULT_GROUPS: Sequence[str] = []
EXTERNAL_USER_DEFAULT_ROLES: Sequence[str] = []
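# Example with hypothetical group and role names (placeholders; substitute
# names of groups and roles defined on your instance):
# EXTERNAL_USER_DEFAULT_GROUPS = ["Example Group"]
# EXTERNAL_USER_DEFAULT_ROLES = ["Example Role"]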
 
#
# OAuth2
#
OAUTH_ENABLED: bool = False
OAUTH_DEBUG: bool = False
OAUTH_TENANT_ID: Optional[str] = None
OAUTH_APPLICATION_ID: Optional[str] = None
OAUTH_CLIENT_APPLICATION_IDS: Sequence[str] = []
OAUTH_CASE_SENSITIVE_MATCHING: bool = True
 
# Required attributes
OAUTH_USER_USERID_CLAIM: str = "oid"
OAUTH_USER_EMAIL_CLAIM: str = "email"
OAUTH_USER_GROUPS_CLAIM: str = "groups"
OAUTH_USER_ROLES_CLAIM: str = "groups"
 
# Optional attributes
OAUTH_USER_FIRSTNAME_CLAIM: Optional[str] = "given_name"
OAUTH_USER_LASTNAME_CLAIM: Optional[str] = "family_name"
OAUTH_USER_IS_ADMIN_CLAIM: Optional[str] = "isEclecticIQAdmin"
OAUTH_ADMIN_ROLE_NAME: Optional[str] = "EclecticIQAdminsGroup"
 
 
#
# Notifications
#
# The total number of notifications can build up over time. If there are too
# many notifications, the notification index endpoint, which is heavily used
# by the UI, can become slow. Therefore a clean-up is done on a regular basis.
# This variable sets after how many new notifications a clean-up should occur.
NOTIFICATIONS_CLEAN_UP_EVERY = 1000
# Set the maximum number of notifications that should be kept for each user.
# The oldest notifications will be removed when this number is exceeded.
NOTIFICATIONS_MAX_PER_USER = 1000
 
#
# Observable and entity rules
#
 
# The request timeout in seconds when running an extract rule manually.
# This timeout will be used when running "scroll" queries against Elasticsearch.
OBSERVABLE_RULE_ES_TIMEOUT = 300
 
# The request timeout in seconds when running an entity rule manually.
# This timeout will be used when running "scroll" queries against Elasticsearch.
ENTITY_RULE_ES_TIMEOUT = 300

#
# Miscellaneous
#
 
# A feature flag to unlock some potentially very destructive operations.
ALLOW_EXTREMELY_UNSAFE_OPERATIONS = False
 
# A feature flag to enable the subscription module, which was developed by Fusion Center
SUBSCRIPTION_MODULE_ACTIVE = False
SUBSCRIPTION_MODULE_PLATFORM_URL = "https://www.eclecticiq.com/external/"
 
# Max number of items sent to Elasticsearch during bulk requests
ES_BULK_CHUNK_SIZE = 5000
# Max size of the request body of Elasticsearch requests
ES_BULK_PAYLOAD_MB = 5
 
# Observables retention policies are broken. Disabled by default.
DISABLE_OBSERVABLES_RETENTION_POLICIES = True
 
# History events
WRITE_HISTORY_EVENTS = False