Run diagnostic tests#

About the diagnostic suite#

The eiq-platform diagnose set of commands automates running basic diagnostic tests and collecting test results to examine system-level setup and operation such as Intelligence Center configuration, database availability, and basic system health (for example: free disk space).

This diagnostic suite of commands runs the checks, collects Intelligence Center logs, and produces a system report.

Run inside venv#

The eiq-platform command line utility is available in the Intelligence Center Python virtual environment: activate it and, if necessary, import EclecticIQ Intelligence Center environment variables.

  • Open a terminal session.

  • In the terminal, log in to EclecticIQ Intelligence Center through SSH:

    # EclecticIQ Intelligence Center user name must have admin access rights
    ssh ${platform_user_name}@${platform_host_name}
    
    # Or:
    ssh ${platform_user_name}@${platform_ip_address}
    
    # Example:
    ssh [email protected]
    
    # Or:
    ssh [email protected]
    
  • Obtain root-level access by running sudo -i:

    # Root-access login shell
    sudo -i
    
  • To access resources as a different user than the currently active one, append -u:

    # Open a root login shell for the currently logged-in user
    sudo -i
    
    # Open a login shell as a different user
    sudo -i -u ${user_name}
    
    # Run a command as a different user, in that user's login environment
    sudo -i -u ${user_name} ${command} ${options}
    
  • Activate the Python virtual environment for EclecticIQ Intelligence Center:

    source /opt/eclecticiq-platform-backend/bin/activate
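
As a quick sanity check after activation, the step can be wrapped in a small shell function. This is only a sketch: the function name activate_eiq_venv and its return codes are hypothetical, and only the virtual environment path and the eiq-platform command name come from this page.

```shell
# Hypothetical helper (not part of the product): activate the virtual
# environment and confirm that the eiq-platform utility is reachable.
activate_eiq_venv() {
    venv_dir="${1:-/opt/eclecticiq-platform-backend}"

    if [ ! -f "$venv_dir/bin/activate" ]; then
        echo "No virtual environment found in $venv_dir" >&2
        return 1
    fi

    # Same effect as: source "$venv_dir/bin/activate"
    . "$venv_dir/bin/activate"

    if ! command -v eiq-platform >/dev/null 2>&1; then
        echo "eiq-platform is not on PATH after activation" >&2
        return 2
    fi
    echo "eiq-platform found at $(command -v eiq-platform)"
}
```

Call activate_eiq_venv without arguments to use the default path shown above, or pass a different directory as the first argument.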
    

Run from the command line#

eiq-platform diagnose is the top-level umbrella for the command set. It includes the following commands:

| Diagnostic test command | Description |
|---|---|
| eiq-platform diagnose run | Runs the diagnostic test suite. |
| eiq-platform diagnose logs | Produces a tarball containing tailed Intelligence Center logs. |
| eiq-platform diagnose report | Produces a tarball containing sosreport reports. |
| eiq-platform diagnose query | Produces a report with the results of a set of Intelligence Center health queries. |
| eiq-platform diagnose elasticsearch | Produces basic metrics about the health of your ElasticSearch configuration. |

To view the available switches for a specific command, append --help to it.

Run the diagnostic suite#

eiq-platform diagnose run is a diagnostic command to run basic tests on EclecticIQ Intelligence Center for troubleshooting purposes.

Command#

Start a diagnostic test suite run:

eiq-platform diagnose run

Switches#

| Switch | Type | Description | Required | Default |
|---|---|---|---|---|
| -f, --formatter | text | Defines the output format of the report: colored, json, or print. | Yes | colored |
| -d, --disk-free | integer | Defines the free disk space threshold. The value is expressed as a percentage. If the actual free space available is lower than this value, the test suite aborts and fails. | Yes | 10 |
| -t, --timeout | integer | Defines the maximum amount of time each check can run before it returns a timeout error. The value is expressed in seconds. | Yes | 20 |

Result#

The diagnostic suite runs the following diagnostic test cases from the command line:

  • Database availability:

    • PostgreSQL, Redis, Elasticsearch.

    • The database connection test case uses a predefined 5-second timeout to avoid hanging.

  • Free disk space and average disk load.

    • The free disk test checks the following mount point locations:

      • /var/log

      • /media/elasticsearch

      • /media/pgsql

    • If a file system mounted to any scanned mount point has less than 10% free space, the test aborts.

    • If the 15-minute average disk load exceeds twice the number of logical CPUs, the test fails.

  • Operational states of the systemd-managed processes, including ingestion tasks and Celery workers.

  • Overdue Celery and ingestion tasks.
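
The two thresholds in the list above can be sketched as small shell helpers. These are not the product's implementation: the function names are hypothetical, and the sketch assumes that the "average disk load" corresponds to the third field of /proc/loadavg (the 15-minute system load average).

```shell
# Hypothetical helpers that mirror the documented thresholds.

# Succeeds when free space (percent) is at or above the threshold (percent).
free_space_ok() {
    [ "$1" -ge "$2" ]
}

# Succeeds when the 15-minute load average is at most twice the CPU count.
load_ok() {
    awk -v avg="$1" -v cpus="$2" 'BEGIN { exit !(avg <= 2 * cpus) }'
}

# Prints the free-space percentage of the file system holding a path.
free_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Checks the documented mount points and the load average on this host.
# A mount point that does not exist is reported as low on space.
check_host() {
    threshold="${1:-10}"
    for mount in /var/log /media/elasticsearch /media/pgsql; do
        free_space_ok "$(free_percent "$mount")" "$threshold" ||
            echo "Low disk space on $mount"
    done
    load_ok "$(cut -d ' ' -f 3 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)" ||
        echo "High 15-minute load average"
}
```

Running check_host 10 on the Intelligence Center host prints one line per problem it finds and nothing when both checks pass.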

If a test fails, the command returns a non-zero exit code.

By default, the report with the results is output to stdout.
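
Because the command signals failure through its exit code and writes the report to stdout, it can be scripted. The sketch below is one possible wrapper; the function name run_diagnostics and the EIQ_CLI variable are hypothetical conveniences, while the switches are the ones documented above.

```shell
# EIQ_CLI is a hypothetical override that lets you point the wrapper at a
# different executable; by default it is the documented eiq-platform utility.
EIQ_CLI="${EIQ_CLI:-eiq-platform}"

# Run the diagnostic suite, save the JSON report, and propagate the exit code.
run_diagnostics() {
    report_file="$1"
    if "$EIQ_CLI" diagnose run --formatter json --disk-free 10 --timeout 20 \
        > "$report_file"; then
        echo "All diagnostic checks passed"
    else
        status=$?
        echo "Diagnostics failed with exit code $status; see $report_file" >&2
        return "$status"
    fi
}
```

For example, run_diagnostics /tmp/eiq-diagnostics.json keeps the report for later inspection and returns a non-zero status that a scheduler or monitoring job can act on.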

Get Intelligence Center logs#

eiq-platform diagnose logs is a diagnostic command to collect Intelligence Center logs for investigation and troubleshooting purposes.

Command#

Obtain a tarball archive containing EclecticIQ Intelligence Center logs:

eiq-platform diagnose logs

Switches#

| Switch | Type | Description | Required | Default |
|---|---|---|---|---|
| -d, --log-dir | text | Defines the source directory the command collects logs from. | Yes | /var/log/eclecticiq |
| -o, --result-tarball | text | Defines the target directory and the file name for the resulting tar archive containing the collected logs. | Yes | logs.tar.xz |
| -n, --lines | integer | Number of lines, starting from the tail end of each log file, to include in the generated output. | Yes | 1000 |

Result#

eiq-platform diagnose logs takes the last N lines (by default, 1000) from each log file found in the specified source log directory (by default, /var/log/eclecticiq/) and adds the resulting tailed log files to an archive (by default, logs.tar.xz).

During execution, the command warns you that some of the collected logs may contain sensitive data that you may not want to expose, for example database passwords or IP addresses.

Before distributing the content of the collected logs, it is a good idea to review it to make sure no sensitive content is exposed.
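
A rough first pass over the collected logs can be automated before a manual review. The sketch below assumes the archive has been extracted to a directory; the function names, the EIQ_CLI variable, and the search patterns are hypothetical and far from exhaustive, so treat a clean result as a hint rather than a guarantee.

```shell
# EIQ_CLI is a hypothetical override; by default it is eiq-platform.
EIQ_CLI="${EIQ_CLI:-eiq-platform}"

# Collect the last N lines of each log into an archive, using the
# switches documented above.
collect_logs() {
    "$EIQ_CLI" diagnose logs --log-dir /var/log/eclecticiq \
        --result-tarball "$1" --lines "${2:-1000}"
}

# List lines in an extracted log directory that look sensitive:
# password-like keywords and IPv4-shaped strings.
scan_for_sensitive() {
    grep -rEin 'passw(or)?d|secret|token|([0-9]{1,3}\.){3}[0-9]{1,3}' "$1"
}

# Example review flow (not executed here):
#   collect_logs /tmp/logs.tar.xz 500
#   mkdir /tmp/logs-review && tar -xJf /tmp/logs.tar.xz -C /tmp/logs-review
#   scan_for_sensitive /tmp/logs-review
```

Each match is printed as file:line:text, which makes it straightforward to redact the affected lines before sharing the archive.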

Get Intelligence Center profiles#

eiq-platform diagnose profile is a diagnostic command to collect Intelligence Center profiles for investigation and troubleshooting purposes.

Command#

eiq-platform diagnose profile [OPTIONS] SERVICE

Switches#

| Switch | Type | Description | Required | Default |
|---|---|---|---|---|
| -f, --format | text | Defines the output format of the report: speedscope, flamegraph, or raw. | No | flamegraph |
| -o, --output | text | Defines the target directory and the file name for the resulting file. | No | current/directory/profile_[profiled service]_[ISO 8601 Timestamp].svg |
| -d, --duration | integer | The number of seconds to sample for. | No | unlimited |
| -r, --rate | integer | The number of samples to collect per second. | No | 100 |
| --background | n/a | Run in the background. | No | |
| --idle / --no-idle | n/a | Include stack traces for idle threads. | No | idle |

Profileable services#

  • platform-backend-ingestion@[worker number]

  • platform-backend-opentaxii

  • platform-backend-scheduler

  • platform-backend-searchindex

  • platform-backend-web

  • platform-backend-worker@discovery-priority

  • platform-backend-worker@discovery

  • platform-backend-worker@enrichers-priority

  • platform-backend-worker@enrichers

  • platform-backend-worker@entity-rules-priority

  • platform-backend-worker@extract-rules-priority

  • platform-backend-worker@incoming-transports-priority

  • platform-backend-worker@incoming-transports

  • platform-backend-worker@outgoing-feeds-priority

  • platform-backend-worker@outgoing-feeds

  • platform-backend-worker@outgoing-transports-priority

  • platform-backend-worker@outgoing-transports

  • platform-backend-worker@reindexing

  • platform-backend-worker@retention-policies-priority

  • platform-backend-worker@retention-policies

  • platform-backend-worker@synchronization

  • platform-backend-worker@utilities-priority

  • platform-backend-worker@utilities

Result#

A py-spy process records a profile of the targeted service. It attaches to the running service without requiring changes to the service code, and it writes the profiling information to a file in the chosen format.
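
A bounded profiling run could be scripted as follows. This is a sketch that combines the switches from the table above; the function name profile_service, its argument order, and the EIQ_CLI variable are hypothetical.

```shell
# EIQ_CLI is a hypothetical override; by default it is eiq-platform.
EIQ_CLI="${EIQ_CLI:-eiq-platform}"

# Profile one service for a fixed number of seconds and write a flame graph.
#   $1: service name, for example platform-backend-web
#   $2: duration in seconds (this sketch defaults to 60)
#   $3: output file (this sketch defaults to profile_<service>.svg)
profile_service() {
    service="$1"
    seconds="${2:-60}"
    out_file="${3:-profile_${service}.svg}"
    "$EIQ_CLI" diagnose profile --format flamegraph --duration "$seconds" \
        --rate 100 --output "$out_file" "$service"
}
```

For example, profile_service platform-backend-web 30 /tmp/web.svg samples the web service for 30 seconds. Setting an explicit duration avoids the unlimited default, which would otherwise keep sampling until interrupted.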

Get sosreports#

eiq-platform diagnose report is a diagnostic command to collect sosreport reports holding diagnostic and support data for investigation and troubleshooting purposes.

Command#

Obtain a tarball archive containing sosreport reports:

eiq-platform diagnose report

Switches#

| Switch | Type | Description | Required | Default |
|---|---|---|---|---|
| -k, --plugin-option | text | Defines one or more specific sosreport plugins. You configure default plugin options in the sos.conf file. | Yes | n/a |
| -u, --no-sudo | n/a | Do not switch to sudo. By default, the command tries to run with sudo privileges because all built-in sosreport plugins require root permissions. If you append this switch, command execution skips all built-in sosreport plugins and runs only plugins that do not need root-level access. | Yes | False |

Result#

eiq-platform diagnose report collects system data, including basic OS load metrics.

By default, the report covers up to the previous 7 days of data. You can change the time interval by editing /etc/sysstat/sysstat.

The generated report tarball name uses the following format: sosreport-HOST.CASE_ID-DATE.tar.xz.

The archive includes an HTML index file to make it easier to explore the content.

Default name and location: sosreport-HOST.CASE_ID-DATE/sos_reports/sos.html.

The generated report archive includes sar files: binary databases and plain text output.

You can graphically render the binary file content with tools such as Sarjitsu or sarvant.
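
Given the naming format above, the location of the HTML index can be derived from the archive name. The helper below is a hypothetical sketch; it assumes the archive extracts into a top-level directory named after the tarball, as the default location above suggests.

```shell
# Hypothetical helper: print the expected path of the HTML index for a
# given report archive, relative to the directory where it is extracted.
sos_index_path() {
    archive_name=$(basename "$1")
    echo "${archive_name%.tar.xz}/sos_reports/sos.html"
}

# Example flow (not executed here):
#   tar -xJf sosreport-HOST.CASE_ID-DATE.tar.xz
#   sos_index_path sosreport-HOST.CASE_ID-DATE.tar.xz
```

Open the printed path in a browser to explore the report content.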

Get Intelligence Center health#

eiq-platform diagnose query allows selecting and running a specific custom sosreport plugin query, and it prints the output to the terminal.

Command#

Run a specific custom sosreport plugin query:

eiq-platform diagnose query ${query_name} -k ${query_param}=${query_param_value}

Switches#

| Switch | Type | Description | Required | Default |
|---|---|---|---|---|
| -l, --list | list | Returns a list with the available custom queries that you can run. | Yes | n/a |
| -k ${query_param} | Depends on the selected query type | Defines one or more optional configuration parameters for the specified plugin query, along with their corresponding values. Format: key/value pairs, for example limit=20. Each pair is always preceded by -k, for example -k limit=20. | Yes | n/a |

Obtain a list of the available custom plugins that you can run with query:

eiq-platform diagnose query -l

Currently available plugin queries:

| Custom plugin query | Description | Parameters | Type | Default value | Example |
|---|---|---|---|---|---|
| eiq_task_run_status | Returns the status of task runs in the past N days, where N represents the specified number of days. It helps detect task processing anomalies such as an unusually high number of pending or failed task runs. | days | integer | 7 | eiq-platform diagnose query eiq_task_run_status -k days=7 |
| eiq_hotspot | Returns hotspots detected in EclecticIQ Intelligence Center ingestion tables. It helps identify potential outliers such as entities and observables with a very high number of connections. The output contains only entities and observables with a higher number of connections than the value assigned to threshold. The output includes only the top N entities, where N is the maximum number of matches to include in the output as defined in limit. | threshold, limit | integer, integer | 20000, 20 | eiq-platform diagnose query eiq_hotspot -k threshold=20000 -k limit=20 |

Result#

eiq-platform diagnose query ${query_name} prints information to the terminal that can help you troubleshoot hanging or failing tasks, as well as decreased database performance caused by hotspots or heavy data access.
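
When a query takes several parameters, each key/value pair needs its own -k. The sketch below builds the repeated -k switches from a plain list of pairs; the function name run_query and the EIQ_CLI variable are hypothetical, and the sketch assumes that parameter values contain no spaces.

```shell
# EIQ_CLI is a hypothetical override; by default it is eiq-platform.
EIQ_CLI="${EIQ_CLI:-eiq-platform}"

# Run a custom plugin query with any number of key=value parameters.
#   $1: query name, for example eiq_hotspot
#   remaining arguments: key=value pairs, each passed with its own -k
run_query() {
    query_name="$1"
    shift
    args=""
    for pair in "$@"; do
        args="$args -k $pair"
    done
    # $args is left unquoted on purpose so that each "-k key=value"
    # splits into two separate words.
    "$EIQ_CLI" diagnose query "$query_name" $args
}
```

For example, run_query eiq_hotspot threshold=20000 limit=20 is equivalent to the eiq_hotspot example in the table above.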

Get ElasticSearch configuration health#

eiq-platform diagnose elasticsearch is a diagnostic tool to obtain information about the health of your ElasticSearch configuration, especially in relation to search performance.

The tool differs from ElasticSearch’s own diagnostic tool in that it provides only the most relevant metrics in summary form, rather than a wealth of data that can be time-consuming to analyze.

ElasticSearch optimization is a complex task and this diagnostics tool does not aim to give you all the answers to your performance issues. However, it does highlight the most common causes of sub-optimal performance and suggests which ones to look at first.

Note

The diagnostics tool gives you a snapshot in time based on how much data you currently store in your EclecticIQ Intelligence Center instance.

As your Intelligence Center instance ingests more data, you will need to run the tool periodically in order to help ensure that your ElasticSearch configuration is adequate for the amount of data you store.

Note

Some of the links in this section refer to ElasticSearch documentation. Check the ElasticSearch documentation for the relevant articles that apply to your specific version of ElasticSearch.

Command#

Obtain basic metrics on the health of your ElasticSearch configuration:

eiq-platform diagnose elasticsearch

Switches#

This command does not have any switches.

Result#

Cluster health#

A cluster’s health is determined by the extent to which all of its shards have been allocated.
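
To cross-check this metric directly against the cluster, you can query the Elasticsearch cat health API. The sketch below is not part of the diagnostic tool: the function names and the ES_URL default (a local, unauthenticated node) are assumptions, while the green, yellow, and red statuses are the standard Elasticsearch health values.

```shell
# Assumption: Elasticsearch listens locally without authentication.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the current cluster status: green, yellow, or red.
cluster_status() {
    curl -s "$ES_URL/_cat/health?h=status" | tr -d '[:space:]'
}

# Explain what a status means in terms of shard allocation.
describe_status() {
    case "$1" in
        green)  echo "all primary and replica shards are allocated" ;;
        yellow) echo "all primary shards are allocated, but some replicas are not" ;;
        red)    echo "at least one primary shard is not allocated" ;;
        *)      echo "unknown status: $1" ;;
    esac
}

# Example (not executed here):
#   describe_status "$(cluster_status)"
```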


Cluster settings#

Cluster settings checks that ElasticSearch is properly configured for EclecticIQ Intelligence Center.


Disk space#

During Intelligence Center upgrades, each ElasticSearch cluster node requires at least 50% free disk space. If the amount of free space available to ElasticSearch falls below 15% your cluster’s master node may have difficulty allocating shards and controlling normal cluster operations.
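
The two thresholds above (50% free for upgrades, 15% free for normal operation) can be checked per node with the Elasticsearch cat allocation API. This is a sketch, not the diagnostic tool's own logic: the function names, the labels they print, and the ES_URL default are hypothetical.

```shell
# Assumption: Elasticsearch listens locally without authentication.
ES_URL="${ES_URL:-http://localhost:9200}"

# Classify a free-space percentage against the thresholds described above.
classify_free_space() {
    if [ "$1" -lt 15 ]; then
        echo "critical"
    elif [ "$1" -lt 50 ]; then
        echo "insufficient-for-upgrade"
    else
        echo "ok"
    fi
}

# Print one line per node. disk.percent is the percentage of disk in use,
# so free space is 100 minus that value.
es_disk_report() {
    curl -s "$ES_URL/_cat/allocation?h=node,disk.percent" |
        while read -r node used; do
            case "$used" in
                '' | *[!0-9]*) continue ;; # skip rows without a numeric value
            esac
            echo "$node: $(classify_free_space $((100 - used)))"
        done
}
```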


Shard size#

The size of a node’s shards influences the speed with which ElasticSearch recovers from failure, as well as search and indexing performance.


Shard number#

The number of shards affects the amount of memory you need for the JVM heap. You may have too many shards.


JVM heap size#

The size of the JVM heap affects the efficiency with which ElasticSearch’s cached memory is managed. If the heap is too small, search performance will suffer from excessive calls to external resources; if it is too big, search performance will suffer from the need to manage all that memory.


Shared resources#

For this metric, we have assumed that you are running ElasticSearch on a server of its own and have allocated at least 40% of the server’s total memory to ElasticSearch. You should then have capacity for up to five million entities and observables. This metric is intended to make you aware of the need to follow ElasticSearch sizing and capacity planning guidelines as your data grows.


Read-only indexes#

Elasticsearch automatically imposes a read-only block on an index when disk usage exceeds the flood-stage watermark controlled by cluster.routing.allocation.disk.watermark.flood_stage.

The diagnostic tool’s “disk space” metric indicates whether free disk space has fallen too low and by how much. If it has, free up space so that Elasticsearch can write to the indexes concerned.

Caution

If an index block has been set manually, remove it by executing the following command:

PUT /<INDEX NAME>/_settings
{
  "index.blocks.read_only_allow_delete": false
}
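
If you do not have a console for sending the request above, the same call can be made with curl. This is a sketch: the function name clear_index_block and the ES_URL default (a local, unauthenticated node) are assumptions, so adjust the URL and add credentials as your setup requires.

```shell
# Assumption: Elasticsearch listens locally without authentication.
ES_URL="${ES_URL:-http://localhost:9200}"

# Send the settings update shown above for a single index.
#   $1: index name
clear_index_block() {
    index_name="$1"
    curl -s -X PUT "$ES_URL/$index_name/_settings" \
        -H 'Content-Type: application/json' \
        -d '{ "index.blocks.read_only_allow_delete": false }'
}

# Example (not executed here):
#   clear_index_block my-index
```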


Migration execution#

If this check fails, the Elasticsearch migration may not have been performed properly, usually because the procedure for migrating the EclecticIQ databases was not followed correctly.

Repeat the database migration procedure, following the instructions exactly, and then run eiq-platform search upgrade again.
