Check ingestion performance#
To identify performance issues as part of a troubleshooting process, Intelligence Center administrators can inspect ingestion metrics to understand how long background ingestion tasks wait before execution, how long it takes to execute them, and if the execution outcome is successful or not.
About ingestion performance metrics#
As part of their monitoring and maintenance tasks, Intelligence Center administrators check ingestion performance to retrieve information that can help them answer questions such as:
Is new data coming in?
How fast is EclecticIQ Intelligence Center ingesting data?
How long does it take to process all queues and to complete ingestion?
Are incoming feeds working as expected, or are any slow feeds creating potential bottlenecks?
Is it advisable to increase the number of concurrent ingestion workers to to boost data throughput and to reduce ingest time?
Are system resources coping with the workload, or is it advisable to scale the system up or out?
Ingestion metrics provide data that can help answer questions such as these, and to make informed decisions about which courses of actions to pursue.
In EclecticIQ Intelligence Center, it is possible to inspect these metrics in Kibana, where you can submit search queries to the statsite* Elasticsearch indices .
To measure ingestion performance, you can query the statsite* indices with specific metrics:
Search for this metric… |
…to answer this question |
---|---|
|
How long does it take to execute tasks, based on:
|
|
How long are tasks pending, based on:
|
|
What is the size of batches for tasks that support batching, based on:
|
|
How many packages a queue holds, based on:
|
|
How many times are tasks deduplicated, based on:
|
|
How many tasks failed or were retried over time, based on:
|
|
How many tasks completed successfully or were retried over time, based on:
|
View Statsite logs in Kibana#
To view statsite*
index logs, access Kibana.
Access Kibana#
To access Kibana:
In the web browser address bar enter a URL with the following format:
https://${platform_host_name}/private/kibana/app/kibana#
Keep the trailing #
Example:
https://eclecticiq.platform.org/private/kibana/app/kibana#
Select the statsite*
index#
In Kibana select Discover.
In the Discover view select the following index:
- statsite*: it collects metrics about ingested and
received packages, invalid or not well-formed lines in the ingested packages, as well as ingestion speed and performance.
You can search for specific subsets by entering key/value pairs in the search input field.
Example:
grp:packets_received; grp:bad_lines_seen
To adjust the time interval, in the top-right corner click the clock icon , and choose an appropriate time range for the search.
Search the statsite* metrics#
In Kibana, Statsite metrics live in the ingestion namespace.
Within this functionality area, you can search for ingestion-related metrics using filters such as group, target, and action, to look for a specific metric (the action) measuring a task (the target), which controls ingestion for an incoming feed with a designated priority level (the group).
Search filters |
Description |
---|---|
|
Filters the scope. The ingestion namespace groups logs and metrics related to Intelligence Center ingestion. |
|
Filters functionality areas inside the namespace. Inside ingestion, this filter groups incoming feeds and their designated priority level. |
|
Filters tasks, workers, or services responsible for the functionality areas specified in the grp filter. For example, tasks that ingest and process packages, process observables and enrichments, and finalize the ingestion process. |
|
Filters the type of metrics used to measure an outcome. Available act metrics types:
|
|
Filters how metrics are expressed. Allowed values:
|
As a rule of thumb, start from a relatively loose filter, and then start drilling down as needed, based on the search results Kibana returns.
Examples#
Add a filter to search the statsite* index for any ingestion metrics:
Click Add a filter +.
- In the filter editor, select the following filter,
operator, and value:
ns
is
ingestion
Click Save.
Add a filter to search the statsite* index for ingestion metrics about waiting time in ingestion queues:
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
ns
is
ingestion
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
act
is
wait_time
Click Save.
Add a filter to search the statsite* index for ingestion metrics about waiting time in ingestion queues related to the task that is responsible for package ingestion:
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
ns
is
ingestion
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
act
is
wait_time
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
tgt
is
ingest_blob_task
Click Save.
Add a filter to search the statsite* index for ingestion metrics about waiting time in ingestion queues related to the task that is responsible for package ingestion, and a specific feed with ID 2 and priority level set to 100:
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
ns
is
ingestion
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
act
is
wait_time
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
tgt
is
ingest_blob_task
Click Add a filter +.
In the filter editor, select the following filter, operator, and value:
grp
is
f2-p100
Click Save.
Visualize metrics#
Metrics indicators actually stored and available for lookup may vary depending on the host environment and its configuration.
Create metrics visualizations#
To explore metrics currently available in your system:
In Kibana select Visualize.
In the New Visualization dialog click the Data Table visualization type.
In the Choose search source view click the statsite* search index.
In the New Visualization, statsite* view, Data tab, under Buckets, Select buckets type click Split Rows.
From the Aggregation drop-down menu select Terms.
From the Field drop-down menu select a field from the selected search index.
For example, select
ns
.To adjust the time interval, in the top-right corner click the clock icon , and choose an appropriate time range for the search.
The table is populated with the top 5 ns
values in the
statsite* search index.
By default, the table view includes:
The top 5 values for the selected
ns
namespace term.The total number of metrics available for each namespace value.
To narrow down the search filter and to start drilling down to explore more specific metrics, add sub-buckets; that is, additional filters based on other JSON fields.
To list more specific metrics currently available in your system:
In the current statsite* view, Data tab, under Buckets click Add sub-buckets.
Under Select buckets type click Split Rows.
From the Aggregation drop-down menu select Terms.
From the Field drop-down menu select a field from the selected search index.
For example, select
grp
.
The table is populated with the top 5 grp
values for
each top 5 ns
value in the statsite* search index.
By default, the table view includes:
The top 5 values for the selected
ns
andgrp
namespace terms.The total number of metrics available for each
ns
+grp
namespace value pair.
Visualize ingestion metrics#
Create a dedicated visualization in Kibana to view ingestion-specific metrics.
To explore all available metrics that measure data ingestion in EclecticIQ Intelligence Center:
In Kibana select Visualize.
In the New Visualization dialog click the Data Table visualization type.
In the Choose search source view click the statsite* search index.
In the New Visualization, statsite* view, Data tab, under Buckets, Select buckets type click Split Rows.
From the Aggregation drop-down menu select Terms.
From the Field drop-down menu select
act
.In the Size input field enter
100
.Click the Options tab.
In the Per Page input field, set the pagination limit to
100
.To adjust the time interval, in the top-right corner click the clock icon , and choose an appropriate time range for the search.
In the search input field enter
ns:ingestion
.Click Refresh or press ENTER.
The table is populated with the first 100 type of metrics that measure data ingestion performance in EclecticIQ Intelligence Center.
The table view includes:
Up to 100 names defining the type of metrics – currently there are fewer than 100 metrics data points measuring ingestion.
The total number of metrics available for each
act
type of metrics.
Visualize processing time per incoming feed#
Create a dedicated visualization in Kibana to review how much time incoming feed ingestion workers spend on ingesting feed data.
To review how much time ingestion workers spend on ingesting incoming data, and how much time ingestion workers spend on specific incoming feeds:
In Kibana select Visualize.
In the New Visualization dialog click the Data Table visualization type.
In the Choose search source view click the statsite* search index.
In the New Visualization, statsite* view, Data tab, under Buckets, Select buckets type click X-Axis.
From the Aggregation drop-down menu select Date Histogram.
From the Field drop-down menu select
@timestamp
.From the Interval drop-down menu select the desired time interval to use as a data bucket for the axis.
For example: Auto.
In the New Visualization, statsite* view, Data tab, under Metrics click Y-Axis.
From the Aggregation drop-down menu select Sum.
From the Field drop-down menu select
sum
.To adjust the time interval, in the top-right corner click the clock icon , and choose an appropriate time range for the search.
In the search input field enter
act:"run_time"
.Click Refresh or press ENTER.
The histogram area is populated with a visual representation of the time spent ingesting data – Y axis – for the time interval bucket defined for the X axis.
You can further refine the histogram view to display the time spent ingesting data per incoming feed:
In the New Visualization, statsite* view, Data tab, under Buckets click Add sub-buckets.
Under Select buckets type select Split Series.
From the Sub Aggregation drop-down menu select Terms.
From the Field drop-down menu select
grp
.Set the Size value depending on the number of incoming feeds that are ingesting data concurrently, and whose processing time you want to inspect.
For example: 20
The histogram area is populated with a visual representation of the time spent ingesting data by each active incoming feed – Y axis – for the time interval bucket defined for the X axis.
Incoming feeds on the histogram are color-coded.
Feed labels include the incoming feed ID and the processing priority assigned to the feed.
Format:
f${feed ID}-p${priority value}
Example:
f42-p100
refers to an incoming feed with ID42
and with priority100
.
Retrieve a feed ID#
To retrieve an incoming feed ID:
In the side navigation bar, go to Data configuration > Incoming feeds.
In the Incoming feeds overview, click anywhere in the row corresponding to the incoming feed whose ID you want to retrieve.
In the web browser address bar, the URL of the active Intelligence Center view is similar to the following example:
https://${platform_host_name}/main/configuration/incoming-feeds?detail=42
In the URL, the
detail
URL parameter holds the feed ID.In the example, the ID value is
42
.
About incoming feed priority#
Incoming feeds can be assigned a priority level to allocate more or fewer Intelligence Center resources during the ingestion process.
Feed priority is an integer within a range:
The lowest value is zero.
The highest value is set in
/etc/eclecticiq/platform_settings.py
:INGESTION_FEED_PRIORITY = 100
By default, ingestion feed priority value is set to
100
.