Address graph ingestion issues
Neo4j is disabled by default from IC 2.12.0 onward. Graphs use a new graph API based on PostgreSQL and Elasticsearch instead. For more information, see Update the settings.
During ingestion, incoming packages containing key values exceeding 4036 bytes in size fail to be ingested because the default Neo4j 3.5.x native-btree-1.0 index provider cannot process key sizes larger than 4036 bytes.
Issue
The platform may fail to ingest packages with very long key values. This scenario can occur with long URI strings containing concatenated URI parameters such as tokens or queries.
The error traceback contains the following error message:
Exception: Ingestion Exception: Property value size: ${integer} of ${key value} is too large to index into this particular index. Please see index documentation for limitations.
The error message originates from the graph application, and it is included in the platform traceback as is.
Impact
EclecticIQ Platform 2.7.x and earlier with Neo4j 3.5.x.
EclecticIQ Platform 2.7.x and earlier with Neo4j 3.5.x, before upgrading to EclecticIQ Platform 2.8.x.
The issue is solved in EclecticIQ Platform 2.8.x and later.
Cause
The default index provider for Neo4j 3.5.x is native-btree-1.0.
The native B+Tree index has a key size limit of 4036 bytes.
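To check whether a given value, such as a long URI, is likely to hit this limit, you can measure its size in bytes. The following is a minimal sketch, not part of the platform: the function names key_size_bytes and exceeds_key_limit are hypothetical, and the check assumes that the raw UTF-8 byte length of the value is a reasonable approximation of the key size that Neo4j indexes.

```shell
# Hypothetical helpers: compare the byte size of a value with the
# native-btree-1.0 key size limit stated above (4036 bytes).
KEY_LIMIT_BYTES=4036

key_size_bytes() {
    # wc -c counts bytes, not characters, so multi-byte UTF-8
    # characters contribute more than one byte each.
    printf '%s' "$1" | wc -c | tr -d ' '
}

exceeds_key_limit() {
    # Succeeds (exit status 0) when the value is larger than the limit.
    [ "$(key_size_bytes "$1")" -gt "$KEY_LIMIT_BYTES" ]
}
```

For example, exceeds_key_limit "${uri}" && echo "too large for native-btree-1.0" prints the message only when the value of the hypothetical ${uri} variable exceeds 4036 bytes.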
Mitigation
To ingest packages with key values larger than 4036 bytes, you need to enable support for larger key sizes by setting a different index provider for Neo4j 3.5.x.
An alternative index provider is lucene-1.0, whose key size limit is 32766 bytes.
EclecticIQ Platform 2.8.x and later solve the issue by setting lucene-1.0 as the default index provider for Neo4j 3.5.x.
To change native-btree-1.0 to lucene-1.0 as the default index provider for Neo4j 3.5.x in EclecticIQ Platform 2.7.x and earlier, you need to carry out a manual procedure to:
Change index provider in Neo4j
Reindex the graph database with the new index provider
Reingest failed packages, so that the new index provider can correctly process them.
How to set a different index provider for Neo4j
The procedure to change native-btree-1.0 to lucene-1.0 as the default index provider for Neo4j 3.5.x applies to EclecticIQ Platform 2.7.x and earlier.
Get root-level access
To complete the procedure, you must have root-level access in the server hosting the platform, and in the server hosting Neo4j.
Obtain root-level access by running sudo -i:
# Root-access login shell
sudo -i
To access resources as a different user than the currently active one, append -u:
# Open a root-level login shell for the currently logged in user
sudo -i
# Open a login shell as a different user
sudo -i -u ${user_name}
# Run a command as a different user
sudo -i -u ${user_name} ${command} ${options}
Stop backend services and retrieve the Neo4j credentials
In the server hosting the platform:
Stop all EclecticIQ backend services:
systemctl stop $(systemctl list-units 'eclecticiq*' --no-legend | awk '{print $1}')
Retrieve the user name and password credentials the platform uses to connect to the Neo4j database.
This information is stored on the server hosting the platform, in /etc/eclecticiq/platform_settings.py:
grep 'NEO4J_URL\|NEO4J_USER\|NEO4J_PASSWORD' /etc/eclecticiq/platform_settings.py
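If you prefer to read a single value instead of scanning the grep output, the lookup can be sketched as a small helper. This is an illustration only: the function name get_platform_setting is hypothetical, and it assumes each setting is written on one line as NAME = "value" or NAME = 'value', which may not match every installation.

```shell
# Hypothetical helper: print the value assigned to a setting in a
# Python-style settings file.
# Assumption: one assignment per line, in the form NAME = "value".
get_platform_setting() {
    setting_name=$1
    settings_file=${2:-/etc/eclecticiq/platform_settings.py}
    # Print only the text between the quotes on the matching line.
    sed -n "s/^${setting_name}[[:space:]]*=[[:space:]]*[\"']\(.*\)[\"'][[:space:]]*\$/\1/p" "$settings_file"
}
```

For example, get_platform_setting NEO4J_USER prints the Neo4j user name, provided the file follows the assumed layout.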
Back up and edit the Neo4j configuration
In the server hosting Neo4j:
Back up the current /etc/eclecticiq-neo4j/neo4j.conf Neo4j configuration file:
cd /etc/eclecticiq-neo4j/
cp -p neo4j.conf neo4j.conf.orig
Edit /etc/eclecticiq-neo4j/neo4j.conf to enable the Bolt network protocol and Neo4j Cypher Shell CLI:
# Open Neo4j config file in Vim:
vi /etc/eclecticiq-neo4j/neo4j.conf
# In neo4j.conf enable Bolt and Cypher Shell:
dbms.connector.bolt.enabled=True
dbms.shell.enabled=True
# Save and exit
:wq!
Restart the Neo4j service:
systemctl restart neo4j
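The same configuration change can also be made without an interactive editor. The following sketch is an assumption-laden alternative to the Vim steps, not an official procedure: the function name set_conf_option is hypothetical, it rewrites any existing line for the key (commented out or not) or appends the key if it is missing, and it only suits simple values such as True and False that contain no | or & characters.

```shell
# Hypothetical helper: set key=value in a Neo4j-style configuration file.
# sed -i.bak leaves a copy of the previous file next to it as <file>.bak.
set_conf_option() {
    conf_file=$1
    conf_key=$2
    conf_value=$3
    # Escape the dots in the key so that they match literally.
    conf_key_re=$(printf '%s' "$conf_key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${conf_key_re}=" "$conf_file"; then
        # The key exists, possibly commented out: replace the whole line.
        sed -i.bak -E "s|^[[:space:]]*#?[[:space:]]*${conf_key_re}=.*|${conf_key}=${conf_value}|" "$conf_file"
    else
        # The key is missing: append it at the end of the file.
        printf '%s=%s\n' "$conf_key" "$conf_value" >> "$conf_file"
    fi
}

# Example usage (run as root on the server hosting Neo4j):
#   set_conf_option /etc/eclecticiq-neo4j/neo4j.conf dbms.connector.bolt.enabled True
#   set_conf_option /etc/eclecticiq-neo4j/neo4j.conf dbms.shell.enabled True
#   systemctl restart neo4j
```

Passing False instead of True reverts the two options when the procedure is complete.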
Change Neo4j index provider
In the server hosting Neo4j:
Open a Cypher Shell instance, and connect to the Neo4j database:
/bin/cypher-shell -u ${neo4j_username} -p ${neo4j_password}
In Cypher Shell run the following commands to replace the current index provider with lucene-1.0, and to reindex the graph database with lucene-1.0 as the new index provider:
DROP INDEX ON :Extract(value);
CALL db.createIndex(":Extract(value)", "lucene-1.0");
DROP CONSTRAINT ON ( extract:Extract ) ASSERT extract.id IS UNIQUE;
CALL db.createUniquePropertyConstraint(":Extract(id)", "lucene-1.0");
As a rule of thumb, reindexing the graph database using the new index provider can take about 3-5 minutes per million entities.
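To turn the rule of thumb into a rough planning figure, multiply the number of entities by 3 and by 5 minutes per million. The helper below is a hypothetical sketch (the name estimate_reindex_minutes is not part of any product); it rounds up to whole minutes and only restates the estimate above, so actual durations depend on hardware and data.

```shell
# Hypothetical helper: estimate the reindexing time from the
# 3-5 minutes per million entities rule of thumb.
estimate_reindex_minutes() {
    entity_count=$1
    # Integer arithmetic, rounded up to the next whole minute.
    low_minutes=$(( (entity_count * 3 + 999999) / 1000000 ))
    high_minutes=$(( (entity_count * 5 + 999999) / 1000000 ))
    printf '%s-%s minutes\n' "$low_minutes" "$high_minutes"
}

estimate_reindex_minutes 2500000   # prints "8-13 minutes"
```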
After completing reindexing, list the new indices:
CALL db.indexes();
Response example:
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| description                           | label         | properties       | state    | type                   | provider                               | failureMessage |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "INDEX ON :Extract(kind)"             | "Extract"     | ["kind"]         | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :Extract(platform_id)"      | "Extract"     | ["platform_id"]  | "ONLINE" | "node_label_property"  | {version: "2.0", key: "lucene+native"} | ""             |
| "INDEX ON :Extract(value)"            | "Extract"     | ["value"]        | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(meta.source)"  | "IntelEntity" | ["meta.source"]  | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(meta.stix_id)" | "IntelEntity" | ["meta.stix_id"] | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(sources)"      | "IntelEntity" | ["sources"]      | "ONLINE" | "node_label_property"  | {version: "2.0", key: "lucene+native"} | ""             |
| "INDEX ON :IntelEntity(stix_id)"      | "IntelEntity" | ["stix_id"]      | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(subtype)"      | "IntelEntity" | ["subtype"]      | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(type)"         | "IntelEntity" | ["type"]         | "ONLINE" | "node_label_property"  | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :Extract(id)"               | "Extract"     | ["id"]           | "ONLINE" | "node_unique_property" | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :Extract(uid)"              | "Extract"     | ["uid"]          | "ONLINE" | "node_unique_property" | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :IntelEntity(id)"           | "IntelEntity" | ["id"]           | "ONLINE" | "node_unique_property" | {version: "1.0", key: "lucene"}        | ""             |
| "INDEX ON :Migration(name)"           | "Migration"   | ["name"]         | "ONLINE" | "node_unique_property" | {version: "1.0", key: "lucene"}        | ""             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
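In the response, the provider column of the reindexed entries should show key: "lucene". A quick way to check one index is to filter the output for its description, as in the sketch below. The function name index_uses_lucene is hypothetical, and the commented usage line assumes that Cypher Shell accepts statements piped on standard input.

```shell
# Hypothetical helper: read the output of CALL db.indexes(); on standard
# input and succeed only if the row for the given index description
# reports the lucene provider.
index_uses_lucene() {
    index_description=$1
    # Keep the rows for the index, then look for the provider key.
    # The closing quote prevents a match on "lucene+native".
    grep -F "$index_description" | grep -Fq 'key: "lucene"'
}

# Example usage (assumes Cypher Shell reads statements from standard input):
#   echo 'CALL db.indexes();' \
#     | /bin/cypher-shell -u ${neo4j_username} -p ${neo4j_password} \
#     | index_uses_lucene ':Extract(value)' && echo "Extract(value) uses lucene"
```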
Disable Bolt and Cypher Shell
In the server hosting Neo4j:
After completing the operation, edit /etc/eclecticiq-neo4j/neo4j.conf to disable the Bolt network protocol and Neo4j Cypher Shell CLI:
# Open Neo4j config file in Vim:
vi /etc/eclecticiq-neo4j/neo4j.conf
# In neo4j.conf disable Bolt and Cypher Shell:
dbms.connector.bolt.enabled=False
dbms.shell.enabled=False
# Save and exit
:wq!
Restart the Neo4j service:
systemctl restart neo4j
Start backend services and reingest failed packages
In the server hosting the platform:
Start the platform backend services:
systemctl start eclecticiq-platform-backend-services
Sign in to the platform GUI, and then proceed to reingest failed packages by initiating the action in the corresponding incoming feeds using the options available in the GUI.