About entity rules

When you ingest large quantities of data, you are likely to introduce noise that can clutter your database.

Noisy data can make analysis more time-consuming and labor-intensive.

Sifting through a large volume of data in which meaningful information is mixed with data that has no intelligence value slows down analysts' decision-making and makes it more error-prone.

This affects, among other things, the timeliness of prevention and response.

Entity and observable rules are highly customizable to give you granular control over your data.

For example, you can create rules to target specific entities or observables from predefined data sources, and then automatically add them to a detection or prevention system, or mark them for exclusion to reduce data noise.

Entity rules enable you to automatically:

  • Assign taxonomy tags.

  • Add entities to a dataset as a post-processing step after completing ingestion.

  • Set entity aliases.

  • Merge almost identical versions of an entity.

Entity rules are automatically triggered in the following cases:

  • When deduplicating entities.

  • When merging multiple entities into a single entity.

  • When attaching a new data source to, or when removing an existing data source from, an entity.

  • When setting an alias for an entity title.

  • When adding, removing, or updating entity relationships.


About rule execution and entity changes

  • Changes to the data section of an entity create a new version of the entity.

    They also add a new log entry to the entity history to record the changes.

  • Changes to the meta section of an entity do not create a new version of the entity.

    However, they do update the timestamp value of the last_updated_at database field.
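
The difference between the two kinds of change can be pictured with a minimal sketch. The Entity class, its attributes, and both helper functions are hypothetical names chosen for illustration; they are not the platform's schema or API. The sketch leaves the timestamp untouched on data changes only because this section does not describe that case.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entity:
    """Hypothetical stand-in for a stored entity; not the platform schema."""
    data: dict
    meta: dict
    version: int = 1
    history: list = field(default_factory=list)
    last_updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def apply_data_change(entity: Entity, changes: dict) -> None:
    """Change the data section: new entity version plus a history entry."""
    entity.data.update(changes)
    entity.version += 1
    entity.history.append({"version": entity.version, "changes": dict(changes)})
    # Assumption: timestamp handling for data changes is not covered here,
    # so this sketch does not modify last_updated_at.


def apply_meta_change(entity: Entity, changes: dict, now: datetime) -> None:
    """Change the meta section: same version, refreshed timestamp."""
    entity.meta.update(changes)
    entity.last_updated_at = now
```

For example, a rule that assigns a taxonomy tag would fall under apply_meta_change in this model, assuming tags are stored in the meta section: the version number stays the same, but the timestamp moves forward.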

About last_updated_at and outgoing feeds

  • Update strategies rely on the last_updated_at database field to identify entities whose timestamp value was updated since the previous execution of the outgoing feed.

    Entities with a more recent timestamp value compared to the previous execution of the outgoing feed are packaged and included in the published content of the outgoing feed.
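
The selection step described above can be sketched as a simple timestamp comparison. This is an illustration only: select_for_feed, the dictionary layout, and the sample identifiers are assumptions, not the platform's implementation.

```python
from datetime import datetime, timezone


def select_for_feed(entities, previous_run_at):
    """Return entities whose last_updated_at is later than the previous run."""
    return [e for e in entities if e["last_updated_at"] > previous_run_at]


# Hypothetical sample data: the feed last ran on 10 January.
previous_run_at = datetime(2024, 1, 10, tzinfo=timezone.utc)
entities = [
    {"id": "entity-1", "last_updated_at": datetime(2024, 1, 9, tzinfo=timezone.utc)},
    {"id": "entity-2", "last_updated_at": datetime(2024, 1, 11, tzinfo=timezone.utc)},
]

# Only entity-2 was updated after the previous run, so only it is selected.
selected = select_for_feed(entities, previous_run_at)
```

Under this model, a rule that changes only the meta section of an entity still moves its timestamp forward, so the entity would be picked up again by the next feed execution.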