[CI] Adding continuous testing for ECS dynamic templates #97901

Merged
merged 19 commits into from
Sep 11, 2023

eyalkoren
Contributor

@eyalkoren eyalkoren commented Jul 24, 2023

Closes #96713

All three test cases are covered:

  • a test document containing all ECS fields with random values in flattened form
  • a test document containing all ECS fields with random values in flattened form, indexed into an index configured with subobjects: false
  • a test document containing all ECS fields with random values in nested/object form
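To illustrate the difference between the flattened and nested/object forms mentioned above, here is a minimal sketch. The helper names `flatten` and `nest` and the sample field values are assumptions made for illustration; they are not taken from the PR's test code.

```python
from typing import Any


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Collapse nested objects into dotted keys, e.g. {"host": {"name": "a"}} -> {"host.name": "a"}."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def nest(flat: dict[str, Any]) -> dict[str, Any]:
    """Expand dotted keys into nested objects (inverse of flatten for non-conflicting keys)."""
    nested: dict[str, Any] = {}
    for path, value in flat.items():
        node = nested
        *parents, leaf = path.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical sample document in flattened form (values are made up).
flattened_doc = {"host.name": "web-1", "host.os.name": "linux", "source.ip": "10.0.0.1"}
# The same document in nested/object form.
nested_doc = nest(flattened_doc)
```

Both forms carry the same field paths; the test cases differ only in how the document is shaped when it is indexed.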

In addition, I added verification that all ECS multi-field definitions are covered by the ECS dynamic templates (revealing two that actually were not).

@felixbarny @ruflin please check whether you agree with my comment about not failing when we create a multi-field mapping for fields whose ECS definition does not require one.
For example, we define a multi-field mapping for the *.name pattern. If you look at the ECS definitions, you will find that many *.name fields have multi-field mappings, while many others don't. Trying to be fully accurate would require many more dynamic templates.
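A rough sketch of what a single pattern-based dynamic template of this kind could look like, written as the Python dict that would be serialized to JSON. The template name `ecs_name_fields`, the `text` subfield name, and the sample field paths are assumptions for illustration, not the definitions shipped by this PR. `fnmatchcase` only approximates Elasticsearch's `path_match` wildcard handling.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic template: every field whose full path ends in ".name"
# is mapped as keyword with a match_only_text multi-field.
NAME_TEMPLATE = {
    "ecs_name_fields": {  # hypothetical template name
        "path_match": "*.name",
        "mapping": {
            "type": "keyword",
            "fields": {"text": {"type": "match_only_text"}},
        },
    }
}


def template_matches(template: dict, field_path: str) -> bool:
    """Approximate path_match: '*' matches any run of characters, dots included."""
    return fnmatchcase(field_path, template["path_match"])


# Illustrative field paths only; not an excerpt from the ECS field list.
sample_paths = ["host.name", "user.name", "user.id", "name"]
matched = [
    path
    for path in sample_paths
    if template_matches(NAME_TEMPLATE["ecs_name_fields"], path)
]
```

One broad pattern covers every `*.name` field, including those whose ECS definition is keyword-only; matching ECS exactly would instead need a separate template per field or per group of fields.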

@P1llus I assigned you as a reviewer because I think you were waiting for this in order to start migrating some ECS mappings to rely on the builtin dynamic templates.

@elasticsearchmachine elasticsearchmachine added the v8.10.0 and external-contributor labels Jul 24, 2023
@eyalkoren eyalkoren requested review from felixbarny and P1llus July 27, 2023 15:55
@eyalkoren eyalkoren self-assigned this Jul 27, 2023
@eyalkoren eyalkoren added the :Delivery/Build, :Data Management/Data streams and >test labels Jul 27, 2023
@eyalkoren eyalkoren marked this pull request as ready for review July 27, 2023 15:58
@elasticsearchmachine elasticsearchmachine added the Team:Data Management and Team:Delivery labels Jul 27, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Member

@felixbarny felixbarny left a comment

> @felixbarny @ruflin please check whether you agree with my comment about not failing when we create a multi-field mapping for fields whose ECS definition does not require one.
> For example, we define a multi-field mapping for the *.name pattern. If you look at the ECS definitions, you will find that many *.name fields have multi-field mappings, while many others don't. Trying to be fully accurate would require many more dynamic templates.

It seems we are erring on the side of always creating a match_only_text subfield, even in cases where ECS defines only a keyword field for a *.name field.

I think that's fine. While on the one hand, this creates a bit more indexing overhead than strictly required, it brings a nice consistency via the naming convention *.name so that users know they can always do a full text search on fields ending with *.name.

@ruflin
Contributor

ruflin commented Jul 31, 2023

> I think that's fine. While on the one hand, this creates a bit more indexing overhead than strictly required, it brings a nice consistency via the naming convention *.name so that users know they can always do a full text search on fields ending with *.name.

I'm not a fan of the ECS multi-fields, but I agree we should be consistent. Do we have an understanding of how much storage overhead this adds? I guess it's hard to tell? If a user wanted to disable this, I assume they could override it in @custom and keep only the keyword?

@eyalkoren
Contributor Author

> Do we have an understanding of how much storage overhead this adds? I guess it's hard to tell?

I can't say anything clever about the actual overhead, only that, since the default mapping for strings in Elasticsearch is already a multi-field, I assume it is at least acceptable. We are only extending ECS a bit here 🙂

> If a user wanted to disable this, I assume they could override it in @custom and keep only the keyword?

Exactly! And since we added logs@custom before the ECS dynamic templates, any path or pattern coming from it will take precedence.
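The precedence described here follows from dynamic templates being evaluated in order, with the first match winning. A minimal sketch of that rule, assuming hypothetical template names and a simplified matcher (`fnmatchcase` stands in for Elasticsearch's `path_match` handling):

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical user override, as it might be placed in a @custom component template:
# keep *.name fields as plain keyword, with no text multi-field.
custom_templates = [
    {"name_keyword_only": {"path_match": "*.name", "mapping": {"type": "keyword"}}},
]

# Hypothetical built-in template that adds the match_only_text multi-field.
ecs_templates = [
    {
        "ecs_name_fields": {
            "path_match": "*.name",
            "mapping": {
                "type": "keyword",
                "fields": {"text": {"type": "match_only_text"}},
            },
        }
    },
]


def resolve(path: str, templates: list[dict]) -> Optional[tuple[str, dict]]:
    """Return (template name, mapping) of the first template whose path_match fits."""
    for entry in templates:
        for name, body in entry.items():
            if fnmatchcase(path, body["path_match"]):
                return name, body["mapping"]
    return None


# Custom templates are listed first, so they are consulted before the ECS ones.
effective_templates = custom_templates + ecs_templates
winner = resolve("user.name", effective_templates)
```

With this ordering, `user.name` resolves to the keyword-only override; without the custom entry, the same path would fall through to the ECS template.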

@eyalkoren
Contributor Author

@rjernst @mark-vieira would you be able to review or assign someone to review this PR?

Member

@P1llus P1llus left a comment

I think the current review comments cover anything I would have added. I see there was a discussion about multi-fields for .name, which was indeed annoying for me when I tested this earlier. I created an issue in the ECS repo a long time ago, but nothing really came of it.
I don't think the concept of multi-fields itself adds overhead, but text fields themselves add a lot of overhead for sure, so if we had to choose, I'd rather not have them be multi-fields, but that decision is not up to me :)
Everything else LGTM.

@breskeby breskeby added the auto-merge-without-approval label Sep 11, 2023
@breskeby
Contributor

I've added a daily CI job with notifications to Slack (#es-delivery) and email (logs-plus@elastic.co).

@breskeby breskeby merged commit c43f83d into elastic:main Sep 11, 2023
@eyalkoren eyalkoren deleted the ecs-test branch September 11, 2023 15:26
Labels
  • auto-merge-without-approval - Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)
  • :Data Management/Data streams - Data streams and their lifecycles
  • :Delivery/Build - Build or test infrastructure
  • external-contributor - Pull request authored by a developer outside the Elasticsearch team
  • Team:Data Management - Meta label for data/management team
  • Team:Delivery - Meta label for Delivery team
  • >test - Issues or PRs that are addressing/adding tests
  • v8.11.0
Development

Successfully merging this pull request may close these issues.

[CI] Add ECS-mappings compatibility tests
9 participants