Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

add elasticsearch db.statement sanitization #1598

Merged

Conversation

nemoshlag
Copy link
Member

@nemoshlag nemoshlag commented Jan 24, 2023

Description

Added an optional query sanitizer to the elasticsearch instrumentation.
Usage
ElasticsearchInstrumentor().instrument(sanitize_query=True)

This will affect the DB_STATEMENT value to contain the original query or sanitized one.

Fixes #1545
Following the specification discussion here

Type of change

Please delete options that are not relevant.

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Test A db.statement has been sanitized
  • Test B no side affects occurred on a query without a db.statement attribute

Does This PR Require a Core Repo Change?

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@nemoshlag nemoshlag requested a review from a team January 24, 2023 10:28
Copy link
Contributor

@avzis avzis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a description of the param 'sanitize_query' in the init file (line 54)

@shalevr
Copy link
Member

shalevr commented Jan 24, 2023

Can we use the same sanitization function for all of the db libraries?

@nemoshlag
Copy link
Member Author

I think that this could be an option for all SQL-based DBs, but if we focus on "specific sanitization" for all DB libraries this has some potential to over complicated this issue

CHANGELOG.md Outdated
- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization. ([#1572](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1572))
- Add default query sanitization for elasticsearch db.statement attribute
([#1545](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1545))
- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization.
Copy link
Member

@shalevr shalevr Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge with the main accidentally adds this entry

CHANGELOG.md Outdated
@@ -9,7 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization. ([#1572](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1572))
- Add default query sanitization for elasticsearch db.statement attribute
([#1545](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1545))
Copy link
Member

@shalevr shalevr Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your PR is not #1598 ?

@shalevr
Copy link
Member

shalevr commented Feb 5, 2023

LGTM

@srikanthccv srikanthccv merged commit df32e8c into open-telemetry:main Feb 10, 2023
@nemoshlag nemoshlag deleted the add/elasticsearch-sanitization branch February 13, 2023 12:01
@phillipuniverse
Copy link
Contributor

FYI this does not work with the elasticsearch bulk API, see #1868

@@ -135,11 +138,16 @@ def _instrument(self, **kwargs):
tracer = get_tracer(__name__, __version__, tracer_provider)
request_hook = kwargs.get("request_hook")
response_hook = kwargs.get("response_hook")
sanitize_query = kwargs.get("sanitize_query", False)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default be True? See also open-telemetry/opentelemetry-specification#3104

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement sensitive data sanitization for elasticsearch instrumentation
6 participants