[feature request] Add option to remove metric series that are no longer present in observable measurements #5950

Open
Noahnc opened this issue Nov 4, 2024 · 4 comments · May be fixed by #5997
Labels: bug, metrics, pkg:OpenTelemetry

Comments

@Noahnc

Noahnc commented Nov 4, 2024

Package

OpenTelemetry

Is your feature request related to a problem?

No response

What is the expected behavior?

We use the ObservableGauge provided by the .NET Meter to create dynamic metric series, which are exported by the OTel exporter and sent via Grafana Alloy to Grafana (Mimir). As far as we can tell, there is currently no way to remove a metric series from the exporter once a measurement has been collected for it. When the measurement callback no longer reports a series, the exporter keeps exporting the last known value until the application is restarted. We have use cases with dynamic data where we need to stop exporting a series without restarting the entire application.
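A minimal sketch of the setup described above, using only the BCL `System.Diagnostics.Metrics` types and a `MeterListener` in place of the OpenTelemetry SDK. The meter name `Demo.Queues`, the instrument name `queue.length`, and the `queue` tag are hypothetical. It only shows that the gauge callback itself stops reporting a removed series; it does not reproduce the exporter behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical dynamic data: one series per dictionary entry.
var queueLengths = new Dictionary<string, int> { ["a"] = 3, ["b"] = 7 };

// Yields one measurement per entry that currently exists.
// An entry that was removed simply produces no measurement.
IEnumerable<Measurement<int>> Observe()
{
    foreach (var entry in queueLengths)
    {
        yield return new Measurement<int>(
            entry.Value,
            new KeyValuePair<string, object?>("queue", entry.Key));
    }
}

using var meter = new Meter("Demo.Queues");
meter.CreateObservableGauge<int>("queue.length", () => Observe());

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to the demo meter.
    if (instrument.Meter.Name == "Demo.Queues")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<int>((instrument, value, tags, state) =>
{
    foreach (var tag in tags)
    {
        Console.WriteLine($"{instrument.Name} {tag.Key}={tag.Value} value={value}");
    }
});
listener.Start();

listener.RecordObservableInstruments(); // callback reports queue=a and queue=b
queueLengths.Remove("b");
listener.RecordObservableInstruments(); // callback reports only queue=a
```

The request in this issue is for the exporter to treat the second collection the same way: a series absent from the callback result should no longer be exported.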

Which alternative solutions or features have you considered?

We considered setting the series to a predefined sentinel value such as 0 or -100 and then filtering out all series with that value in Grafana or in an OpenTelemetry processor (for example in Grafana Alloy). This works for some of our use cases, but it adds a lot of configuration overhead.
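A rough sketch of what that sentinel workaround could look like on the application side, assuming -100 as the sentinel. All names (`Demo.Queues`, `queue.length`, `everSeen`, `RemovedSentinel`) are hypothetical, and the downstream filter is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

const int RemovedSentinel = -100; // assumed sentinel, filtered out downstream

var current = new Dictionary<string, int> { ["a"] = 3, ["b"] = 7 };
var everSeen = new HashSet<string>(); // grows for the lifetime of the process

// Reports live values, and the sentinel for every series that has disappeared.
IEnumerable<Measurement<int>> Observe()
{
    everSeen.UnionWith(current.Keys);
    foreach (var name in everSeen)
    {
        var value = current.TryGetValue(name, out var live) ? live : RemovedSentinel;
        yield return new Measurement<int>(
            value,
            new KeyValuePair<string, object?>("queue", name));
    }
}

// Prints what the callback would report right now.
void Dump()
{
    foreach (var m in Observe())
    {
        Console.WriteLine($"queue={m.Tags[0].Value} value={m.Value}");
    }
}

using var meter = new Meter("Demo.Queues");
meter.CreateObservableGauge<int>("queue.length", () => Observe());

Dump();              // a and b are reported with their live values
current.Remove("b");
Dump();              // b is now reported with the sentinel value
```

Besides the filter configuration, the application has to remember every series it has ever reported, which is part of the overhead mentioned above.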

Additional context

No response

Noahnc added the enhancement and needs-triage labels Nov 4, 2024
github-actions bot added the pkg:OpenTelemetry label Nov 4, 2024
@cijothomas
Member

I believe this is the same bug shown here: https://github.com/open-telemetry/opentelemetry-dotnet/pull/5952/files

cijothomas added the bug and metrics labels and removed the enhancement and needs-triage labels Nov 4, 2024
@Noahnc
Author

Noahnc commented Nov 5, 2024

Hello @cijothomas

Thanks for pointing out your pull request.
It seems you are right that the SDK does not follow the specification for asynchronous collection.
The behavior described here is precisely what we need.

@cijothomas
Member

@Noahnc Would you have time/interest in checking and offering a PR with a fix?

@stonkie

stonkie commented Nov 21, 2024

I was facing the same issue and decided to tackle a fix tonight before seeing this discussion.

What led me down this path was that I tried disposing the Meter, and it partially worked. Disposing calls System.Diagnostics.Metrics.Instrument.NotifyForUnpublishedInstrument(), which eventually sets OpenTelemetry.Metrics.Metric.Active = false and stops the associated series.

This has two unwanted side-effects.

  1. The staleness marker (called a NoRecordedValue flag in OTLP) is not set by the instrumentation, so the receiver (in our case a Collector with PrometheusRemoteWrite) uses its own staleness timeout logic. This repeats the stale value for a short time (5 minutes in our situation).
  2. The solution leaks a Metric out of the default 1000 limit as per this comment.
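The dispose path can be observed with the BCL alone. The sketch below uses a `MeterListener` as a stand-in for the SDK and hypothetical names (`Demo.Disposable`, `demo.gauge`). It shows only the listener-side notification, not the internal `Metric.Active` flag or the OTLP `NoRecordedValue` flag.

```csharp
using System;
using System.Diagnostics.Metrics;

// Not wrapped in `using` because the meter is disposed explicitly below.
var meter = new Meter("Demo.Disposable");
meter.CreateObservableGauge<int>("demo.gauge", () => 42);

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Disposable")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<int>((instrument, value, tags, state) =>
    Console.WriteLine($"{instrument.Name} = {value}"));

// Invoked when measurements stop for an instrument, for example
// because its Meter was disposed.
listener.MeasurementsCompleted = (instrument, state) =>
    Console.WriteLine($"{instrument.Name} completed");

listener.Start();

listener.RecordObservableInstruments(); // gauge callback runs and reports a value
meter.Dispose();                        // listener is told the instrument is done
listener.RecordObservableInstruments(); // no further measurements are expected
```

A listener gets one signal per instrument here, so this approach can only stop every series of the disposed Meter at once, not a single series.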

I'm working on a fix that defragments the metrics list after removals and sends NoRecordedValue data points when a metric becomes inactive. See very basic WIP code that still fails some tests.

@cijothomas Is that design acceptable? There is a comment about keeping the removed metric and reusing it if it gets recreated instead, but that seemed risky because it would leak storage for metrics that rotate without reusing the same identities...
