Selective Http Trace and Baggage Propagation #1853

martinjt · 2024-05-31T19:58:17Z

Looking for some feedback on a change to the HttpClient instrumentation package I've been playing with.

Ultimately, I'm trying to get to a point where you can choose which HttpClient's will have trace/baggage propagation and which won't. With the ultimate goal of stopping trace/baggage leakage to domains you don't control.

I originally thought about domain allow/deny lists, but I'm not sure that's the right approach. So then I pivoted to adding a handler instead.

Right now, TracePropagation happens in 2 places.
The DiagnosticHandler added by the runtime
The HttpClient instrumentation package listening to a Diagnostic source that is written to by 1

You can disable 1 by using the following 2 switches:

AppContext.SetSwitch("DOTNET_SYSTEM_NET_HTTP_ENABLEACTIVITYPROPAGATION", false);
AppContext.SetSwitch("System.Net.Http.EnableActivityPropagation", false);

This has the effect of disabling propagation for the entire application. Further, it also stops the generation of the outbound HTTP span.

In an ideal world, you'd be able to do this at a HttpClient level, and selective choose:

No span and no propagation
No span but propagate
Span, but no propagation
Span and Propagate

The scenario where this is useful is where you're using a third party API which you want to track the performance of, but you don't want them to receive baggage and trace details.

What I've been toying with is adding something to the Http Instrumentation package that allows you to "turn off" instrumentation, then add it back it in incrementally.

builder.Services.AddOpenTelemetry()
    .WithTracing(tracingOptions => tracingOptions
        .AddHttpClientInstrumentation(httpOptions => {
            httpOptions.DisableAutoPropagation = true;
        })

Then have 2 handlers such that you can do..

builder.Services.AddHttpClient<WeatherApiClient>(client => client.BaseAddress = new("http://backend"))
    .AddHttpMessageHandler<TracingHandlerWithPropagation>();

builder.Services.AddHttpClient<WeatherApiClient>(client => client.BaseAddress = new("http://backend"))
    .AddHttpMessageHandler<TracingHandlerWithoutPropagation>();

Then add that an extension like

builder.Services.AddHttpClient<WeatherApiClient>(client => client.BaseAddress = new("http://backend"))
    .WithOpenTelemetry(options => {
        options.PropagationEnabled = true;
        options.Tracked = true;
    });

(Names TBC)

The issue I see right now is that the DiagnosticListener is the way we listen to the http calls, and that also does propagation.

So I could make it so that Handler generates the Activity directly, and ignore the diagnostic listener. In effect the diagnostic listener would only be relevant for "Auto instrumented" stuff coming out of the runtime. The alternative is to have the handler only generate diagnostics when it should be traced, and not in other scenarios. Or I use a different diagnostic name.

So is this something worthwhile for people? is the approach viable?

The text was updated successfully, but these errors were encountered:

CodeBlanch · 2024-05-31T20:10:36Z

Couldn't you do this?

    public async Task<string> CallThirdPartyServiceAndReturnResponse(Uri requestUri)
    {
        // Turn off flow of Activity & Baggage for this ExecutionContext
        using var flow = ExecutionContext.SuppressFlow();

        // Make the call with no trace or baggage to inject
        return await this.httpClient.GetStringAsync(requestUri);
    }

The nice thing about that is it works (in theory) for all things not just HttpClient. Same thing but calling some message queue:

    public async Task PostMessageToThirdPartyService(Uri requestUri)
    {
        // Turn off flow of Activity & Baggage for this ExecutionContext
        using var flow = ExecutionContext.SuppressFlow();

        // Post message with no trace or baggage to inject
        await this.myAwesomeMessageService.PostMessageAsync(requestUri);
    }

SuppressFlow can be kind of wonky and doesn't work if everything ends up executing synchronously. That might need like Task.Yield before the await to be 100% solid. But conceptually, this is how you turn off flow of AsyncLocal which I think might be helpful for this case.

martinjt · 2024-05-31T20:58:17Z

From what I can tell from testing this, it does stop activity from being created, however, the runtime code does not honor that and still propagates the trace data.

martinjt · 2024-05-31T21:03:04Z

So ultimately, the issue is that I need to disable the runtime propagation, but if I do that (with the switches), then the Http instrumentation doesn't kick in, because it's based on the diagnostic listener that's generated by the DiagnosticHandler which the switches disable.

CodeBlanch · 2024-05-31T21:50:59Z

Ya it will still create an Activity \ span but it shouldn't be parented to the current trace it will be a new root. So you won't leak the internal trace/baggage to the third party. Wasn't that the goal?

If you wanted to stop the Activity \ span from being created completely, really the OTel way to do that is inside the sampler. The problem is though HttpClient isn't passing any tags into the sampler so there isn't really anyway to drop just some specific destination 😢 IIRC the spec says there are some standard tags which should be supplied at sample time it just isn't part of runtime logic yet.

You could probably do something like this:

// At call site
using var scope = MySampler.SuppressActivityCreation();

await httpClient.DoSomethingAsync();

// Set as OTel sampler
internal sealed class MySampler : Sampler
{
    private static readonly AsyncLocal<bool> SuppressActivity = new AsyncLocal<bool>();
    private readonly Sampler innerSampler;

    public MySampler(Sampler innerSampler)
    {
        this.innerSampler = innerSampler ?? throw new ArgumentNullException(nameof(innerSampler));
    }

    public static IDisposable SuppressActivityCreation()
    {
        var previousValue = SuppressActivity.Value;

        SuppressActivity.Value = true;

        return new MySamplerScope(previousValue);
    }

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        if (SuppressActivity.Value)
        {
            return new SamplingResult(SamplingDecision.Drop);
        }

        return this.innerSampler.ShouldSample(in samplingParameters);
    }

    private sealed class MySamplerScope : IDisposable
    {
        private readonly bool previousValue;

        public MySamplerScope(bool previousValue)
        {
            this.previousValue = previousValue;
        }

        public void Dispose()
        {
            SuppressActivity.Value = this.previousValue;
        }
    }
}

I didn't test any of that just an idea!

lmolkova · 2024-05-31T22:04:27Z

I think there are two fundamental problems:

baggage flows everywhere and you can't control it unless it's enableable (per endpoint/request/etc).
different endpoints have different context propagation formats

Having a propagator configurable per endpoint might solve both.
Also, it's not a unique .NET problem - all languages have to some extent.

We can try to extend Propagator spec so that propagator knows the endpoint/some other context it's propagating to
It probably could be done today with some duct tape: custom baggage propagator and AsyncLocal

class ConfigurablePropagator: Propagator {
   public static readonly AsyncLocal<string> Endpoint = new AsyncLocal<string>(); 
   
   public override void Inject(...) {
       propagators[Endpoint.Value].Inject(...)
   }
}

DistributedContextPropagator Current = DistributedContextPropagator.CreateNoOutputPropagator(); // I hope it works
Sdk.SetDefaultTextMapPropagator(new ConfigurablePropagator(....));

ConfigurablePropagator.Endpoint = "https://external-no-baggage-prop.com";
await httpClient.SendAsync()...
ConditionalPropagator.Endpoint = null;

UPDATE: I assume it should be possible to implement a customizeable DistributedContextPropagator and give .NET a custom propagator.

martinjt · 2024-05-31T22:12:12Z

Ya it will still create an Activity \ span but it shouldn't be parented to the current trace it will be a new root.

@CodeBlanch I'm saying that isn't happening. The propagation happens at the DiagnosticHandler level in the runtime. It's not even touching otel. So it's not creating the Activity, but is still propagating the context.

it's not a unique .NET problem - all languages have to some extent.

JS and go, you have to add the propagation handler from otel to the specific HTTP client. Since dotnet does this in the runtime, there's no way to turn it off. It happens even if you don't have otel installed at all.

martinjt · 2024-06-01T18:53:14Z

This is what we get with the execution flow applied. You'll notice that the second row shows that the backend has used the propagation context from the first.

What we need is the ability to Disable the propagation part, based on a handler of some sort.

What the execution flow IS doing is making the span not inherit the parent, and making propagation be based on the root. Which doesn't give the desired outcome

johncrim · 2024-07-22T16:55:32Z

This sounds super useful to me. Just yesterday I wrote a PrivateBaggageTextMapPropagator for this exact reason, and I would love to have better support for the scenario. My use case is that traceparent can be sent anywhere (it would be better if it wasn't, but I'm not worried about security consequences from an API vendor receiving our traceparent, and in some cases it can be used), but baggage shouldn't be sent to external APIs.

I think the main use case should be opt-in instead of opt-out. I don't think it's reasonable to map all of the possible (transitive) external domains.

I would also be happy if this was something I could explicitly attach to an HttpClient/Handler, or configure with a named client.

My PrivateBaggageTextMapPropagator does a DNS lookup when needed to determine if the service IP is on the private network or on the public internet, and uses that to decide whether to propagate baggage. So my approach is higher overhead than I'd like; the approach you're proposing sounds much better.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Selective Http Trace and Baggage Propagation #1853

Selective Http Trace and Baggage Propagation #1853

martinjt commented May 31, 2024

CodeBlanch commented May 31, 2024

martinjt commented May 31, 2024

martinjt commented May 31, 2024

CodeBlanch commented May 31, 2024

lmolkova commented May 31, 2024 •

edited

Loading

martinjt commented May 31, 2024

martinjt commented Jun 1, 2024

johncrim commented Jul 22, 2024 •

edited

Loading

Selective Http Trace and Baggage Propagation #1853

Selective Http Trace and Baggage Propagation #1853

Comments

martinjt commented May 31, 2024

CodeBlanch commented May 31, 2024

martinjt commented May 31, 2024

martinjt commented May 31, 2024

CodeBlanch commented May 31, 2024

lmolkova commented May 31, 2024 • edited Loading

martinjt commented May 31, 2024

martinjt commented Jun 1, 2024

johncrim commented Jul 22, 2024 • edited Loading

lmolkova commented May 31, 2024 •

edited

Loading

johncrim commented Jul 22, 2024 •

edited

Loading