chore(sampling): change trace sampling formula #12950

genesor · 2025-03-28T15:49:07Z

This PR changes the formulas used to sample traces and spans so that a single, consistent formula is used across languages.

Two formulas were used:

((trace_id * KNUTH_FACTOR) % (2^64 - 1)) <= sampling_rate * (2^64 - 1) in ddtrace/_trace/sampler.py and ddtrace/_trace/sampling_rule.py
((trace_id * KNUTH_FACTOR) % 2^64) <= sampling_rate * 2^64 in ddtrace/internal/sampling.py

Both have been changed to ((trace_id * KNUTH_FACTOR) % 2^64) <= sampling_rate * (2^64 - 1).
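
For illustration, here is a minimal sketch of the unified check next to the two former variants. The function names are hypothetical and the value of KNUTH_FACTOR is an assumption made for this example; this is not the library's actual code.

```python
# Sketch only: names and the constant's value are assumptions for illustration.
KNUTH_FACTOR = 1111111111111111111  # assumed value of the multiplicative hash factor
MAX_UINT_64 = (1 << 64) - 1         # 2^64 - 1
MODULO_64 = 1 << 64                 # 2^64


def keep_trace(trace_id: int, sampling_rate: float) -> bool:
    """Unified rule: hash modulo 2^64, threshold scaled by 2^64 - 1."""
    return ((trace_id * KNUTH_FACTOR) % MODULO_64) <= sampling_rate * MAX_UINT_64


def keep_trace_former_sampler(trace_id: int, sampling_rate: float) -> bool:
    """Former rule described for sampler.py and sampling_rule.py: modulo 2^64 - 1."""
    return ((trace_id * KNUTH_FACTOR) % MAX_UINT_64) <= sampling_rate * MAX_UINT_64


def keep_trace_former_internal(trace_id: int, sampling_rate: float) -> bool:
    """Former rule described for internal/sampling.py: threshold scaled by 2^64."""
    return ((trace_id * KNUTH_FACTOR) % MODULO_64) <= sampling_rate * MODULO_64


if __name__ == "__main__":
    for tid in (1, 10, 12345):
        print(tid, keep_trace(tid, 0.5))
```

The three variants agree for most trace IDs; they can only differ near the threshold or where the product wraps around the modulus, which is why aligning on one formula matters for cross-language consistency.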

The HTTP header extractor also hardcoded a sampling decision: whenever a request arrived with a non-empty trace ID but no sampling priority, the priority defaulted to USER_KEEP, which prevented us from applying our own sampling logic. This PR removes that default.
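
As a rough sketch of the difference, assume a hypothetical helper `extract_sampling_priority` that reads the `x-datadog-sampling-priority` header. The helper, its signature, and the `None` fallback are assumptions for illustration and do not reproduce the extractor in ddtrace/propagation/http.py.

```python
from typing import Mapping, Optional

USER_KEEP = 2  # assumed to mirror the library's USER_KEEP priority constant
SAMPLING_PRIORITY_HEADER = "x-datadog-sampling-priority"


def extract_sampling_priority(
    headers: Mapping[str, str], default: Optional[int] = None
) -> Optional[int]:
    """Hypothetical helper: return the propagated priority, or `default` if absent or invalid."""
    raw = headers.get(SAMPLING_PRIORITY_HEADER)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


incoming = {"x-datadog-trace-id": "123"}  # trace ID present, no priority header

# Former behavior: a missing priority was forced to USER_KEEP,
# so the local sampler never got to decide.
former = extract_sampling_priority(incoming, default=USER_KEEP)

# Sketched new behavior: no priority is assumed, leaving the decision
# to the local sampling rules.
current = extract_sampling_priority(incoming)
```

With no default, a request that carries only a trace ID reaches the sampler undecided, which is what the sampling-rate system tests need in order to observe the configured rates.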

This PR will allow us to enable the sampling rates system tests.

Checklist

PR author has checked that all the criteria below are met
The PR description includes an overview of the change
The PR description articulates the motivation for the change
The change includes tests OR the PR description describes a testing strategy
The PR description notes risks associated with the change, if any
Newly-added code is easy to change
The change follows the library release note guidelines
The change includes or references documentation updates if necessary
Backport labels are set (if applicable)

Reviewer Checklist

Reviewer has checked that all the criteria below are met
Title is accurate
All changes are related to the pull request's stated goal
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Newly-added code is easy to change
Release note makes sense to a user of the library
If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

github-actions · 2025-03-28T15:49:37Z

CODEOWNERS have been resolved as:

ddtrace/_trace/sampler.py                                               @DataDog/apm-sdk-api-python
ddtrace/_trace/sampling_rule.py                                         @DataDog/apm-sdk-api-python
ddtrace/internal/constants.py                                           @DataDog/apm-core-python
ddtrace/internal/sampling.py                                            @DataDog/apm-sdk-api-python
ddtrace/propagation/http.py                                             @DataDog/apm-sdk-api-python
tests/contrib/aiohttp/test_middleware.py                                @DataDog/apm-core-python @DataDog/apm-idm-python
tests/snapshots/tests.contrib.wsgi.test_wsgi.test_distributed_tracing_nested.json  @DataDog/apm-python
tests/tracer/test_propagation.py                                        @DataDog/apm-sdk-api-python
tests/tracer/test_sampler.py                                            @DataDog/apm-sdk-api-python

github-actions · 2025-03-28T16:08:23Z

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 229 ± 2 ms.

The average import time from base is: 231 ± 2 ms.

The import time difference between this PR and base is: -2.39 ± 0.09 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 2.007 ms (0.88%)

ddtrace.bootstrap.sitecustomize 1.343 ms (0.59%)

ddtrace.bootstrap.preload 1.343 ms (0.59%)

ddtrace.internal.products 1.343 ms (0.59%)

ddtrace.internal.remoteconfig.client 0.637 ms (0.28%)

ddtrace 0.664 ms (0.29%)

pr-commenter · 2025-03-28T17:31:11Z

Benchmarks

Benchmark execution time: 2025-04-11 14:38:34

Comparing candidate commit 9ef48b7 in PR branch ben.db/APMAPI-1260-update-sampling-modulo with baseline commit 0de8b0e in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 496 metrics, 2 unstable metrics.


ZStriker19 · 2025-04-11T20:05:54Z

ddtrace/internal/sampling.py

@@ -281,6 +283,8 @@ def _set_sampling_tags(span, sampled, sample_rate, mechanism):
    # Set the sampling priority
    priorities = SAMPLING_MECHANISM_TO_PRIORITIES[mechanism]
    priority_index = _KEEP_PRIORITY_INDEX if sampled else _REJECT_PRIORITY_INDEX
+
+    span.set_metric(_SAMPLING_PRIORITY_KEY, priorities[priority_index])


Why do we need to set the sampling priority on the span directly when we do it on the context below, which eventually ends up being applied to the span? https://github.com/DataDog/dd-trace-py/blob/main/ddtrace/_trace/processor/__init__.py#L217

ZStriker19 · 2025-04-11T20:11:18Z

ddtrace/propagation/http.py

@@ -310,7 +310,10 @@ def _extract(headers):
            headers,
            default="0",
        )
-        sampling_priority = _extract_header_value(POSSIBLE_HTTP_HEADER_SAMPLING_PRIORITIES, headers, default=USER_KEEP)  # type: ignore[arg-type]


Yeah, it seems like this probably shouldn't have been the behavior. Good catch.

ZStriker19

Great job! Just one question!

genesor added the changelog/no-changelog (a changelog entry is not required for this PR) and apm:ecosystems labels Mar 28, 2025

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch 2 times, most recently from 9b1b71b to 4425185 Compare March 28, 2025 16:41

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch from 4425185 to 7d29178 Compare April 1, 2025 14:29

genesor changed the title ~~fix(sampling): change trace and span sampling formula~~ chore(sampling): change trace and span sampling formula Apr 2, 2025

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch from 5e4379f to 6a0d361 Compare April 2, 2025 11:54

genesor changed the base branch from 3.3 to main April 2, 2025 11:56

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch 2 times, most recently from ab957c4 to 70c37c8 Compare April 8, 2025 16:13

genesor mentioned this pull request Apr 9, 2025

fix(sampling): update sample rate env var in parametric test DataDog/system-tests#4499

Merged

7 tasks

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch 2 times, most recently from f55a39e to 1479752 Compare April 10, 2025 15:37

genesor added 4 commits April 11, 2025 10:17

change trace and span sampling formula

250ef2d

chore(sampling): remove default sampling from http extractor

37d2423

remove default USER_KEEP from http extractor

fix(sampling): fix sampling priority in tests

9c04790

avoid overwriting _dd.p.dm tag

64ee32f

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch 2 times, most recently from 6ded685 to 3d56e51 Compare April 11, 2025 08:49

chore(sampling): rename constant imports

9ef48b7

genesor force-pushed the ben.db/APMAPI-1260-update-sampling-modulo branch from 8f146c7 to 9ef48b7 Compare April 11, 2025 13:56

genesor changed the title ~~chore(sampling): change trace and span sampling formula~~ chore(sampling): change trace sampling formula Apr 11, 2025

genesor marked this pull request as ready for review April 11, 2025 13:56

genesor requested review from a team as code owners April 11, 2025 13:56

genesor requested review from rachelyangdog, juanjux, Yun-Kim and quinna-h April 11, 2025 13:56

ZStriker19 reviewed Apr 11, 2025
