Use state machine to parse directives #3243

djc · 2025-03-25T20:54:26Z

Intended to fix #3174. Have not done any benchmarks yet -- suggestions on how best to approach that are welcome.

This is a little bit more code, but IMO maybe slightly more readable?

It's correct enough that it passes the tests. There might be edge cases that aren't covered, but those should be easy to solve.

dpc · 2025-03-26T00:57:25Z

> hyperfine --warmup 30 -N ./target/release/dotr-before ./target/release/dotr-after
Benchmark 1: ./target/release/dotr-before
  Time (mean ± σ):       1.2 ms ±   0.1 ms    [User: 0.4 ms, System: 0.7 ms]
  Range (min … max):     1.1 ms …   1.9 ms    2044 runs

Benchmark 2: ./target/release/dotr-after
  Time (mean ± σ):     613.8 µs ±  79.6 µs    [User: 289.1 µs, System: 254.0 µs]
  Range (min … max):   511.5 µs … 979.8 µs    5264 runs

Summary
  ./target/release/dotr-after ran
    2.03 ± 0.34 times faster than ./target/release/dotr-before

Added `true` just to have a baseline:

Benchmark 1: ./target/release/dotr-before
  Time (mean ± σ):       1.2 ms ±   0.1 ms    [User: 0.4 ms, System: 0.7 ms]
  Range (min … max):     1.1 ms …   1.8 ms    2535 runs

Benchmark 2: ./target/release/dotr-after
  Time (mean ± σ):     611.7 µs ±  80.3 µs    [User: 295.6 µs, System: 244.1 µs]
  Range (min … max):   518.8 µs … 1663.5 µs    3825 runs

Benchmark 3: true
  Time (mean ± σ):     595.8 µs ±  74.1 µs    [User: 311.4 µs, System: 215.4 µs]
  Range (min … max):   512.3 µs … 1035.5 µs    5362 runs

This app sets RUST_LOG internally, so setting it from the outside doesn't make a difference.

AFAICT you fixed it and it's nearly free now.
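The "nearly free" reading follows from subtracting the `true` baseline from each mean. A minimal sketch of that arithmetic, using the means from the second hyperfine run above (the constant and function names are made up for illustration, and 1.2 ms is a rounded figure):

```rust
// Mean wall-clock times in microseconds, copied from the second run above.
const BEFORE_US: f64 = 1200.0; // reported as 1.2 ms, so only roughly 1200 µs
const AFTER_US: f64 = 611.7;
const BASELINE_TRUE_US: f64 = 595.8; // `true` does nothing, so this is process startup cost

/// Time attributable to the program itself, beyond bare process startup.
fn overhead_us(mean_us: f64, baseline_us: f64) -> f64 {
    mean_us - baseline_us
}

/// Overhead of the old build: about 604 µs, in line with the ~600 µs reported in #3174.
fn overhead_before_us() -> f64 {
    overhead_us(BEFORE_US, BASELINE_TRUE_US)
}

/// Overhead of the new build: about 16 µs, well inside the ~80 µs standard deviation.
fn overhead_after_us() -> f64 {
    overhead_us(AFTER_US, BASELINE_TRUE_US)
}
```

Because the remaining difference is smaller than the run-to-run noise, these numbers cannot distinguish the new build from an empty process.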

klensy · 2025-03-27T12:07:20Z

After this, regex can be removed from Cargo.toml (for tracing-subscriber), only left in dev-deps?

djc · 2025-03-27T12:21:16Z

> After this, regex can be removed from Cargo.toml (for tracing-subscriber), only left in dev-deps?

Nice catch, yes! Added that into the first commit.

tracing-subscriber/src/filter/env/directive.rs

hds

Could you please rebase this against master? We merge everything in there first and then David or I will handle backporting to v0.1.x.

Before making this change, I think we need more tests to ensure that the behavior isn't changing. Especially because there are things I'm seeing in the code that are different to what the docs say.

I'm happy to have a look at writing some of those tests, but it may take me a little bit to get to it.

tracing-subscriber/src/filter/env/directive.rs

With a view to replacing the env filter parsing in #3243, this change adds some additional tests to improve our confidence in not breaking existing behavior. Tests for empty and invalid directives are added, as well as tests for directives overriding less specific directives. One of the latter tests (`more_specific_dynamic_filter_less_verbose`) currently fails, which is a known issue reported in #1388. The test is in place with `#[should_panic]` and can be reverted to a normal test when that behavior is fixed.

The documentation on `parse`, `parse_lossy`, `from_env_lossy` and `try_from_env` has also been made more explicit as to the result of an empty directive (the default directive is used). This was already documented on `with_default_directive`.

djc · 2025-04-30T16:27:24Z

tracing-subscriber/src/filter/env/directive.rs

+        for (i, c) in from.trim().char_indices() {
+            state = match (state, c) {
+                (Start, '[') => Span { span_start: i + 1 },
+                (Start, c) if !c.is_alphanumeric() => return Err(ParseError::new()),


I added this arm to pass the new test cases from #3262. This seems like a reasonable way to weed out invalid stuff while allowing non-ASCII target names?
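For readers who have not opened the diff, the overall shape of such a parser can be sketched as follows. This is not the code from this PR: it handles only a simplified `target[span]=level` grammar, and `Parsed`, `State` and `parse_simple` are hypothetical names chosen for illustration. The character check in the target arm is likewise an assumption, not the rule the PR settles on.

```rust
/// Pieces of a simplified directive, borrowed from the input string.
#[derive(Debug, PartialEq)]
struct Parsed<'a> {
    target: Option<&'a str>,
    span: Option<&'a str>,
    level: Option<&'a str>,
}

#[derive(Clone, Copy)]
enum State {
    Start,
    Target,
    Span { span_start: usize },
    AfterSpan,
    Level { level_start: usize },
}

fn parse_simple(input: &str) -> Result<Parsed<'_>, String> {
    use State::*;
    let input = input.trim();
    let mut out = Parsed { target: None, span: None, level: None };
    let mut state = Start;

    // One pass over the input: each (state, char) pair selects the next state.
    for (i, c) in input.char_indices() {
        state = match (state, c) {
            (Start, '[') | (Target, '[') => {
                if i > 0 {
                    out.target = Some(&input[..i]);
                }
                Span { span_start: i + 1 }
            }
            // Assumed target alphabet for this sketch only.
            (Start, c) | (Target, c)
                if c.is_alphanumeric() || matches!(c, '_' | ':' | '-') =>
            {
                Target
            }
            (Target, '=') => {
                out.target = Some(&input[..i]);
                Level { level_start: i + 1 }
            }
            (AfterSpan, '=') => Level { level_start: i + 1 },
            (Span { span_start }, ']') => {
                out.span = Some(&input[span_start..i]);
                AfterSpan
            }
            // Anything else inside brackets is kept verbatim as the span text.
            (Span { span_start }, _) => Span { span_start },
            (Level { level_start }, c) if c.is_alphanumeric() => Level { level_start },
            (_, c) => return Err(format!("unexpected {c:?} at byte {i}")),
        };
    }

    // The final state decides what the trailing text means.
    match state {
        Start => return Err("empty directive".to_string()),
        Target => out.target = Some(input),
        Span { .. } => return Err("unclosed '['".to_string()),
        AfterSpan => {}
        Level { level_start } if level_start < input.len() => {
            out.level = Some(&input[level_start..]);
        }
        Level { .. } => return Err("missing level after '='".to_string()),
    }
    Ok(out)
}
```

With this sketch, `parse_simple("my_crate::module=debug")` yields a target and a level, while an input starting with a comma is rejected by the catch-all arm, which is the kind of case the new tests exercise.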

djc · 2025-04-30T16:28:22Z

Rebased on master
Added back comment
Added an additional arm to pass the new test cases from #3262 (subscriber: increase EnvFilter test coverage)

hds · 2025-05-02T15:23:32Z

tracing-subscriber/src/filter/env/directive.rs

+        for (i, c) in from.trim().char_indices() {
+            state = match (state, c) {
+                (Start, '[') => Span { span_start: i + 1 },
+                (Start, c) if !c.is_alphanumeric() => return Err(ParseError::new()),


I think we need to match the previous regex here, and is_alphanumeric is missing a couple of characters.

In the current implementation, a target accepts [\w:-] so as well as : and -, we need everything specified by \w, which the regex docs specify as: word character (\p{Alphabetic} + \p{M} + \d + \p{Pc} + \p{Join_Control}).

Is there some way we can get those character classes in Rust?

Otherwise, the real thing that made that test invalid is that it started with a comma. I'm happy to modify the test if we want to accept starting with a comma (and ignore empty directives). I think it would be acceptable to be more permissive about what characters we accept for the target (or a span name for example), but we should not be more restrictive.
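On the question of getting those character classes in Rust, here is a rough sketch using only the standard library. It is an approximation under stated assumptions, not a drop-in equivalent of the regex: `char::is_alphabetic` follows the Unicode Alphabetic property, `char::is_numeric` covers Nd, Nl and No (so it is broader than `\d`), and the `\p{Pc}` and `\p{Join_Control}` sets are listed by hand and should be checked against current Unicode data. The standard library has no test for `\p{M}`, so combining marks that are not also Alphabetic would still be rejected; closing that gap would need a Unicode table or an external crate. The function names are hypothetical.

```rust
/// Connector punctuation (`\p{Pc}`), listed by hand; verify against Unicode data.
fn is_connector_punctuation(c: char) -> bool {
    matches!(
        c,
        '_' | '\u{203F}'
            | '\u{2040}'
            | '\u{2054}'
            | '\u{FE33}'
            | '\u{FE34}'
            | '\u{FE4D}'..='\u{FE4F}'
            | '\u{FF3F}'
    )
}

/// `\p{Join_Control}`: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
fn is_join_control(c: char) -> bool {
    matches!(c, '\u{200C}' | '\u{200D}')
}

/// Approximation of the old `[\w:-]` target class.
/// Known gap: `\p{M}` marks that are not Alphabetic return false here.
fn is_target_char_approx(c: char) -> bool {
    c.is_alphabetic()
        || c.is_numeric() // superset of `\d`, so more permissive on digits
        || is_connector_punctuation(c)
        || is_join_control(c)
        || matches!(c, ':' | '-')
}
```

Being broader on digits is in line with the "more permissive is acceptable" position above; the missing `\p{M}` coverage is the part that would make this more restrictive than the regex.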

djc requested review from hawkw and a team as code owners March 25, 2025 20:54

djc mentioned this pull request Mar 25, 2025

tracing-subscriber's Directive parsing adds 600us to startup time #3174

Open

dpc reviewed Mar 26, 2025

djc force-pushed the parse-directive branch from 12445d7 to b298124 Compare March 27, 2025 12:21

joshka reviewed Mar 29, 2025

djc force-pushed the parse-directive branch 2 times, most recently from d533cdf to a4484c3 Compare March 29, 2025 10:38

hds requested changes Apr 11, 2025

hds mentioned this pull request Apr 30, 2025

subscriber: increase EnvFilter test coverage #3262

Merged

djc force-pushed the parse-directive branch from a4484c3 to bab5257 Compare April 30, 2025 16:07

djc requested a review from yaahc as a code owner April 30, 2025 16:07

djc changed the base branch from v0.1.x to master April 30, 2025 16:07

djc requested a review from davidbarsky as a code owner April 30, 2025 16:07

djc force-pushed the parse-directive branch from bab5257 to 2007997 Compare April 30, 2025 16:21

Use state machine to parse directives

06e5695

djc force-pushed the parse-directive branch from 2007997 to 06e5695 Compare April 30, 2025 16:26

djc requested a review from hds April 30, 2025 16:28

hds requested changes May 2, 2025
