Use state machine to parse directives #3243
Added
This app sets AFAICT you fixed it and it's nearly free now.
After this,
Nice catch, yes! Added that into the first commit.
d533cdf to a4484c3
Could you please rebase this against `master`? We merge everything in there first and then David or I will handle backporting to `v0.1.x`.
Before making this change, I think we need more tests to ensure that the behavior isn't changing. Especially because there are things I'm seeing in the code that are different to what the docs say.
I'm happy to have a look at writing some of those tests, but it may take me a little bit to get to it.
With a view to replacing the env filter parsing in #3243, this change adds some additional tests to improve our confidence in not breaking existing behavior. Tests for empty and invalid directives are added, as well as tests for directives overriding less specific directives. One of the latter tests (`more_specific_dynamic_filter_less_verbose`) currently fails, which is a known issue reported in #1388. The test is in place with `#[should_panic]` and can be reverted to a normal test when that behavior is fixed. The documentation on `parse`, `parse_lossy`, `from_env_lossy` and `try_from_env` has also been made more explicit as to the result of an empty directive (the default directive is used). This was already documented on `with_default_directive`.
for (i, c) in from.trim().char_indices() {
    state = match (state, c) {
        (Start, '[') => Span { span_start: i + 1 },
        (Start, c) if !c.is_alphanumeric() => return Err(ParseError::new()),
I added this arm to pass the new test cases from #3262. This seems like a reasonable way to weed out invalid stuff while allowing non-ASCII target names?
I think we need to match the previous regex here, and `is_alphanumeric` is missing a couple of characters.
In the current implementation, a target accepts `[\w:-]`, so as well as `:` and `-`, we need everything specified by `\w`, which the regex docs specify as: word character (`\p{Alphabetic}` + `\p{M}` + `\d` + `\p{Pc}` + `\p{Join_Control}`).
Is there some way we can get those character classes in Rust?
Otherwise, the real thing that made that test invalid is that it started with a comma. I'm happy to modify the test if we want to accept starting with a comma (and ignore empty directives). I think it would be acceptable to be more permissive about what characters we accept for the target (or a span name for example), but we should not be more restrictive.
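To make the gap between `is_alphanumeric` and `[\w:-]` concrete, here is a minimal std-only sketch. The helper name `is_target_char` is hypothetical and not part of the PR. It is only an approximation of `\w`: it covers `_` (the most common `\p{Pc}` character) and the two `\p{Join_Control}` code points, but it does not cover combining marks that are not alphabetic or other connector punctuation, and `char::is_alphanumeric` accepts some numeric characters outside `\d`.

```rust
/// Hypothetical helper (not from the PR): a std-only approximation of the
/// `[\w:-]` class that the old regex accepted for target names.
fn is_target_char(c: char) -> bool {
    c.is_alphanumeric()
        // `_` belongs to \p{Pc} (connector punctuation), which
        // `is_alphanumeric` does not include.
        || c == '_'
        // \p{Join_Control}: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
        || matches!(c, '\u{200C}' | '\u{200D}')
        // The two extra characters allowed in a target besides `\w`.
        || c == ':'
        || c == '-'
}
```

With this predicate, `_` and `:` are accepted while a leading `,` or `[` is still not treated as part of a target. Matching `\w` exactly would need Unicode property tables that the standard library does not expose.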
Intended to fix #3174. Have not done any benchmarks yet -- suggestions on how best to approach that are welcome.
This is a little bit more code, but IMO maybe slightly more readable?
It's correct enough that it passes the tests. There might be edge cases that aren't covered, but those should be easy to solve.
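For readers who want a feel for the approach, the following is a simplified, self-contained sketch of a character-by-character state machine for one directive of the form `target[span]=level`. It is not the PR's code: the `Parts` struct, the `State` variants, and `parse_parts` are all hypothetical, span field filters are ignored, a bare level such as `debug` is not distinguished from a target, and the accepted character set is an assumption.

```rust
/// Hypothetical, simplified pieces of a directive like `a::b[span]=debug`.
#[derive(Debug, Default, PartialEq)]
struct Parts<'a> {
    target: Option<&'a str>,
    span: Option<&'a str>,
    level: Option<&'a str>,
}

/// Parser states; each one remembers the byte offset where its token began.
#[derive(Clone, Copy)]
enum State {
    Start,
    Target { start: usize },
    Span { start: usize },
    AfterSpan,
    Level { start: usize },
}

/// Assumed target alphabet for this sketch, not the exact `[\w:-]` class.
fn is_target_char(c: char) -> bool {
    c.is_alphanumeric() || matches!(c, '_' | ':' | '-')
}

fn parse_parts(input: &str) -> Result<Parts<'_>, String> {
    use State::*;
    let s = input.trim();
    let mut parts = Parts::default();
    let mut state = Start;
    for (i, c) in s.char_indices() {
        state = match (state, c) {
            // `[` and `=` are one byte long, so `i + 1` is a char boundary.
            (Start, '[') => Span { start: i + 1 },
            (Start, c) if is_target_char(c) => Target { start: i },
            (Target { start }, '[') => {
                parts.target = Some(&s[start..i]);
                Span { start: i + 1 }
            }
            (Target { start }, '=') => {
                parts.target = Some(&s[start..i]);
                Level { start: i + 1 }
            }
            (Target { .. }, c) if is_target_char(c) => state,
            (Span { start }, ']') => {
                parts.span = Some(&s[start..i]);
                AfterSpan
            }
            (Span { .. }, _) => state,
            (AfterSpan, '=') => Level { start: i + 1 },
            (Level { .. }, c) if c.is_alphanumeric() => state,
            // Anything else, including a leading comma, is rejected.
            (_, c) => return Err(format!("unexpected {c:?} at byte {i}")),
        };
    }
    // Flush whatever token was still open when the input ended.
    match state {
        // Empty input yields no parts; a caller could apply a default here.
        Start | AfterSpan => {}
        Target { start } => parts.target = Some(&s[start..]),
        Level { start } => parts.level = Some(&s[start..]),
        Span { .. } => return Err("unclosed `[`".to_string()),
    }
    Ok(parts)
}
```

Calling `parse_parts("my_crate::module[span]=debug")` yields the target `my_crate::module`, the span `span`, and the level `debug`, while an input such as `,foo` is rejected at the first character.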