-
-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
perf: Improve enum matching performance #81
Conversation
|
||
impl Validate for NumberEnumValidator { | ||
fn validate<'a>(&self, schema: &'a JSONSchema, instance: &'a Value) -> ErrorIterator<'a> { | ||
if !self.is_valid(schema, instance) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can avoid duplicating this method.
} | ||
} | ||
|
||
fn name(&self) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is also could be without duplication
let mut strings = vec![]; | ||
for item in items { | ||
match item { | ||
Value::Array(_) => types |= PrimitiveType::Array, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is possible to exit this loop somehow early if a different type occured
Some(EnumValidator::compile(schema)) | ||
if let Value::Array(items) = schema { | ||
let mut types = PrimitiveTypesBitMap::new(); | ||
let mut numbers = vec![]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we can avoid creating all these vectors
Codecov Report
@@ Coverage Diff @@
## master #81 +/- ##
==========================================
- Coverage 81.87% 81.65% -0.22%
==========================================
Files 45 46 +1
Lines 3227 3320 +93
==========================================
+ Hits 2642 2711 +69
- Misses 585 609 +24
Continue to review full report at Codecov.
|
I'm not entirely sure about this, the specifications mentions
I don't see any specification of the values in the array should be of the same type and actually the tests provided by JSON-Schema-Test-Suite reflect this: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/master/tests/draft7/enum.json#L41 I would say that in order to improve enum validation performances we might do something like // This is dummy code ... ;)
struct EnumValidator {
array: Vec<Value>,
string: Vec<String>,
....
} for all the types and then during validation we do iterate on the "correct" array only according to the instance to validate ... this would save us some runtime and be compliant with the specificatons |
You are right, I probably didn't reflect my intentions properly. The idea is that we can optimize check only for the case when all enum variants are of the same type and fallback to the current implementation otherwise so that the invariant stated in the spec still holds. I.e. if we have a string input value and we already know that there are no strings in the enum, then we can skip checking every element and immediately return false because all comparison operations will return false anyway because of a different type.
Indeed it should improve the performance in general, I'll explore this option as well! :) |
I mentioned that type of approach because if we're going to end merging #86 then for every instance to validate we don't even need to identify the type :). It might be a bit heavier in terms of bit in memory but I don't think that it would be a very high overhead |
PoC for #80
Currently, for the
enum
bench, it gives 15% slower compilation, but 35-55% faster validation for valid numeric values and up to 50% improvement for validation of invalid values. It also gives some slight improvements forsmall schema
bench, because ofenum: ["hello", "world"]
in it.There are definitely possibilities to improve this PR