-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
New validator instances are built for each instancefile, discarding any internal caching #463
Comments
There's a secondary bug here, which is that the retrieve callable's in-memory caching is not working correctly because it's mixing absolute and relative URIs. I have a potential fix for this pair of issues which I need to think about for a little before I proceed: $ git diff HEAD
diff --git a/src/check_jsonschema/schema_loader/main.py b/src/check_jsonschema/schema_loader/main.py
index 4ce95c9..88ff6a0 100644
--- a/src/check_jsonschema/schema_loader/main.py
+++ b/src/check_jsonschema/schema_loader/main.py
@@ -1,5 +1,6 @@
from __future__ import annotations
+import functools
import pathlib
import typing as t
import urllib.error
@@ -130,11 +131,21 @@ class SchemaLoader(SchemaLoaderBase):
instance_doc: dict[str, t.Any],
format_opts: FormatOptions,
fill_defaults: bool,
+ ) -> jsonschema.protocols.Validator:
+ return self._get_validator(format_opts, fill_defaults)
+
+ @functools.lru_cache
+ def _get_validator(
+ self,
+ format_opts: FormatOptions,
+ fill_defaults: bool,
) -> jsonschema.protocols.Validator:
retrieval_uri = self.get_schema_retrieval_uri()
schema = self.get_schema()
schema_dialect = schema.get("$schema")
+ if schema_dialect is not None and not isinstance(schema_dialect, str):
+ schema_dialect = None
# format checker (which may be None)
format_checker = make_format_checker(format_opts, schema_dialect)
diff --git a/src/check_jsonschema/schema_loader/resolver.py b/src/check_jsonschema/schema_loader/resolver.py
index c63b7bb..5084328 100644
--- a/src/check_jsonschema/schema_loader/resolver.py
+++ b/src/check_jsonschema/schema_loader/resolver.py
@@ -79,8 +79,8 @@ def create_retrieve_callable(
else:
full_uri = uri
- if full_uri in cache._cache:
- return cache[uri]
+ if full_uri in cache:
+ return cache[full_uri]
full_uri_scheme = urllib.parse.urlsplit(full_uri).scheme
if full_uri_scheme in ("http", "https"):
@@ -100,8 +100,8 @@ def create_retrieve_callable(
else:
parsed_object = get_local_file(full_uri)
- cache[uri] = parsed_object
- return cache[uri]
+ cache[full_uri] = parsed_object
+ return cache[full_uri]
return retrieve_reference This makes validator creation cached internally for the main That said, I need to at the very least devise some tests for these to ensure there are no regressions in the future. |
It would be amazing to get this fixed - then I can integrate it efficiently in the CI setup I am planning to :) |
I wasn't able to work on this last week, but I've just put the above into a PR with a test to run against. I should be able to get this all merged and released soon, assuming that there are no surprises when CI runs. |
v0.29.1 has the above fix and is freshly available. I tested it against the usage described in #451 and saw execution times of around 0.8s on average. Please let me know (comment or fresh issue) if the new version is still performing in an unexpectedly bad way! |
I can confirm the execution time very fast now - thanks! |
First reported via #451
For each file, we get a validator from the "schema loader" objects. This interface is correct for the top-level checker. It allows, for example, metaschema checking to proceed with different schemas for different files.
However, this also is being applied to the remote and local class --
SchemaLoader
.As a result, a new validator is built for each file, and any internal caching done under the validator is discarded.
To resolve, one of two approaches should be used:
(schema, settings)
where there might not be any current settings)The text was updated successfully, but these errors were encountered: