Json extract #524
I hoped this feature wouldn't drop me into a rabbit hole 😅 Unfortunately, there are some quirks in the JSON syntax, e.g.:
-- "\x61" is an escaped ASCII character for 'a', but this matches in neither SQLite nor Postgres:
SELECT json_extract('{"\x61": 1}', 'a');
-- you need to use the exact sequence of characters:
SELECT json_extract('{"\x61": 1}', '\x61')
1
-- the other way around also wouldn't work:
SELECT json_extract('{"a": 1}', '\x61');
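The quirk above behaves as if the key's raw source text, rather than its unescaped value, were compared with the path. Here is a minimal Rust sketch of that distinction. The names `unescape_hex` and `key_matches_raw` are hypothetical and are not taken from this PR, and only `\xNN` escapes are handled.

```rust
/// Decodes `\xNN` escapes only. Hypothetical helper for illustration;
/// a real JSON5 unescaper has to handle many more escape forms.
fn unescape_hex(raw: &str) -> String {
    let chars: Vec<char> = raw.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        // Look for a backslash, an 'x', and two following hex digits.
        if chars[i] == '\\' && i + 3 < chars.len() && chars[i + 1] == 'x' {
            let hex: String = chars[i + 2..i + 4].iter().collect();
            if let Ok(byte) = u8::from_str_radix(&hex, 16) {
                out.push(byte as char);
                i += 4;
                continue;
            }
        }
        out.push(chars[i]);
        i += 1;
    }
    out
}

/// Compares the key exactly as written in the JSON text with the key
/// exactly as written in the path, with no unescaping on either side.
/// This is an assumption modelled on the queries above.
fn key_matches_raw(raw_key: &str, path_key: &str) -> bool {
    raw_key == path_key
}
```

Under this model, `key_matches_raw(r"\x61", "a")` is false even though `unescape_hex(r"\x61")` yields `a`, which mirrors the first query above.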
core/json/json_path.rs
#[derive(Parser)]
#[grammar_inline = r#"
array_locator = @{ "[" ~ array_index ~ "]" }
I had to copy a large part of the grammar from core/json/de.rs. Perhaps I could extract the grammar from there into a constant and add additional rules for parsing the path? However, this would leave us with a few unused rules.
The other way to DRY it out would be to extract only the common rules, but I feel that would be brittle and unwieldy if we ever have to patch the JSON grammar.
I merged #504, so the base is now main.
@petersooley I saw your PR with [...]. I like the idea of a smaller, hand-rolled parser, but I also think the JSON grammar is pretty complicated, even in terms of what you can do with the JSON path in SQLite (see some tests attached). Let me know your thoughts.
SQLite seems to be changing some quirky behavior from version to version:
@madejejej The path parsing is much simpler than the JSON parsing, for sure. It's also very limited in SQLite (i.e. no glob/wildcard patterns): mostly array indexes and object property paths, with a few extra cases. What I like about the hand-rolled solution is that it doesn't separate the path parsing from accessing the value in the JSON. That allows returning early, as soon as the JSON value has no match for the path. Whichever way we go, a loop is always required to drill into the JSON and extract the value at the end of the given path. Both solutions run that loop anyway; the hand-rolled solution just runs it during path parsing.
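The idea of drilling into the JSON while the path is still being parsed can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Json` enum and the `extract` function are hypothetical names, the value is assumed to be already deserialized, and only `$`, `.key` and `[index]` segments are handled (no quoted keys, no negative indexes).

```rust
/// Hypothetical, simplified JSON value used only for this sketch.
#[derive(Debug, PartialEq)]
enum Json {
    Number(f64),
    Text(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Parses `path` one segment at a time and descends into `root` right away,
/// so the remaining path text is never inspected once a segment fails to match.
fn extract<'a>(root: &'a Json, path: &str) -> Option<&'a Json> {
    let mut rest = path.strip_prefix('$')?;
    let mut current = root;
    while !rest.is_empty() {
        if let Some(after_dot) = rest.strip_prefix('.') {
            // An object key runs until the next '.' or '[' (or the end).
            let end = after_dot
                .find(|c: char| c == '.' || c == '[')
                .unwrap_or(after_dot.len());
            let key = &after_dot[..end];
            let Json::Object(fields) = current else { return None };
            // Early return: a missing key ends the search immediately.
            current = fields
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v)?;
            rest = &after_dot[end..];
        } else if let Some(after_bracket) = rest.strip_prefix('[') {
            let end = after_bracket.find(']')?;
            let index: usize = after_bracket[..end].parse().ok()?;
            let Json::Array(items) = current else { return None };
            current = items.get(index)?;
            rest = &after_bracket[end + 1..];
        } else {
            return None;
        }
    }
    Some(current)
}
```

With this shape, a call such as `extract(&doc, "$.missing[oops")` returns `None` as soon as `missing` is not found, before the malformed bracket segment is ever looked at, and no intermediate list of path elements is allocated.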
Yeah, this is a decent point: we'd save some deserialization overhead by traversing the JSON object on demand while parsing the path. It's maybe not the most important thing in the world and can be optimized later, but it's also nice to implement things right the first time around. I wouldn't block this PR from going forward with the eagerly-parsed version, so I'll let @madejejej decide.
I agree that it feels better. However we should be able to parse any valid JSON
The
Fine to roll with this now; we can optimize the implementation later with hand-rolled parsing if it really becomes necessary.
Implements the json_extract function.
In the meantime, the JSON path has already been implemented by @petersooley in #555, which is a requirement for json_extract.
However, this PR takes a different approach and parses the JSON path using the JSON grammar, because there are a lot of quirks in how a JSON key can look (see the JSON grammar in the Pest file).
The downside is that it allocates more memory than the current implementation, but it might be easier to maintain in the long run.
I included a lot of tests covering some quirky behavior of json_extract (some of them still need some work). I also noticed that this behavior changed between SQLite versions (I had SQLite 3.43.2 locally, and 3.45 gave different results). Because of this, I'm not sure how much value there is in trying to be fully compatible with SQLite. Perhaps the approach taken by @petersooley solves 99% of use cases?
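For readers unfamiliar with the trade-off described here, an eagerly-parsed path can be sketched in two phases: first turn the whole path into a list of elements, then walk the JSON value with that list. The sketch below is hand-written and does not use the Pest grammar from this PR. The names `PathElement`, `JsonValue`, `parse_path` and `walk` are hypothetical, and only `$`, `.key` and `[index]` segments are supported.

```rust
/// Hypothetical, simplified JSON value used only for this sketch.
#[derive(Debug, PartialEq)]
enum JsonValue {
    Number(f64),
    Text(String),
    Array(Vec<JsonValue>),
    Object(Vec<(String, JsonValue)>),
}

/// One parsed segment of a path such as `$.a[1].b`.
#[derive(Debug, PartialEq)]
enum PathElement {
    Root,
    Key(String),
    ArrayLocator(usize),
}

/// Phase 1: parse the complete path up front. This allocates a Vec and a
/// String per key, which is the extra memory cost of the eager approach.
fn parse_path(path: &str) -> Option<Vec<PathElement>> {
    let mut rest = path.strip_prefix('$')?;
    let mut elements = vec![PathElement::Root];
    while !rest.is_empty() {
        if let Some(after_dot) = rest.strip_prefix('.') {
            let end = after_dot
                .find(|c: char| c == '.' || c == '[')
                .unwrap_or(after_dot.len());
            if end == 0 {
                return None; // empty key, e.g. "$."
            }
            elements.push(PathElement::Key(after_dot[..end].to_string()));
            rest = &after_dot[end..];
        } else if let Some(after_bracket) = rest.strip_prefix('[') {
            let end = after_bracket.find(']')?;
            let index = after_bracket[..end].parse().ok()?;
            elements.push(PathElement::ArrayLocator(index));
            rest = &after_bracket[end + 1..];
        } else {
            return None;
        }
    }
    Some(elements)
}

/// Phase 2: walk the value using the already-parsed elements.
fn walk<'a>(root: &'a JsonValue, elements: &[PathElement]) -> Option<&'a JsonValue> {
    let mut current = root;
    for element in elements {
        current = match (element, current) {
            (PathElement::Root, _) => root,
            (PathElement::Key(key), JsonValue::Object(fields)) => {
                fields.iter().find(|(k, _)| k == key).map(|(_, v)| v)?
            }
            (PathElement::ArrayLocator(i), JsonValue::Array(items)) => items.get(*i)?,
            _ => return None,
        };
    }
    Some(current)
}
```

Keeping the two phases separate means the path parser can be tested and replaced on its own (for example by a grammar-based parser), at the cost of always parsing the full path even when the first key is missing.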