jg-rp
Owner

@jg-rp jg-rp commented Aug 9, 2025

This PR includes breaking changes to the Python API and some subtle changes to the default JSONPath syntax.

We've changed the tokens produced by the JSONPath lexer, changed the internal representation of JSONPath queries and rewritten the parser.

There are also some new features. See the change log for details.

@jg-rp
Owner Author

jg-rp commented Aug 10, 2025

Some JSONPath performance notes, before attempting any new optimizations.

This benchmark runs many small JSONPath queries against small data.

Main branch (89c0e7e)

(python-jsonpath) james@Jamess-Mac-mini python-jsonpath % python scripts/benchmark.py 
repeating 436 queries 100 times, best of 3 rounds
compile and find               1.392
compile and find (values)      1.400
just compile                   0.917
just find                      0.392
just find (values)             0.395

v2 branch (e41ec29)

(python-jsonpath) james@Jamess-Mac-mini python-jsonpath % python scripts/benchmark.py
repeating 436 queries 100 times, best of 3 rounds
compile and find               1.461
compile and find (values)      1.471
just compile                   0.949
just find                      0.413
just find (values)             0.418
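
To put the two runs side by side, here is a small sketch that computes the relative change per measurement. It uses only the figures printed above; the dictionary and function names are illustrative, and a single run on one machine is only a rough indication.

```python
# Timings copied from the two benchmark runs above.
main = {
    "compile and find": 1.392,
    "compile and find (values)": 1.400,
    "just compile": 0.917,
    "just find": 0.392,
    "just find (values)": 0.395,
}
v2 = {
    "compile and find": 1.461,
    "compile and find (values)": 1.471,
    "just compile": 0.949,
    "just find": 0.413,
    "just find (values)": 0.418,
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Rounded to one decimal place, keyed by benchmark name.
changes = {name: round(percent_change(main[name], v2[name]), 1) for name in main}

for name, change in changes.items():
    print(f"{name:<30} {change:+.1f}%")
```

On these numbers, v2 is roughly 3.5% to 5.8% slower than main across the five measurements.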

@rob-ross
Contributor

I am testing my Lexer against your test_lex.py code. It's still a work in progress. But I have converted your test data into a json file. You can get it here.

The only changes I made are:

  1. I changed fake root to pseudo root
  2. I wrapped each test case in a dict/object with a single member "Token". I think this makes the JSON file a little clearer, although it introduces a slight wrinkle in your deserialization.

I'll probably be converting more of your tests like this as I proceed. It would make a little more work on your end to use them, as you'd have to write a load() method to deserialize them. But it would help us both in the long run, as we could each capture new bugs in the same file without having to modify any Python code. And it would help me as you add new features, since I could do test-driven development against updated versions of the file.
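
As a rough idea of what such a load() could look like, here is a minimal sketch. The file layout (`tests`, `description`, `path`, `want`) and the token fields (`kind`, `value`, `index`) are assumptions made up for illustration; only the `"Token"` wrapper comes from the description above, and the real file may be shaped differently.

```python
import json
from dataclasses import dataclass


# Hypothetical token shape. The real fields in test_lex.py may differ.
@dataclass(frozen=True)
class Token:
    kind: str
    value: str
    index: int


# Hypothetical test case shape.
@dataclass(frozen=True)
class Case:
    description: str
    path: str
    want: list[Token]


def load(text: str) -> list[Case]:
    """Deserialize test cases, unwrapping each {"Token": {...}} object."""
    data = json.loads(text)
    return [
        Case(
            description=case["description"],
            path=case["path"],
            want=[Token(**wrapped["Token"]) for wrapped in case["want"]],
        )
        for case in data["tests"]
    ]


# Assumed file content, for illustration only.
SAMPLE = """
{
  "tests": [
    {
      "description": "root and shorthand name",
      "path": "$.a",
      "want": [
        {"Token": {"kind": "ROOT", "value": "$", "index": 0}},
        {"Token": {"kind": "NAME", "value": "a", "index": 2}}
      ]
    }
  ]
}
"""

cases = load(SAMPLE)
print(cases[0].path, [token.kind for token in cases[0].want])
```

The resulting list of cases could then be fed to something like `pytest.mark.parametrize`, so that new cases only ever touch the JSON file.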

I hope this is useful!

- Rob

@jg-rp
Owner Author

jg-rp commented Aug 13, 2025

> I have converted your test data into a json file.

Looks good 👍 I do like "golden files", especially when they apply to multiple projects.

Note that this pull request, on the v2 branch, changes the tokens produced by the lexer quite a bit. Don't feel obliged to follow v2 instead of main, but it does fix some of the inconsistencies you pointed out in our previous discussions. And, with these changes, we will be able to configure JSONPath to strictly follow RFC 9535 without exception.

@rob-ross
Contributor

Well, it didn't take me long to sour on that idea of wrapping the tokens in a Map. It literally doubles the amount of code I have to write in Java to deserialize it. lol. It also adds extra characters, and thus file size, to the JSON file. So I'm redoing it as a simpler JSON format, which will also make it easier to load in Python. I can migrate test_lex.py to use the JSON file. I'll probably work on it tomorrow (my time).
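
For a sense of the overhead, here is a rough sketch comparing the serialized size of a wrapped token list with a flat one. The token fields are made up for illustration and are not the real test data.

```python
import json

# Hypothetical token fields, for illustration only.
tokens = [
    {"kind": "ROOT", "value": "$", "index": 0},
    {"kind": "NAME", "value": "a", "index": 2},
]

wrapped = [{"Token": token} for token in tokens]  # one extra object per token
flat = tokens  # tokens stored directly in the list

# Compact separators so only the structural overhead is measured.
wrapped_size = len(json.dumps(wrapped, separators=(",", ":")))
flat_size = len(json.dumps(flat, separators=(",", ":")))
overhead_per_token = (wrapped_size - flat_size) // len(tokens)

print(f"wrapped: {wrapped_size} chars, flat: {flat_size} chars")
print(f"overhead per token: {overhead_per_token} chars")
```

With compact output, the wrapper costs the characters of `{"Token":` plus the closing `}` for every token, and more once indentation is added.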

@jg-rp jg-rp marked this pull request as ready for review August 23, 2025 07:47
@jg-rp jg-rp merged commit 207f202 into main Aug 25, 2025
26 checks passed
@jg-rp jg-rp deleted the v2 branch August 25, 2025 07:10