Change python invocation #1908
* Use -B to prevent Python from writing .pyc files; writing them is wasted effort because the container is ephemeral.
* Set check-hash-based-pycs to never. This prevents Python from reading the entire source file and calculating its hash to validate .pyc hits, and instead forces it into the timestamp validation step.
* Use -OO to force runtime optimizations such as ignoring asserts, debug flags and docstrings.
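A minimal sketch of how the effect of these flags could be checked from Python, assuming a CPython 3.7+ interpreter. The helper name `probe_flags` and the probe script are hypothetical and not part of this PR. Note that `--check-hash-based-pycs` has no corresponding field in `sys.flags`, so the sketch only confirms that the interpreter accepts it.

```python
import json
import subprocess
import sys

# Small script run in a child interpreter; it reports the flags it sees.
PROBE = """
import json, sys

def documented():
    "docstring that -OO is expected to strip"

print(json.dumps({
    "dont_write_bytecode": sys.flags.dont_write_bytecode,
    "optimize": sys.flags.optimize,
    "debug": __debug__,
    "docstring": documented.__doc__,
}))
"""


def probe_flags(*interpreter_args):
    """Start a child interpreter with the given options and return its flags."""
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", PROBE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling `probe_flags("-B", "--check-hash-based-pycs", "never", "-OO")` should report a truthy `dont_write_bytecode`, an `optimize` level of 2, `debug` set to false, and no docstring on `documented`.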
I'm sure these optimizations improve things, but how do we know that and how are we tracking regressions? Can you add some benchmarks to the test suite that demonstrate these are improvements?
In this case this is part of the "Compile All" tests from the Python Interpreter Speed Tests spreadsheet. It should be noted that this PR is split out of that work, with the follow-up PR focused on generating compiled Python bytecode using […].

Sadly I don't have any great ideas around preventing regressions on this front, since general noise can be louder than the signal, and the study above repeats each measurement 100 times to establish the signal. We could theoretically run the same test in our own test suites, but it would take on the order of hours, and running it on people's dev machines might introduce even more noise, making it very non-deterministic. Let me know if you have any ideas on how to prevent regressions here.
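As a rough illustration of the repeated-measurement idea above, here is a sketch that times bare interpreter startup many times and summarizes the spread. It is not the methodology of the spreadsheet: the function names `time_startup` and `compare` are hypothetical, and timing `-c pass` measures only interpreter startup, not a real model load.

```python
import statistics
import subprocess
import sys
import time


def time_startup(interpreter_args, runs=100):
    """Time `runs` launches of a child interpreter with the given options."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, *interpreter_args, "-c", "pass"], check=True)
        samples.append(time.perf_counter() - start)
    return samples


def compare(baseline_args, candidate_args, runs=100):
    """Return mean and standard deviation (seconds) for both configurations."""
    summary = {}
    for label, args in (("baseline", baseline_args), ("candidate", candidate_args)):
        samples = time_startup(args, runs)
        summary[label] = {
            "mean": statistics.mean(samples),
            "stdev": statistics.stdev(samples),  # needs runs >= 2
            "runs": len(samples),
        }
    return summary
```

For example, `compare([], ["-B", "--check-hash-based-pycs", "never"])` would launch 200 interpreters in total. If the difference in means is small relative to the standard deviations, the noise is louder than the signal, which is the concern raised above.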
This won't help in prod, as we override the entrypoint; the same change needs to be repeated elsewhere.
I'll make sure the change is repeated in k8s. I do think they should be kept as similar as possible though, to minimise drift between prod and dev.
* PYTHONDONTWRITEBYTECODE is the same as specifying -B
* PYTHONOPTIMIZE is the same as specifying -OO
Signed-off-by: Will Sackfield <sackfield@replicate.com>
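The equivalence described in this commit message can be checked with a short sketch like the one below. The helper names are hypothetical, and the values are an assumption about how the variables would be set: `PYTHONOPTIMIZE` matches `-OO` only when it is set to `2` (any other non-empty, non-integer value behaves like a single `-O`), and `PYTHONDONTWRITEBYTECODE` takes effect for any non-empty value.

```python
import json
import os
import subprocess
import sys

# Child script that reports the two flags of interest as JSON.
PROBE = (
    "import json, sys; "
    "print(json.dumps([sys.flags.dont_write_bytecode, sys.flags.optimize]))"
)
CONTROLLED_VARS = ("PYTHONDONTWRITEBYTECODE", "PYTHONOPTIMIZE")


def run_probe(interpreter_args=(), extra_env=None):
    """Run the probe in a child interpreter with a controlled environment."""
    # Drop any inherited values so the comparison is not skewed by the parent.
    env = {k: v for k, v in os.environ.items() if k not in CONTROLLED_VARS}
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", PROBE],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return json.loads(result.stdout)


def flags_match():
    """Compare command-line options against the environment variables."""
    from_args = run_probe(interpreter_args=("-B", "-OO"))
    from_env = run_probe(
        extra_env={"PYTHONDONTWRITEBYTECODE": "1", "PYTHONOPTIMIZE": "2"}
    )
    return from_args == from_env
```

Using environment variables rather than command-line options means the settings still apply when the entrypoint is overridden, as long as the environment is preserved.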
* Mirroring the server side
Signed-off-by: Will Sackfield <sackfield@replicate.com>
Removed the optimize flags in line with the cluster discussion.
I don't see anything that stands out as wrong with this but I'm going to tag in @mattt for a second set of eyes.
This approach seems well-reasoned. I also don't have a good answer to @nickstenning's question about how to track regressions other than to try rolling this out and looking for aggregate error rates and performance metrics.
Approving so that we can test this out internally before cutting a release. Following the same general approach described in #1858 (review).