Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Load Pyodide runtime from external capnproto file #2430

Merged
merged 1 commit into from
Jul 26, 2024

Conversation

hoodmane
Copy link
Contributor

These changes are needed to load the Pyodide runtime from an external capnproto file. In workerd, we'll keep loading Pyodide in the same way, but we moved this to WorkerdAPI. In our internal codebase, we now download the bundle and call setPyodideBundleData when we load the first Python worker. Then when setting up the module registry, we add the bundle from pyodideBundleGlobal.

TODO in followups:

  1. Make workerd load the bundle in the same way
  2. Add support for multiple versions of the Pyodide bundle

@hoodmane hoodmane requested review from a team as code owners July 24, 2024 09:26
@hoodmane hoodmane force-pushed the hoodmane/pyodide-load-from-file-2 branch from 9a69cc8 to 0f5e39a Compare July 24, 2024 10:19
@hoodmane hoodmane requested review from dom96 and garrettgu10 July 24, 2024 10:19
@hoodmane hoodmane force-pushed the hoodmane/pyodide-load-from-file-2 branch from 0f5e39a to 50a1faf Compare July 24, 2024 11:34
hoodmane added a commit that referenced this pull request Jul 24, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
Comment on lines 10 to 18
kj::Maybe<kj::Array<unsigned char>> pyodideBundleDataGlobal = kj::none;
kj::Maybe<kj::Own<capnp::FlatArrayMessageReader>> pyodideBundleReaderGlobal = kj::none;
kj::Maybe<jsg::Bundle::Reader> pyodideBundleGlobal = kj::none;

void setPyodideBundleData(kj::Array<unsigned char> data) {
pyodideBundleReaderGlobal = kj::heap<capnp::FlatArrayMessageReader>(kj::arrayPtr(
reinterpret_cast<const capnp::word*>(data.begin()), data.size() / sizeof(capnp::word)));
pyodideBundleGlobal = KJ_REQUIRE_NONNULL(pyodideBundleReaderGlobal)->getRoot<jsg::Bundle>();
pyodideBundleDataGlobal = kj::mv(data);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel too strongly about this, but I think it might be worth to reduce the number of globals here. Creating a reader should be fast, so maybe you could instead only keep the pyodideBundleDataGlobal global and create the jsg::Bundle::Reader on-demand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the ownership semantics here, I'm concerned that the jsg::Bundle::Reader doesn't own its data and instead the data is owned by one of pyodideBundleReaderGlobal or pyodideBundleDataGlobal, so if I let either of those go out of scope while the ModuleRegistry object is still around, then when a file is imported, we'll read from freed data.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at other examples of capnp::FlatArrayMessageReader it seems it is important to keep it all alive. But it looks like you can reduce the number of globals at least a little bit by using attach: https://github.com/cloudflare/workerd/blob/main/src/workerd/server/workerd.c%2B%2B#L991

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I am planning to eventually move these into a TreeMap or something, but it would be nice not to worry about manually keeping the other stuff alive.

@@ -446,7 +446,9 @@ void WorkerdApi::compileModules(
if (hasPythonModules(confModules)) {
KJ_REQUIRE(featureFlags.getPythonWorkers(),
"The python_workers compatibility flag is required to use Python.");
// Inject pyodide bootstrap module.
// Inject Pyodide bundle
modules->addBuiltinBundle(PYODIDE_BUNDLE, kj::none);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PYODIDE_BUNDLE will get replaced with pyodideBundleGlobal here, right? Maybe add a todo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a todo, I guess @garrettgu10 is working on it =)

@hoodmane hoodmane force-pushed the hoodmane/pyodide-load-from-file-2 branch from 50a1faf to 82f3fd5 Compare July 26, 2024 09:21
These changes are needed to load the Pyodide runtime from an external capnproto
file. In workerd, we'll keep loading Pyodide in the same way, but we moved this
to WorkerdAPI. In our internal codebase, we now download the bundle and call
`setPyodideBundleData` when we load the first Python worker. Then when setting
up the module registry, we add the bundle from `pyodideBundleGlobal`.

TODO in followups:
1. Make workerd load the bundle in the same way
2. Add support for multiple versions of the bundle
@hoodmane hoodmane force-pushed the hoodmane/pyodide-load-from-file-2 branch from 82f3fd5 to 8fcd672 Compare July 26, 2024 19:07
@hoodmane hoodmane merged commit 1bf1c83 into main Jul 26, 2024
9 checks passed
@hoodmane hoodmane deleted the hoodmane/pyodide-load-from-file-2 branch July 26, 2024 19:49
hoodmane added a commit that referenced this pull request Jul 29, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Aug 5, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Aug 20, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Aug 20, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Aug 20, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Sep 4, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Sep 5, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Sep 27, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 8, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 8, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 8, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 23, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 23, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 23, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
hoodmane added a commit that referenced this pull request Oct 23, 2024
This adds a github action that makes a binary file out of the pyodide capnproto
bundle and uploads it to GCS.

TODO:
1. Use the uploaded file (see #2430)
2. Move the pyodide js code and this script to pyodide-build-scripts repo.
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants