Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Python bindings for Frame #343

Merged
merged 14 commits into from
Nov 10, 2022
Merged

Conversation

tmadlener
Copy link
Collaborator

@tmadlener tmadlener commented Oct 27, 2022

BEGINRELEASENOTES

  • Add python bindings for Frame based I/O
    • Available from podio.root_io and podio.sio_io, where a Reader and a Writer is implemented for each.
    • Wrapper around podio::Frame. Requires that the podio/Frame.h header is available somewhere on the ROOT_INCLUDE_PATH.
  • Add necessary functionality for python bindings to C++ API
    • untyped Frame::get method for getting collections
    • New constructor from FrameDataT&&
    • functionality to inspect file and Frame contents more easily
  • Reorganize python code into structure that follows the usual python packaging conventions a bit more closely
    • Introduce the podio module. Make CMake generate the __init__.py with the correct version
    • Move everything except the generator script into module. Additionally also keep an EventStore wrapper to not break existing code.
  • Refactor the CMakeLists.txt that is responsible for building the core and all required I/O libraries
    • Build more dictionaries for more python bindings.

ENDRELEASENOTES

  • Actual python bindings for the Frame
  • more tests (especially for the Frame bindings)

Details

These are a few implementation and technical details

python Reader implementation / interface

Due to the essentially equivalent interfaces of the c++ readers, the readers are implemented via the BaseReaderMixin that essentially defines the complete interface. The backend specific readers simply have to instantiate the approriate c++ reader via PyROOT. The tests follow the same strategy via the ReaderTestCaseMixin base class.

The python version of the readers offers a slightly more pythonic access pattern to some of the functionality by some thin wrapper code. The following helper classes facilitate this

  • FrameCategoryIterator: the iterator class that grants access to all Frames of a given category, returned by Reader.get

python Frame implementation

I could not get roots dictionary generation to swallow the c++ implementation of the podio::Frame (segfaults, have to investigate and report upstream). Instead what is currently done is to JIT it via ROOT.gInterpreter.LoadFile('podio/Frame.h') and then wrap the resulting c++ class into a slightly more accessible python interface. In order to make the Frame constructible from the cppyy wrapped std::unique_ptr<FrameData> that are returned by the readers, an additional Frame constructor Frame(FrameDataT&&) is necessary, because cppyy seems to "swallow" the pointer here somewhere.

CMakeLists.txt refactoring

To enable python bindings more easily, all built shared libraries now have an accompanying dictionary, exposing the necessary c++ classes. To avoid even more code repetition I introduced a PODIO_ADD_LIB_AND_DICT function that handles most of the boilerplate.

@tmadlener tmadlener force-pushed the frame-py-bindings branch 3 times, most recently from 92834ea to d748885 Compare November 1, 2022 15:56
@tmadlener tmadlener linked an issue Nov 1, 2022 that may be closed by this pull request
@tmadlener tmadlener changed the title [WIP] Python bindings for Frame Python bindings for Frame Nov 1, 2022
Preparatory work for things to come
- Wrappers around readers for ROOT and SIO backends
- Wrappers for Frame
- Iterator wrapper for more pythonic feel
- Basic unittests covering the most important functionality

NOTE: The current implementation requires that the <prefix>/include
directory is placed on the ROOT_INCLUDE_PATH
- Wrappers around readers for ROOT and SIO backends
- Wrappers for Frame
- Iterator wrapper for more pythonic feel
- Basic unittests covering the most important functionality

NOTE: The current implementation requires that the <prefix>/include
directory is placed on the ROOT_INCLUDE_PATH
ROOT_INCLUDE_PATH from the environment is necessary for picking up
nlohmann json in the dictionary
Easier to work with in the python world, especially formatting
@tmadlener
Copy link
Collaborator Author

@wdconinc @gaede @andresailer @vvolkl @hegner any strong feelings on the python interface here? Anything obvious still missing?

@wdconinc
Copy link
Contributor

wdconinc commented Nov 5, 2022

Does it make sense to include (for documentation purposes?) an example or test case with uproot, awkward, or numpy?

@tmadlener
Copy link
Collaborator Author

Does it make sense to include (for documentation purposes?) an example or test case with uproot, awkward, or numpy?

In principle I think it would. I am a bit hesitant to put it into documentation at the moment, because I am afraid, there will be a few changes still to the format of the root files, before we have a v1.0. E.g., one of the things that I would still like to tackle before that is, #169.

The one main issue I see with this, is that documenting this "officially" will tie podio very tightly into an expected file format for root. So I really want to be sure that things are stable, before we essentially make that a sort of interface as well.

@wdconinc
Copy link
Contributor

wdconinc commented Nov 7, 2022

I agree with having a stable interface before 'advertising' this. Issues #169 and #85 linked therein are indeed both very relevant for podio usability in python.

@tmadlener tmadlener merged commit e8ae38b into AIDASoft:master Nov 10, 2022
@tmadlener tmadlener deleted the frame-py-bindings branch November 10, 2022 16:34
tmadlener added a commit to tmadlener/podio that referenced this pull request Dec 22, 2022
tmadlener added a commit to tmadlener/podio that referenced this pull request Dec 23, 2022
tmadlener added a commit to tmadlener/podio that referenced this pull request Dec 23, 2022
tmadlener added a commit that referenced this pull request Jan 16, 2023
* Use defaultdict instead of essentially hand-rolling one

* Fix cmake configure dependencies again after #343

* Move SIO utilities to existing private utils header

* Use sio_utils also for legacy writer
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

python bindings do not yet support Frame
2 participants