Drastically Improve Speed of Import #435

Merged: 3 commits, Apr 22, 2025

@LarsKue LarsKue commented Apr 22, 2025

Importing BayesFlow has been very slow. Here is an example:

%%time
import bayesflow as bf
>>> CPU times: user 13 s, sys: 2.62 s, total: 15.6 s
>>> Wall time: 13 s

Some of this time is taken up by importing keras, which differs by the backend used:

%%time
import keras  # using tensorflow backend
>>> CPU times: user 3.88 s, sys: 294 ms, total: 4.17 s
>>> Wall time: 1.55 s
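
Timings like the ones above are only meaningful for a cold import, since a module that is already in `sys.modules` imports almost instantly. Below is a minimal sketch of timing an import in a fresh interpreter. The helper `time_import` is a hypothetical name and not part of BayesFlow, and the stdlib module `json` stands in for the package under test.

```python
import subprocess
import sys
import time


def time_import(module_name: str) -> float:
    """Return wall-clock seconds to import `module_name` in a fresh interpreter.

    The result includes interpreter start-up, so compare it against a
    baseline such as time_import("sys") instead of reading it in isolation.
    """
    start = time.perf_counter()
    subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        check=True,  # raise if the import itself fails
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    baseline = time_import("sys")
    elapsed = time_import("json")
    print(f"json: {elapsed:.3f} s (baseline {baseline:.3f} s)")
```

For a per-module breakdown, CPython also offers `python -X importtime -c "import bayesflow"`, which prints the cumulative import time of every module loaded.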

Using a simple line profiler, I traced the issue to inspect.stack(), inspect.getmodule(), and inspect.ismodule(). On import, these are primarily used in two places:

  1. in _add_imports_to_all()
  2. in @serializable
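
To illustrate why these calls are costly: `inspect.stack()` builds a record with source context for every frame on the call stack, whereas `sys._getframe(1)` returns only the caller's frame object. The following is a sketch of the general technique, not the code from this PR; all function names in it are hypothetical. Note that `sys._getframe` is a CPython implementation detail and is not guaranteed to exist in every Python implementation.

```python
import inspect
import sys
import timeit


def caller_module_slow() -> str:
    # inspect.stack() walks the whole stack and looks up source for each frame
    frame_info = inspect.stack()[1]
    module = inspect.getmodule(frame_info.frame)
    if module is None:
        # getmodule() can fail, e.g. for code run via exec(); fall back to globals
        return frame_info.frame.f_globals["__name__"]
    return module.__name__


def caller_module_fast() -> str:
    # sys._getframe(1) returns just the caller's frame; the module name
    # can be read directly from that frame's globals
    return sys._getframe(1).f_globals["__name__"]


def call_slow() -> str:
    return caller_module_slow()


def call_fast() -> str:
    return caller_module_fast()


if __name__ == "__main__":
    assert call_slow() == call_fast()
    print("inspect.stack():", timeit.timeit(call_slow, number=200))
    print("sys._getframe():", timeit.timeit(call_fast, number=200))
```

The gap grows with stack depth, which is why the cost shows up during deeply nested package imports.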

I refactored these utilities to avoid the expensive calls to inspect, so the import time is now dominated by the import of the respective deep learning framework:

%%time
import bayesflow as bf
>>> CPU times: user 4.29 s, sys: 819 ms, total: 5.11 s
>>> Wall time: 2.5 s

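So, for our library's own share of the import time (excluding keras), this is a speed-up by a factor of approximately $(13\,\mathrm{s} - 1.55\,\mathrm{s}) / (2.5\,\mathrm{s} - 1.55\,\mathrm{s}) \approx 12$, i.e. roughly 1200%.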
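
The arithmetic behind that figure, spelled out with the wall times reported above. This assumes the 1.55 s keras import applies equally to both measurements, which the numbers above suggest but do not guarantee.

```python
total_before = 13.0  # wall time of `import bayesflow` before the change (s)
total_after = 2.5    # wall time of `import bayesflow` after the change (s)
keras_import = 1.55  # wall time of `import keras` alone (s)

# Time attributable to BayesFlow itself, excluding the keras import
own_before = total_before - keras_import
own_after = total_after - keras_import

speedup = own_before / own_after
overall = total_before / total_after

print(f"library-only speed-up: {speedup:.1f}x")  # 12.1x
print(f"end-to-end speed-up:   {overall:.1f}x")  # 5.2x
```

The end-to-end figure is smaller because the keras import is a fixed cost that this change does not touch.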

Since this is a sensitive change to the library, I would like to ask that you take extra care when reviewing this. @vpratz I requested your review primarily for the _add_imports_to_all function. Could you add a test that checks if it is working correctly? @stefanradev93 I would like you to sign off on merging this as well.

@LarsKue LarsKue added the efficiency label (Some code needs to be optimized) Apr 22, 2025
@LarsKue LarsKue self-assigned this Apr 22, 2025

@Copilot Copilot AI left a comment

Pull Request Overview

The PR aims to drastically improve the import speed of BayesFlow by refactoring expensive utility calls and updating tests.

  • Replaces expensive inspect calls with sys._getframe in the serialization utilities.
  • Updates tests to use keras.ops for tensor conversion and a custom assert_allclose implementation.
  • Refactors the `__all__` population function to streamline module inclusion.
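
As a rough illustration of the last point, a helper can fill `__all__` from the caller's namespace using `sys._getframe` and `types.ModuleType`. This is a sketch under assumptions, not the implementation in `_populate_all.py`: the name `populate_all`, its `include_modules` parameter, and the filtering rules are all hypothetical.

```python
import sys
import types


def populate_all(include_modules: bool = False) -> list[str]:
    """Hypothetical helper: set `__all__` in the calling module.

    Collects every public name in the caller's globals, optionally
    skipping names that are bound to imported modules.
    """
    # Globals of the module that called this function
    caller_globals = sys._getframe(1).f_globals

    names = []
    for name, value in caller_globals.items():
        if name.startswith("_"):
            continue  # skip private and dunder names
        if isinstance(value, types.ModuleType) and not include_modules:
            continue  # skip imported (sub)modules unless requested
        names.append(name)

    # Assign only after the loop so the dict is not mutated during iteration
    caller_globals["__all__"] = names
    return names
```

Such a helper would be called once at the end of a package's `__init__.py`, after all imports have run.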

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_utils/test_dispatch.py | Updated tests to use keras.ops for tensor conversion and assert_allclose. |
| bayesflow/utils/serialization.py | Replaced inspect calls with sys._getframe and updated dict merging in deserialization. |
| bayesflow/utils/_docs/_populate_all.py | Refactored `__all__` population using sys._getframe and types.ModuleType. |
Comments suppressed due to low confidence (1)

bayesflow/utils/serialization.py:104

  • The 'builtins' module is referenced without an explicit import. Please add 'import builtins' at the top of the file to ensure its correct resolution.
module_objects=np.__dict__ | builtins.__dict__
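
A small sketch of the pattern this comment refers to, using the stdlib `math` module in place of `np` so that it runs without NumPy. The name `builtins` is not defined in a module unless it is imported explicitly; the `__builtins__` global that CPython injects is an implementation detail and should not be relied on.

```python
import builtins  # must be imported explicitly before builtins.__dict__ is used
import math

# Dict union (Python 3.9+): on duplicate keys, the right-hand operand wins
module_objects = math.__dict__ | builtins.__dict__

assert module_objects["sqrt"] is math.sqrt  # from math
assert module_objects["len"] is len         # from builtins
# Both modules define __name__; the builtins entry overrides the math one
assert module_objects["__name__"] == "builtins"
```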

@LarsKue LarsKue moved this from Future to In Progress in bayesflow development Apr 22, 2025

codecov bot commented Apr 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

| Files with missing lines | Coverage Δ |
| --- | --- |
| bayesflow/utils/_docs/_populate_all.py | 95.45% <100.00%> (+5.98%) ⬆️ |
| bayesflow/utils/serialization.py | 90.56% <100.00%> (+0.18%) ⬆️ |

vpratz commented Apr 22, 2025

Awesome, looking forward to having this! I have compared the outputs of the doc generation with and without your change to _populate_all.py, and the same files are produced, so I think those changes are working as intended.

@stefanradev93 stefanradev93 merged commit 1e63803 into dev Apr 22, 2025
15 checks passed
@stefanradev93 stefanradev93 deleted the optimize-import branch April 22, 2025 20:53
@github-project-automation github-project-automation bot moved this from In Progress to Done in bayesflow development Apr 22, 2025