Releases: MilesCranmer/PySR

v0.9.0

04 Jun 19:58
c3dc203

What's Changed

  • Refactor of PySRRegressor by @tttc3 in #146
    • PySRRegressor is now completely compatible with scikit-learn.
    • PySRRegressor can be stored in a pickle file, even after fitting, and then be reloaded and used with .predict()
    • PySRRegressor.equations -> PySRRegressor.equations_
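The pickle round trip mentioned above can be sketched with the standard library alone. This is a minimal sketch: `save_model` and `load_model` are hypothetical helper names, and a `SimpleNamespace` stands in for a fitted `PySRRegressor` so the snippet runs without PySR or Julia installed.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_model(model, path):
    """Serialize any picklable estimator to disk (hypothetical helper)."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load an estimator previously written by save_model."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a fitted estimator. With PySR you would pass a fitted
# PySRRegressor here instead; the trailing underscore in `equations_`
# follows the scikit-learn convention for attributes set during fitting.
fitted = SimpleNamespace(equations_=["x0 + 1.0"])

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    save_model(fitted, model_path)
    restored = load_model(model_path)

print(restored.equations_)
```

With a real fitted regressor, the reloaded object would then be used with .predict() as described in the release note.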

Full Changelog: v0.8.7...v0.9.0

v0.8.5

20 May 14:58
69aa240

What's Changed

  • Custom complexities for operators, constants, and variables (#138)
  • Early stopping conditions (#134)
    • Based on a certain loss value being achieved
    • Max number of evaluations (for theoretical studies of genetic algorithms, rather than anything practical).
  • Work with a specified expression rather than the one given by model_selection, by passing index to the function you wish to use (e.g., model.predict(X, index=5) would use the 5th equation).
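The two early stopping conditions listed above can be illustrated with a toy random search. This is only a sketch of the idea, not PySR's backend: `toy_search`, `target_loss`, and `max_evals` are illustrative names chosen here, and the search itself is plain random sampling.

```python
import random


def toy_search(loss_fn, propose, target_loss=None, max_evals=None, seed=0):
    """Random search that stops on a loss threshold or an evaluation budget.

    Returns (best_candidate, best_loss, evaluations_used, stop_reason).
    """
    if target_loss is None and max_evals is None:
        raise ValueError("set target_loss, max_evals, or both")
    rng = random.Random(seed)
    best, best_loss, evals = None, float("inf"), 0
    while True:
        # Condition 2: the evaluation budget is exhausted.
        if max_evals is not None and evals >= max_evals:
            return best, best_loss, evals, "max_evals"
        candidate = propose(rng)
        loss = loss_fn(candidate)
        evals += 1
        if loss < best_loss:
            best, best_loss = candidate, loss
        # Condition 1: a good enough loss has been achieved.
        if target_loss is not None and best_loss <= target_loss:
            return best, best_loss, evals, "target_loss"


def squared_error(c):
    return (c - 2.0) ** 2


def uniform_guess(rng):
    return rng.uniform(-5.0, 5.0)


budget_run = toy_search(squared_error, uniform_guess, max_evals=50)
loose_run = toy_search(squared_error, uniform_guess, target_loss=1e9)
print(budget_run[2], budget_run[3])
print(loose_run[2], loose_run[3])
```

The budget-only run always uses exactly the allowed number of evaluations, which is what makes that condition convenient for comparing search algorithms on equal footing.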

Full Changelog since v0.8.1: v0.8.1...v0.8.5

v0.8.1

08 May 16:19
fc75036

What's Changed

  • Enable distributed processing with ClusterManagers.jl from #133

Full Changelog: v0.8.0...v0.8.1

v0.8.0

08 May 01:15
bfd7114

This new release updates the entire set of default PySR parameters according to the ones presented in #115. These parameters have been tuned over nearly 71,000 trials. See the discussion for further info.

Additional changes:

  • Nested constraints implemented. For example, you can now prevent sin and cos from being repeatedly nested, by using the argument: nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}. This argument states that within a sin operator, you can only have a max depth of 0 for other sin or cos. The same is done for cos. The argument nested_constraints={"^": {"+": 2, "*": 1, "^": 0}} states that within a pow operator, you can only have 2 things added, or 1 use of multiplication (i.e., no double products), and zero other pow operators. This helps a lot with finding interpretable expressions!
  • New parsimony algorithm (backend change). This seems to help searches quite a bit, especially when searching for more complex expressions. It is controlled by use_frequency_in_tournament, which is now enabled by default.
  • Many backend improvements: speed, bug fixes, etc.
  • Improved stability of multi-processing (backend change). Thanks to @CharFox1.
  • Auto-differentiation implemented (backend change). This isn't used by default in any instances right now, but could be used by optimization later. Thanks to @kazewong.
  • Improved testing coverage of weird edge cases.
  • All parameters to PySRRegressor have been cleaned up to be in snake_case rather than CamelCase. The backend is also now almost entirely snake_case for internal functions. Other readability improvements were made as well. Thanks to @bstollnitz and @patrick-kidger for the suggestions.
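To make the nested_constraints semantics above concrete, here is a small pure-Python checker. It is a sketch under one reading of the description (a limit is the maximum nesting depth of the inner operator anywhere beneath the outer one), not the backend's implementation; the tuple-based expression format and the names `max_nesting` and `satisfies` are assumptions made for illustration.

```python
def max_nesting(expr, op):
    """Largest number of `op` nodes on any root-to-leaf path of `expr`.

    Expressions are tuples like ("sin", child) or ("+", left, right);
    anything that is not a tuple (a variable name, a constant) is a leaf.
    """
    if not isinstance(expr, tuple):
        return 0
    head, *args = expr
    deepest = max((max_nesting(arg, op) for arg in args), default=0)
    return deepest + (1 if head == op else 0)


def satisfies(expr, constraints):
    """Check every operator node in `expr` against its nesting limits."""
    if not isinstance(expr, tuple):
        return True
    head, *args = expr
    for inner_op, limit in constraints.get(head, {}).items():
        if any(max_nesting(arg, inner_op) > limit for arg in args):
            return False
    return all(satisfies(arg, constraints) for arg in args)


trig = {"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}
power = {"^": {"+": 2, "*": 1, "^": 0}}

print(satisfies(("sin", ("+", "x", 1.0)), trig))               # sin(x + 1)
print(satisfies(("sin", ("cos", "x")), trig))                  # sin(cos(x))
print(satisfies(("^", ("+", ("+", "x", "y"), "z"), 2.0), power))  # ((x + y) + z)^2
print(satisfies(("^", ("^", "x", 2.0), 2.0), power))           # (x^2)^2
```

Under this reading, sin(cos(x)) and (x^2)^2 are rejected, while sin(x + 1) and ((x + y) + z)^2 are allowed.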

v0.6.0

01 Jun 03:18

PySR Version 0.6.0

Large changes:

  • Exports to JAX, PyTorch, NumPy. All exports have a similar interface. JAX and PyTorch allow the equation parameters to be trained (e.g., as part of some differentiable model). Read https://pysr.readthedocs.io/en/latest/docs/options/#callable-exports-numpy-pytorch-jax for details. Thanks Patrick Kidger for the PyTorch export.
  • Multi-output y input is allowed, and the backend will efficiently batch over each output. A list of dataframes is returned by pysr for these cases. All best_* functions return a list as well.
  • BFGS optimizer introduced, plus a more stable parameter search due to backtracking line search.
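For readers unfamiliar with backtracking line search, the idea can be sketched in a few lines. This is a generic textbook version using the Armijo sufficient-decrease condition, not the code used by the backend; the function name and the defaults for `shrink` and `c` are assumptions chosen for illustration.

```python
def backtracking_line_search(f, grad, x, direction,
                             step=1.0, shrink=0.5, c=1e-4, max_shrinks=50):
    """Shrink `step` until f decreases enough along `direction`.

    Accepts the first step satisfying the Armijo condition
        f(x + step * d) <= f(x) + c * step * (grad(x) . d)
    """
    fx = f(x)
    slope = sum(g * d for g, d in zip(grad(x), direction))
    for _ in range(max_shrinks):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        if f(candidate) <= fx + c * step * slope:
            return step
        step *= shrink  # step was too long: back off and retry
    return step


def sphere(x):
    return sum(xi * xi for xi in x)


def sphere_grad(x):
    return [2.0 * xi for xi in x]


start = [3.0, -4.0]
descent = [-g for g in sphere_grad(start)]  # steepest-descent direction
accepted = backtracking_line_search(sphere, sphere_grad, start, descent)
moved = [xi + accepted * di for xi, di in zip(start, descent)]
print(accepted, sphere(moved))
```

Rejecting over-long steps in this way is what keeps an optimizer such as BFGS from overshooting when the full step would increase the loss.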

Smaller changes since 0.5.16:

  • Expanded tests, coverage calculation for PySR
  • Improved (pre-processing) feature selection with random forest
  • New default parameters for search:
    • annealing=False (no annealing works better with the new code. This is equivalent to alpha=infinity)
    • useFrequency=True (deals with complexity in a smarter way)
    • npopulations = 20 procs*4
    • progress=True (show a progress bar)
    • optimizer_algorithm="BFGS"
    • optimizer_iterations=10
    • optimize_probability=1
    • binary_operators default = ["+", "-", "/", "*"]
    • unary_operators default = []
  • Warnings:
    • Using maxsize > 40 will trigger a warning that the search will be slow and use a lot of memory. The warning suggests turning off useFrequency, and perhaps also using warmupMaxsizeBy.
  • Deprecated nrestarts -> optimizer_nrestarts
  • Printing fixed in Jupyter

PySR v0.4.0

01 Feb 22:10

With versions v0.4.0/v0.4.0, SymbolicRegression.jl and PySR have now been completely disentangled: PySR is 100% Python code (with some Julia meta-programming), and SymbolicRegression.jl is 100% Julia code.

PySR now works by activating a Julia env that has SymbolicRegression.jl as a dependency, and making calls to it! By default it will set up a Julia project inside the pip install location, and install requirements at the user's confirmation, though you can pass an arbitrary project directory as well (e.g., if you want to use PySR but also tweak the backend). The nice thing about this is that Python users only need to install a Julia binary somewhere, and they should be good to go. And Julia users never need to touch the Python side.

The SymbolicRegression.jl backend also sets up workers automatically & internally now, so one never needs to call @everywhere when setting things up. The same is true even with locally-defined functions - these get passed to workers!

With PySR importing the latest Julia code, this also means it gets new simplification routines powered by SymbolicUtils.jl, which seem to help improve the equations discovered.

Fix process blocking

27 Sep 09:21

Populations no longer block each other, which gives a large speedup, especially for large numbers of populations. This was fixed by using RemoteChannel() in Julia.

Some populations happen to take longer than others, perhaps because they have very complex equations, and could previously block others that had finished early. The fix lets the processor move on to whichever population finishes next.
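The same pattern can be shown with a Python analogy. The actual fix uses RemoteChannel() in Julia; the sketch below only mirrors the idea with `concurrent.futures`, and the function names and sleep times are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def evolve(population_id, work_seconds):
    """Pretend to evolve one population; slow ones sleep longer."""
    time.sleep(work_seconds)
    return population_id


def run_populations(workloads):
    """Handle populations in the order they finish, not the order submitted."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(workloads)) as pool:
        futures = [pool.submit(evolve, pid, secs)
                   for pid, secs in workloads.items()]
        # as_completed yields each future as soon as it is done, so a slow
        # population never holds up the results of a fast one.
        for future in as_completed(futures):
            finished.append(future.result())
    return finished


# Population 0 is deliberately the slowest.
completion_order = run_populations({0: 0.2, 1: 0.01, 2: 0.05})
print(completion_order)
```

Iterating over the futures in submission order instead would reproduce the blocking behaviour described above: the loop would wait on population 0 while populations 1 and 2 sat finished.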

Scoring with Pareto Front

27 Sep 00:39

Uses the equation from Cranmer et al. (2020) https://arxiv.org/abs/2006.11287 to score equations, and prints this score alongside the MSE. This makes symbolic regression more robust to noise.
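A rough sketch of such a score follows, assuming it is the negative change in log(MSE) per unit of added complexity between neighbouring equations on the Pareto front. That is one reading of the cited paper and is not stated in this release note; the function name and the sample numbers are invented for illustration.

```python
import math


def pareto_scores(complexities, mses):
    """Score each equation by the drop in log(MSE) per unit of complexity.

    Inputs must be sorted by increasing complexity. The first (simplest)
    equation has no predecessor, so its score is set to 0.0 here.
    """
    scores = [0.0]
    for i in range(1, len(mses)):
        d_log_mse = math.log(mses[i]) - math.log(mses[i - 1])
        d_complexity = complexities[i] - complexities[i - 1]
        scores.append(-d_log_mse / d_complexity)
    return scores


# Invented example: the second equation cuts the error tenfold, while the
# third adds complexity for only a marginal gain.
complexities = [1, 3, 5]
mses = [1.0, 0.1, 0.09]
scores = pareto_scores(complexities, mses)
best_index = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_index)
```

A score of this shape favours the equation where accuracy improves most sharply, which is why a slightly more complex equation that merely fits the noise receives a low score.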

v0.2

21 Sep 14:14
v0.2 - Add many more operators; increase efficiency