Releases: MilesCranmer/PySR
v0.9.0
What's Changed
- Refactor of PySRRegressor by @tttc3 in #146
- PySRRegressor is now completely compatible with scikit-learn.
- PySRRegressor can be stored in a pickle file, even after fitting, and then be reloaded and used with `.predict()` (a sketch follows this list).
- `PySRRegressor.equations` -> `PySRRegressor.equations_`
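As a rough illustration of the new pickle round-trip (the toy data and the `niterations` setting here are made up for the example):

```python
import pickle

import numpy as np
from pysr import PySRRegressor

X = np.random.randn(100, 2)
y = X[:, 0] ** 2 + np.cos(X[:, 1])

model = PySRRegressor(niterations=5)
model.fit(X, y)

# Persist the fitted model, including its discovered equations.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Reload later and predict without refitting.
with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)
print(loaded.predict(X[:5]))

# Fitted attributes now follow the scikit-learn trailing-underscore convention:
print(loaded.equations_)
```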
Full Changelog: v0.8.7...v0.9.0
v0.8.5
What's Changed
- Custom complexities for operators, constants, and variables (#138)
- Early stopping conditions (#134)
- Based on a certain loss value being achieved
- Max number of evaluations (for theoretical studies of genetic algorithms, rather than anything practical).
- Work with a specified expression rather than the one chosen by `model_selection`, by passing `index` to the function you wish to use (e.g., `model.predict(X, index=5)` would use the 5th equation). A sketch follows this list.
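A minimal sketch combining these options; the parameter names `early_stop_condition` and `max_evals` are assumptions based on later documented releases, and the data is illustrative:

```python
import numpy as np
from pysr import PySRRegressor

X = np.random.randn(100, 2)
y = X[:, 0] * X[:, 1]

model = PySRRegressor(
    early_stop_condition=1e-6,  # stop once this loss value is achieved
    max_evals=100_000,          # or after this many total evaluations
)
model.fit(X, y)

# Use the equation at index 5 instead of the model_selection choice:
y_pred = model.predict(X, index=5)
```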
Full Changelog since v0.8.1: v0.8.1...v0.8.5
v0.8.1
What's Changed
- Enable distributed processing with ClusterManagers.jl (#133)
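A hedged sketch of how this looks from Python; the `cluster_manager` keyword is how later PySR releases expose this, so treat the exact spelling as an assumption for v0.8.1:

```python
from pysr import PySRRegressor

# Request workers through a job queue (here SLURM) via ClusterManagers.jl,
# instead of spawning local processes.
model = PySRRegressor(
    procs=32,                 # number of distributed worker processes
    cluster_manager="slurm",  # assumed keyword; other queues like "pbs" exist
)
```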
Full Changelog: v0.8.0...v0.8.1
v0.8.0
This new release updates the entire set of default PySR parameters according to the ones presented in #115. These parameters have been tuned over nearly 71,000 trials. See the discussion for further info.
Additional changes:
- Nested constraints implemented. For example, you can now prevent `sin` and `cos` from being repeatedly nested by using the argument `nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}`. This argument states that within a `sin` operator, you can only have a max depth of 0 for other `sin` or `cos`; the same is done for `cos`. The argument `nested_constraints={"^": {"+": 2, "*": 1, "^": 0}}` states that within a pow operator, you can only have 2 things added, or 1 use of multiplication (i.e., no double products), and zero other pow operators. This helps a lot with finding interpretable expressions! (A sketch follows this list.)
- New parsimony algorithm (backend change). This seems to help searches quite a bit, especially when searching for more complex expressions. It is turned on by `use_frequency_in_tournament`, which is now the default.
- Many backend improvements: speed, bug fixes, etc.
- Improved stability of multi-processing (backend change). Thanks to @CharFox1.
- Auto-differentiation implemented (backend change). This isn't used by default in any instances right now, but could be used by optimization later. Thanks to @kazewong.
- Improved testing coverage of weird edge cases.
- All parameters to PySRRegressor have been cleaned up to be in snake_case rather than CamelCase. The backend is also now almost entirely snake_case for internal functions. Other readability improvements as well. Thanks to @bstollnitz and @patrick-kidger for the suggestions.
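A minimal sketch of the nested-constraints item above, using the exact dictionaries from this release note (the operator lists are illustrative):

```python
from pysr import PySRRegressor

model = PySRRegressor(
    binary_operators=["+", "*", "^"],
    unary_operators=["sin", "cos"],
    nested_constraints={
        # No sin or cos nested anywhere inside a sin (and likewise for cos):
        "sin": {"sin": 0, "cos": 0},
        "cos": {"sin": 0, "cos": 0},
        # Inside ^: additions up to depth 2, one level of multiplication,
        # and no other ^ operators:
        "^": {"+": 2, "*": 1, "^": 0},
    },
)
```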
v0.6.0
PySR Version 0.6.0
Large changes:
- Exports to JAX, PyTorch, NumPy. All exports have a similar interface. JAX and PyTorch allow the equation parameters to be trained (e.g., as part of some differentiable model). Read https://pysr.readthedocs.io/en/latest/docs/options/#callable-exports-numpy-pytorch-jax for details. Thanks Patrick Kidger for the PyTorch export.
- Multi-output `y` input is allowed, and the backend will efficiently batch over each output. A list of dataframes is returned by pysr for these cases, and all `best_*` functions return a list as well (see the sketch after this list).
- BFGS optimizer introduced, plus a more stable parameter search due to backtracking line search.
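A sketch of the multi-output behavior, assuming the functional `pysr`/`best_callable` interface of this era (the toy data is illustrative):

```python
import numpy as np
from pysr import pysr, best_callable

X = np.random.randn(100, 3)
# Two target columns: the search batches over each output.
y = np.stack([X[:, 0] ** 2, np.cos(X[:, 1])], axis=1)

equations = pysr(X, y, niterations=5)  # list of DataFrames, one per output
callables = best_callable(equations)   # best_* functions return lists too
print(callables[0](X))
```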
Smaller changes since 0.5.16:
- Expanded tests, coverage calculation for PySR
- Improved (pre-processing) feature selection with random forest
- New default parameters for search:
- annealing=False (no annealing works better with the new code. This is equivalent to alpha=infinity)
- useFrequency=True (deals with complexity in a smarter way)
- npopulations = 20 (rather than procs*4)
- progress=True (show a progress bar)
- optimizer_algorithm="BFGS"
- optimizer_iterations=10
- optimize_probability=1
- binary_operators default = ["+", "-", "/", "*"]
- unary_operators default = []
- Warnings:
  - Using maxsize > 40 will trigger a warning mentioning how it will be slow and use a lot of memory. It will mention to turn off `useFrequency`, and perhaps also to use `warmupMaxsizeBy`.
- Deprecated `nrestarts` -> `optimizer_nrestarts`
- Printing fixed in Jupyter
PySR v0.4.0
With v0.4.0 of PySR and v0.4.0 of SymbolicRegression.jl, the two packages have now been completely disentangled: PySR is 100% Python code (with some Julia meta-programming), and SymbolicRegression.jl is 100% Julia code.
PySR now works by activating a Julia environment that has SymbolicRegression.jl as a dependency, and making calls to it! By default it will set up a Julia project inside the pip install location and install requirements at the user's confirmation, though you can pass an arbitrary project directory as well (e.g., if you want to use PySR but also tweak the backend). The nice thing about this is that Python users only need to install a Julia binary somewhere, and they should be good to go. And Julia users never need to touch the Python side.
The SymbolicRegression.jl backend also sets up workers automatically and internally now, so one never needs to call `@everywhere` when setting things up. The same is true even with locally-defined functions - these get passed to workers!
With PySR importing the latest Julia code, this also means it gets new simplification routines powered by SymbolicUtils.jl, which seem to help improve the equations discovered.
Fix process blocking
Populations no longer block each other, which gives a large speedup, especially for large numbers of populations. This was fixed by using RemoteChannel() in Julia.
Some populations happen to take longer than others - perhaps they have very complex equations - and could therefore block others that had finished early. With this change, the processor moves on to the next finished population instead.
Scoring with Pareto Front
Uses the scoring equation from Cranmer et al. (2020), https://arxiv.org/abs/2006.11287, to score equations, and prints this alongside the MSE. This makes symbolic regression more robust to noise.
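As a sketch of the idea: each equation on the Pareto front is scored by how quickly its log-loss drops as complexity grows, so the equation just past a sharp accuracy jump scores highest. PySR's exact implementation may differ in details; a minimal version:

```python
import numpy as np

def pareto_scores(complexities, mses):
    """Score Pareto-front equations by the negative rate of change of
    log-loss with complexity, following Cranmer et al. (2020)."""
    scores = -np.diff(np.log(mses)) / np.diff(complexities)
    return np.concatenate([[0.0], scores])  # simplest equation gets 0

# Example: the sharp MSE drop at complexity 5 yields the highest score.
print(pareto_scores(np.array([1, 3, 5, 9]),
                    np.array([1.0, 0.5, 0.01, 0.009])))
```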
v0.2
v0.2 - Add many more operators; increase efficiency