v1.0.0-beta1

Pre-release

Released by @MilesCranmer on 06 Oct 22:36 · 6e3cdf5

This is a beta release that is not yet registered. To try it out, open a Julia REPL, press ] to enter the package manager, and then run:

pkg> add SymbolicRegression#v1.0.0-beta1

Before the final release of v1.0.0, the hyperparameters will be re-tuned to optimize the new mutations: swap_operands and rotate_tree, which seem to be quite effective.

Major Changes

Breaking: Changes default expressions from Node to the user-friendly Expression

#326

This is a breaking change in the format of expressions returned by SymbolicRegression. Instead of returning a Node{T}, SymbolicRegression will now return an Expression{T,Node{T},...} (both from equation_search and from report(mach).equations). This type is much more convenient and higher-level than the Node type, as it bundles the metadata relevant to the expression, such as the operators and variable names.

This means you can reliably do things like:

using SymbolicRegression: Options, Expression, Node

options = Options(binary_operators=[+, -, *, /], unary_operators=[cos, exp, sin])
operators = options.operators
variable_names = ["x1", "x2", "x3"]
x1, x2, x3 = [Expression(Node(Float64; feature=i); operators, variable_names) for i=1:3]

# Use the operators directly!
tree = cos(x1 - 3.2 * x2) - x1 * x1

You can then do operations with this tree, without needing to track operators:

println(tree)  # Looks up the right operators based on internal metadata

X = randn(3, 100)

tree(X)  # Call directly!
tree'(X)  # gradients of expression

Each time you apply an operator to one or more Expressions whose operator list includes it, the library looks up the corresponding operator index, creates the correct Node, and returns a new Expression.
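
For example, continuing from the x1 and x2 defined above, combining two Expressions with a shared operator directly yields a new Expression:

ex = x1 + x2  # looks up `+` in the shared operator list and builds the Node
println(ex isa Expression)  # true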

You can access the underlying tree with get_tree (guaranteed to return a Node), or with get_contents, which returns the full contents of an AbstractExpression; those contents might comprise multiple sub-expressions, which get stitched together when calling get_tree.
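
As a minimal sketch, continuing from the tree defined above (get_tree and get_contents are defined in DynamicExpressions.jl; importing them from there is one way to access them):

using DynamicExpressions: get_tree, get_contents

raw = get_tree(tree)        # always a Node{Float64}, with the metadata stripped
inner = get_contents(tree)  # the expression's contents; for a plain Expression, just the tree

println(raw isa Node)  # true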

Customizing behavior

DynamicExpressions v1.0 has a full AbstractExpression interface for customizing the behavior of pretty much anything. As one example, there is the included ParametricExpression type, with an example available in examples/parametrized_function.jl. You can use this to find basis functions with per-class parameters. It still needs some tuning, but it works for simple examples.

This ParametricExpression is meant partly as an example of the types of things you can do with the new AbstractExpression interface, though it should hopefully be a useful feature by itself.
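
As a hedged sketch of what this can look like, using the expression_type and expression_options keywords introduced in this release (the operator choices here are arbitrary; see examples/parametrized_function.jl for the canonical example, including how to pass per-row class labels):

using SymbolicRegression

options = Options(;
    binary_operators=[+, -, *],
    unary_operators=[cos],
    expression_type=ParametricExpression,
    expression_options=(; max_parameters=2),  # up to 2 per-class parameters per expression
)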

Auto-diff within optimization

Historically, SymbolicRegression has mostly relied on finite differences to estimate derivatives, which actually works well for small numbers of parameters. Finite differencing is also what Optim.jl falls back to unless you provide it with gradients.

However, with the introduction of ParametricExpression, full support for autodiff within Optim.jl was needed. v1 includes support for some parts of DifferentiationInterface.jl, allowing you to turn on various automatic differentiation backends when optimizing constants. For example, you can use

Options(
    autodiff_backend=:Zygote,
)

to use Zygote.jl for autodiff during BFGS optimization, or even

Options(
    autodiff_backend=:Enzyme,
)

for Enzyme.jl (though Enzyme support is highly experimental).
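
Note that you will likely need the chosen backend package loaded in your session for the backend to be usable. A minimal sketch (operator choices here are arbitrary):

using SymbolicRegression
using Zygote  # load the backend package itself

options = Options(;
    binary_operators=[+, -, *, /],
    unary_operators=[cos, exp],
    autodiff_backend=:Zygote,  # use Zygote gradients during BFGS constant optimization
)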

Other Changes

  • Implement tree rotation operator by @MilesCranmer in #348
    • This seems to help search performance overall. The new mutation is available as rotate_tree in the mutation weights, with a default weight of 0.3 (see the sketch after this list).
  • Avoid Base.sleep by @MilesCranmer in #305
  • CompatHelper: bump compat for MLJModelInterface to 1, (keep existing compat) by @github-actions in #328
  • fix typos by @spaette in #331
  • chore(deps): bump peter-evans/create-pull-request from 6 to 7 by @dependabot in #343
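
For reference, here is a hedged sketch of adjusting the new mutation's weight via MutationWeights (the rotate_tree and swap_operands fields are from this release; all other weights keep their defaults):

using SymbolicRegression

options = Options(;
    binary_operators=[+, -, *],
    mutation_weights=MutationWeights(; rotate_tree=0.5, swap_operands=0.2),
)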

New Contributors

  • @spaette made their first contribution in #331

Full Changelog: v0.24.5...v1.0.0-beta1