Releases · MilesCranmer/SymbolicRegression.jl

CompatHelper: bump compat for LossFunctions to 1, (keep existing compat) (#373) (@github-actions[bot])
Fix Options.jl docs (#375) (@wsmoses)
Fix use of logger in distributed mode (#376) (@MilesCranmer)

Closed issues:

Multi-expression objects and fixed functional forms (#193)

Contributors

wsmoses and MilesCranmer

28 Nov 01:39

github-actions

v1.0.3

1161293

SymbolicRegression v1.0.3

Diff since v1.0.2

Merged pull requests:

feat: allow argument-less TemplateExpression parts (#372) (@MilesCranmer)
fix: predict for TemplateExpressions (#374) (@MilesCranmer)

Contributors

MilesCranmer

24 Nov 04:54

github-actions

v1.0.2

dedb41a

SymbolicRegression v1.0.2

Diff since v1.0.1

Merged pull requests:

fix: widen type constraints for TemplateExpression evaluation (#371) (@MilesCranmer)

Contributors

MilesCranmer

20 Nov 20:55

github-actions

v1.0.1

4efe2b7

SymbolicRegression v1.0.1

Diff since v1.0.0

Merged pull requests:

CompatHelper: bump compat for ConstructionBase to 1, (keep existing compat) (#357) (@github-actions[bot])
CompatHelper: bump compat for Optim to 1, (keep existing compat) (#369) (@github-actions[bot])

15 Nov 21:13

MilesCranmer

v1.0.0

aa7bab8

SymbolicRegression.jl v1.0.0

Summary of major recent changes, described in more detail below:

- Changed the core expression type from Node{T} → Expression{T,Node{T},Metadata{...}}
  - This gives us new features, improves user hackability, and greatly improves ergonomics!
- Created "Template Expressions", for fitting expressions under a user-specified functional form (TemplateExpression <: AbstractExpression)
  - Template expressions are quite flexible: they are a meta-expression that wraps multiple other expressions and combines them using a user-specified function.
  - This enables vector expressions: you can learn multiple components of a vector simultaneously with a single expression! More generally, you can learn expressions onto any Julia struct.
    - (Note that this still does not permit learning using non-scalar operators, though we are working on that!)
  - Template expressions also use colored strings to represent each part in the printout, to improve readability.
- Created "Parametric Expressions", for custom functional forms with per-class parameters (ParametricExpression <: AbstractExpression)
  - This lets you fit expressions that act as models of multiple systems, with per-system parameters!
- Introduced a variety of new abstractions for user extensibility (and to support new research on symbolic regression!)
  - AbstractExpression, for increased flexibility in custom expression types.
  - mutate! and AbstractMutationWeights, for user-defined mutation operators.
  - AbstractSearchState, for holding custom metadata during searches.
  - AbstractOptions and AbstractRuntimeOptions, for customizing pretty much everything else in the library via multiple dispatch. Please make an issue/PR if you would like any particular internal functions to be declared public to enable stability across versions for your tool.
  - Many of these were motivated by the need to modularize the implementation of LaSR, an LLM-guided version of SymbolicRegression.jl, so that it can sit as a modular layer on top of SymbolicRegression.jl.
- Added TensorBoardLogger.jl and other logging integrations via SRLogger
- Support for Zygote.jl and Enzyme.jl within the constant optimizer, specified using the autodiff_backend option
- Other changes:
  - Fundamental improvements to the underlying evolutionary algorithm
    - New mutation operators, swap_operands and rotate_tree, both of which seem to help kick the evolution out of local optima.
    - New hyperparameter defaults, tuned using a Pareto front volume calculation rather than simply the accuracy of the best expression.
  - Changed output file handling
  - Major refactoring of the codebase to improve readability and modularity
  - Identified and fixed a major internal bug involving unexpected aliasing produced by the crossover operator
    - Segmentation faults caused by this are a likely culprit for some crashes reported during multi-day multi-node searches.
    - Introduced a new test for aliasing throughout the entire search state to prevent this from happening again.
  - Improved progress bar and StyledStrings integration.
  - Julia 1.10 is now the minimum supported Julia version.
  - Other small features
  - Also see the "Update Guide" below for more details on upgrading.
  - New URL: https://ai.damtp.cam.ac.uk/symbolicregression

Note that some of these features were recently introduced in patch releases since they were backwards compatible. I am noting them here for visibility.

1. Changed the core expression type from `Node{T} → Expression{T,Node{T},...}`

#326

This is a breaking change in the format of expressions returned by SymbolicRegression. Now, instead of returning a Node{T}, SymbolicRegression will return an Expression{T,Node{T},...} (both from equation_search and from report(mach).equations). This type is much more convenient and high-level than the Node type, as it includes metadata relevant for the node, such as the operators and variable names.

This means you can reliably do things like:

using SymbolicRegression: Options, Expression, Node

options = Options(binary_operators=[+, -, *, /], unary_operators=[cos, exp, sin])
operators = options.operators
variable_names = ["x1", "x2", "x3"]
x1, x2, x3 = [Expression(Node(Float64; feature=i); operators, variable_names) for i=1:3]

# Use the operators directly!
tree = cos(x1 - 3.2 * x2) - x1 * x1

You can then do operations with this tree, without needing to track operators:

println(tree)  # Looks up the right operators based on internal metadata

X = randn(3, 100)

tree(X)  # Call directly!
tree'(X)  # gradients of expression

Each time you apply an operator to an Expression, or between two Expressions, whose operator list includes that operator, it looks up the right enum index, creates the correct Node, and returns a new Expression.

You can access the tree with get_tree (guaranteed to return a Node), or get_contents – which returns the full info of an AbstractExpression, which might contain multiple expressions (which get stitched together when calling get_tree).
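As a rough sketch of the two accessors, the snippet below rebuilds the expression from the example above and pulls out its underlying tree. The names `ex`, `raw`, and `contents` are illustrative only, and importing `get_tree` and `get_contents` through SymbolicRegression is an assumption here (they are defined in DynamicExpressions.jl).

```julia
# Assumption: get_tree and get_contents are reachable via SymbolicRegression.
using SymbolicRegression: Options, Expression, Node, get_tree, get_contents

options = Options(binary_operators=[+, -, *, /], unary_operators=[cos, exp, sin])
operators = options.operators
variable_names = ["x1", "x2", "x3"]
x1, x2, x3 = [Expression(Node(Float64; feature=i); operators, variable_names) for i in 1:3]

ex = cos(x1 - 3.2 * x2) - x1 * x1

# get_tree always yields a plain Node, without the operator/variable metadata
raw = get_tree(ex)

# get_contents yields whatever the expression type stores internally;
# for a plain Expression this is expected to be the same tree
contents = get_contents(ex)

println(raw isa Node{Float64})
```

For expression types that wrap several sub-expressions, `get_contents` is the accessor that should expose all of them, while `get_tree` returns the stitched-together `Node`.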

2. Created "Template Expressions", for fitting expressions under a user-specified functional form (`TemplateExpression <: AbstractExpression`)

Template Expressions allow users to define symbolic expressions with a fixed structure, combining multiple sub-expressions under user-specified constraints. This is particularly useful for symbolic regression tasks where domain-specific knowledge or constraints must be imposed on the model's structure.

This also lets you fit vector expressions using SymbolicRegression.jl, where vector components can also be shared!

A TemplateExpression is constructed by specifying:

- A named tuple of sub-expressions (e.g., (; f=x1 - x2 * x2, g=1.5 * x3)).
- A structure function that defines how these sub-expressions are combined in different contexts.

For example, you can create a TemplateExpression that enforces the constraint: sin(f(x1, x2)) + g(x3)^2 - where we evolve f and g simultaneously.

To do this, we first describe the structure using a TemplateStructure, which takes a single closure that maps a named tuple of ComposableExpression objects and a tuple of features to the combined output:

using SymbolicRegression

structure = TemplateStructure{(:f, :g)}(
  ((; f, g), (x1, x2, x3)) -> sin(f(x1, x2)) + g(x3)^2
)

This defines how the TemplateExpression should be evaluated numerically on a given input.

The number of arguments allowed by each expression object is inferred using this closure, though it can also be passed explicitly with the num_features kwarg.

operators = Options(binary_operators=(+, -, *, /)).operators
variable_names = ["x1", "x2", "x3"]
x1 = ComposableExpression(Node{Float64}(; feature=1); operators, variable_names)
x2 = ComposableExpression(Node{Float64}(; feature=2); operators, variable_names)
x3 = ComposableExpression(Node{Float64}(; feature=3); operators, variable_names)

Note that using x1 here refers to the relative argument to the expression. So the node with feature equal to 1 will reference the first argument, regardless of what it is.

st_expr = TemplateExpression(
    (; f=x1 - x2 * x2, g=1.5 * x1);
    structure,
    operators,
    variable_names
) # Prints as: f = #1 - (#2 * #2); g = 1.5 * #1

# Calling the expression evaluates `f` and `g`, then combines
# the results with the structure function:
st_expr([0.0; 1.0; 2.0;;])

This also works with hierarchical expressions! For example,

structure = TemplateStructure{(:f, :g)}(
  ((; f, g), (x1, x2, x3)) -> f(x1, g(x2), x3^2) - g(x3)
)

This is a valid structure!

We can also use this TemplateExpression in SymbolicRegression.jl searches!

For example, say that we want to fit *vector expressions*:

using SymbolicRegression
using MLJBase: machine, fit!, report

We first define our structure. This also has our variable mapping, which says we are fitting f(x1, x2), g1(x3), and g2(x3):

function my_structure((; f, g1, g2), (x1, x2, x3))
    _f = f(x1, x2)
    _g1 = g1(x3)
    _g2 = g2(x3)

    # We use `.x` to get the underlying vector
    out = map((fi, g1i, g2i) -> (fi + g1i, fi + g2i), _f.x, _g1.x, _g2.x)
    # And `.valid` to check whether the evaluations were valid
    return ValidVector(out, _f.valid && _g1.valid && _g2.valid)
end
structure = TemplateStructure{(:f, :g1, :g2)}(my_structure)
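To see what the `map` call inside `my_structure` produces, here is a small sketch in plain Julia that uses ordinary vectors in place of the `.x` fields. The sample values and the names `f_vals`, `g1_vals`, `g2_vals`, and `all_valid` are made up for illustration.

```julia
# Stand-ins for _f.x, _g1.x, and _g2.x: one entry per data row
f_vals = [1.0, 2.0]
g1_vals = [10.0, 20.0]
g2_vals = [100.0, 200.0]

# Same combination rule as in my_structure: each row becomes a 2-tuple
out = map((fi, g1i, g2i) -> (fi + g1i, fi + g2i), f_vals, g1_vals, g2_vals)
println(out)  # [(11.0, 101.0), (22.0, 202.0)]

# Stand-in for combining the `.valid` flags: one failed
# sub-expression evaluation invalidates the combined result
all_valid = true && true && false
println(all_valid)  # false
```

Each output row is a tuple, which is why the targets for this structure also need to be tuples.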

Now, our dataset is a regular 2D array of inputs for X. But our y is actually a vector of 2-tuples!

X = rand(100, 3) .* 10

y = [
    (sin(X[i, 1]) + X[i, 3]^2, sin(X[i, 1]) + X[i, 3])
    for i in eachindex(axes(X, 1))
]

Now, since this is a vector-valued expression, we need to specify a custom elementwise_loss function:

elementwise_loss = ((x1, x2), (y1, y2)) -> (y1 - x1)^2 + (y2 - x2)^2

This reduces each target in y, together with the corresponding prediction returned by the structure function, to a single scalar loss.
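As a quick sanity check, the loss can be applied by hand to a few tuples in plain Julia. The `predictions` and `targets` values below are made up for illustration. Because the loss is a sum of squared differences, it gives the same result whichever argument is treated as the prediction.

```julia
elementwise_loss = ((x1, x2), (y1, y2)) -> (y1 - x1)^2 + (y2 - x2)^2

# Hypothetical predictions and targets, one 2-tuple per sample
predictions = [(1.0, 2.0), (0.0, 0.0)]
targets = [(1.5, 4.0), (0.0, 1.0)]

# One scalar per sample: (0.5^2 + 2.0^2) and (0.0^2 + 1.0^2)
per_sample = map(elementwise_loss, predictions, targets)
println(per_sample)  # [4.25, 1.0]
```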

Our regressor is then:

model = SRRegressor(;
    binary_operators=(+, *),
    unary_operators=(sin,),
    maxsize=15,
    elementwise_loss=elementwise_loss,
    expression_type=TemplateExpression,
    # Note - this is where we pass custom options to the expression type:
    expression_options=(; structure),
)

mach = machine(model, X, y)
fit!(mach)

Let's see the performance of the model:

report(mach)

We can also check the expression is split up correc...

09 Nov 20:41

MilesCranmer

v1.0.0-beta4

92f2b33

v1.0.0-beta4 Pre-release

What's Changed

Integration with TensorBoard and other logging utilities by @MilesCranmer in #277

Full Changelog: v1.0.0-beta3...v1.0.0-beta4

Contributors

MilesCranmer

07 Nov 06:49

MilesCranmer

v1.0.0-beta3

cf4c0c2

v1.0.0-beta3 Pre-release

What's Changed

Rewrite TemplateExpression to enable hierarchical expressions by @MilesCranmer in #365

Full Changelog: v1.0.0-beta2...v1.0.0-beta3

Contributors

MilesCranmer

30 Oct 18:45

MilesCranmer

v1.0.0-beta2

bc9edaf

v1.0.0-beta2 Pre-release

What's Changed

Deprecate Julia 1.9 by @MilesCranmer in #354
Create overloadable utilities: AbstractOptions, AbstractRuntimeOptions, AbstractMutationWeights, AbstractSearchState, and mutate! by @MilesCranmer in #353
Create TemplateExpression for providing a pre-defined functional structure and constraints by @MilesCranmer in #355
Output folder, better TemplateExpression, colored printouts, switch to ProgressMeter by @MilesCranmer in #360

Full Changelog: v1.0.0-beta1...v1.0.0-beta2

Contributors

MilesCranmer

06 Oct 22:36

MilesCranmer

v1.0.0-beta1

6e3cdf5

v1.0.0-beta1 Pre-release

This is a beta release that is not yet registered. To try it out, open a Julia REPL and hit ], then:

pkg> add SymbolicRegression#v1.0.0-beta1

Before the final release of v1.0.0, the hyperparameters will be re-tuned to optimize the new mutations: swap_operands and rotate_tree, which seem to be quite effective.

Major Changes

Breaking: Changes default expressions from `Node` to the user-friendly `Expression`

#326

This means you can reliably do things like:

using SymbolicRegression: Options, Expression, Node

options = Options(binary_operators=[+, -, *, /], unary_operators=[cos, exp, sin])
operators = options.operators
variable_names = ["x1", "x2", "x3"]
x1, x2, x3 = [Expression(Node(Float64; feature=i); operators, variable_names) for i=1:3]

# Use the operators directly!
tree = cos(x1 - 3.2 * x2) - x1 * x1

You can then do operations with this tree, without needing to track operators:

println(tree)  # Looks up the right operators based on internal metadata

X = randn(3, 100)

tree(X)  # Call directly!
tree'(X)  # gradients of expression

Customizing behavior

DynamicExpressions v1.0 has a full AbstractExpression interface for customizing the behavior of pretty much anything. For instance, it includes a ParametricExpression type, demonstrated in examples/parametrized_function.jl. You can use this to find basis functions with per-class parameters. It still needs some tuning, but it works for simple examples.

This ParametricExpression is meant partly as an example of the types of things you can do with the new AbstractExpression interface, though it should hopefully be a useful feature by itself.

Auto-diff within optimization

Historically, SymbolicRegression has mostly relied on finite differences to estimate derivatives – which actually works well for small numbers of parameters. This is what Optim.jl selects unless you can provide it with gradients.

However, with the introduction of ParametricExpressions, full support for autodiff-within-Optim.jl was needed. v1 includes support for some parts of DifferentiationInterface.jl, allowing you to actually turn on various automatic differentiation backends when optimizing constants. For example, you can use

Options(
    autodiff_backend=:Zygote,
)

to use Zygote.jl for autodiff during BFGS optimization, or even

Options(
    autodiff_backend=:Enzyme,
)

for Enzyme.jl (though Enzyme support is highly experimental).

Other Changes

- Implement tree rotation operator by @MilesCranmer in #348
  - This seems to help search performance overall. The new mutation is available as rotate_tree in the mutation weights, with a default weight of 0.3.
- Avoid Base.sleep by @MilesCranmer in #305
- CompatHelper: bump compat for MLJModelInterface to 1, (keep existing compat) by @github-actions in #328
- fix typos by @spaette in #331
- chore(deps): bump peter-evans/create-pull-request from 6 to 7 by @dependabot in #343

New Contributors

@spaette made their first contribution in #331
Thanks to @larsentom for the mutation idea

Full Changelog: v0.24.5...v1.0.0-beta1

Contributors

MilesCranmer, dependabot, and 2 other contributors
