-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Easier auto-differentiation support? #128
Comments
Another idea to throw out there. We have an abstraction for NLP solvers in MathOptInterface with multiple backends, e.g., Ipopt and Knitro. (NLopt has a wrapper for the old MathProgBase API but not yet MathOptInterface.) If you target your adapters to MathOptInterface instead of NLopt directly then you get multiple backends without much extra cost. |
@mlubin I really hope that NLopt gets up with the MOI soon so that I can swap out Ipopt when using juMP Maybe I am misunderstanding, but if you are suggesting that given MOI we would instead use it directly for calling the optimizer, I am not sure I see it as it currently stands. The complexity of MOI is necessary to drive a EDSL like JuMP, but it is very heavy for those of us that just want to call a nonlinear solver and pass in arbitrary functions of vectors. Perhaps JuMP will become a little more friendly for those who want nonlinear solvers without a DSL, or perhaps there is another interface built on MOI for people like me? Just so you know, the total code required to add Optim-style AD support to NLsolve is just using NLSolversBase
struct NLoptAdapter{T} <: Function where T <: AbstractObjective
nlsolver_base::T
end
# implement fg!; note that the order is reversed
(adapter::NLoptAdapter)(x, df) = adapter.nlsolver_base.fdf(df, x)
(adapter::NLoptAdapter)(result, x, jacobian_transpose) = adapter.nlsolver_base.fdf(result, jacobian_transpose', x)
# constructors
NLoptAdapter(f, x, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f, x, autodiff = autodiff))
NLoptAdapter(f!, x::Vector, F::Vector, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f!, x, F, autodiff = autodiff)) |
Something along these lines seems reasonable to me. |
Great @arnavs will prepare a full PR for this with the necessary tests, documentation, etc. At this point, we think that this will require no direct changes to the NLopt code itself (though you may have ideas of how it could be reorganized). By using NLSolversBase as a dependency, we may be able to hook in other AD implementations if they are added to that library (@arnavs may also try to add Flux.jl or Zygote.jl support to NLSolversBase in a separate issue). (1) We don't add in exported wrapper type and instead add in arguments to the functions themselves? i.e. add a method shadowing the existing ones so it would look like: # A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M)
f(x) = sum((x .- m).^2) + sum(log.(x))
# define the optimization problem
opt = Opt(:LD_LBFGS, M)
min_objective!(opt, f, autodiff=:forward) The list of additional methods would be something like min_objective!(opt::Opt, f::Function, autodiff)
max_objective!(opt::Opt, f::Function, autodiff)
inequality_constraint!(opt::Opt, fc, tol=0, autodiff)
equality_constraint!(opt::Opt, h, tol=0, autodiff)
inequality_constraint!(opt::Opt, c, tol::AbstractVector, autodiff)
equality_constraint!(opt::Opt, c, tol::AbstractVector, autodiff) where (2) Alternatively, the less invasive one is to have a pubically exposed wrapper: # A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M)
f(x) = sum((x .- m).^2) + sum(log.(x))
# define the optimization problem
opt = Opt(:LD_LBFGS, M)
f_opt = NLoptDifferentiable(f, x0) # The only new code!
min_objective!(opt, f_opt) For the name of the adapter:
An alternative is that we don't add in exported wrapper type and instead add in arguments to the functions themselves? i.e. add a method @stevengj If you have no strong opinions, I think we should first try to add it directly in the interfaces on those functions instead of wrappers? That would be the easiest for end-users. |
@arnavs: A few notes:
|
This thread is moving in the direction of adding direct support for AD to NLopt which is fine with me. But anyway to address the point on MOI,
Yes, I was suggesting to call the optimizers via MOI. Yes, there are a few more lines of code required, but if you're already in the business of designing APIs for nonlinear optimization (as discussed in this thread), then I don't expect it to be a significant challenge. There are a number of people familiar with the MOI API who can answer questions at https://gitter.im/JuliaOpt/JuMP-dev. I also have an old demo notebook of connecting Newton's method to MathProgBase: https://nbviewer.jupyter.org/github/JuliaOpt/juliaopt-notebooks/blob/master/notebooks/MathProgBase%20Newton%20solver.ipynb. The API for interacting with derivatives is mostly unchanged from MPB to MOI. The main extra things that MOI supports over the Optim/NLopt-type interfaces are:
I understand that a lot of use cases for nonlinear optimization don't need to think about sparse hessian matrices or linear constraints. Ultimately it's up to you to decide if the extra complexity of the abstraction is worth it.
The JuliaSmoothOptimizers organization has built a good amount of infrastructure that could be useful. As far as I know it hasn't been updated for MOI yet though. In any case, the best way for a simpler NLP interface to be built on top of MOI is for someone invested in the result to write the code. It's a question of a couple days, not weeks or months. |
We can discuss further in slack to see if a NLP interface based on MOI is possible and if there are people who could help design it. I am not sure I have access to people with sufficient expertise (BTW, you had previously mentioned JuliaSmoothOptimizers to me, but I never was able to navigate it to see the interface they were proposing). |
Rather than Then you are just adding a new |
something along these lines? (i use this to add forward differenciation support in my code)
using that as a base,
the problem to adding this to NLsolversBase is that the check for the existence of the gradient is different. here checks for length, Optim checks for nothing) |
an improved version, works with the future NLsolvers and NLopt
|
Just as an update here, I found the above to not be working (perhaps it did when posted). Here is what I found to work: function autodiff(fn)
# adapted from here https://github.com/JuliaOpt/NLopt.jl/issues/128
function f(x)
return fn(x)
end
function f(x,∇f)
if !(∇f == nothing) && (length(∇f) != 0)
ForwardDiff.gradient!(∇f,fn,x)
end
fx = fn(x)
return fx
end
return f
end |
I use this in practice: import AbstractDifferentiation as AD
function gradient_on_demand(ad_backend, f)
function(x, ∇fx)
if isempty(∇fx)
f(x)
else
v, (g,) = AD.value_and_gradient(ad_backend, f, x)
copy!(∇fx, g)
v
end
end
end Example: using NLopt, ForwardDiff
ab = AD.ForwardDiffBackend()
opt = Opt(:LD_SLSQP, 3)
equality_constraint!(opt, gradient_on_demand(ab, x -> sum(x) - 3), 1e-4)
opt.min_objective = gradient_on_demand(ab, x -> sum(abs2, x))
(minf, minx, ret) = optimize(opt, [-1.0, 1.5, 9.9]) Would be happy to make a PR, AbstractDifferentiation is a not a heavy dependency (per se). And Enzyme support is WIP. |
I'm not convinced that NLopt.jl needs AD, but I can see why this is useful for users. The Julia ecosystem has still not settled on the-one-AD-system-to-rule-them-all, and prior to that, it seems premature to add one to NLopt. (Especially since we're now v1.0.3, so any new dependency needs to be backwards compatible for the foreseeable future.) Perhaps all this needs is some documentation on how to use the different AD systems with NLopt? (Similar to https://jump.dev/JuMP.jl/stable/tutorials/nonlinear/operator_ad/) |
I have added a section to the README to document how to use third-party AD packages: This seems sufficient. We don't need to complicate NLopt. |
Right now you need to manually setup auto-differentiation for functions which is more cumbersome than others (e.g. Optim or JuMP directly). Would you be willing to entertain a PR to make this a little easier to work with?
You can see https://github.com/ubcecon/computing_and_datascience/blob/master/julia_tutorials/nlopt/nlopt-tutorial.ipynb for an example of what it might look like. Basically, you could just call an adapter on functions and pass that object directly into your existing functions.
The downside is that this would require adding https://github.com/JuliaNLSolvers/NLSolversBase.jl as a dependency with the following sub-dependencies: https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/REQUIRE
The text was updated successfully, but these errors were encountered: