Nonlinear Modeling
JuMP has support for nonlinear (convex and nonconvex) optimization problems. JuMP is able to automatically provide exact, sparse second-order derivatives to solvers. This information can improve solver accuracy and performance.
Set a nonlinear objective
Use @objective
to set a nonlinear objective.
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> @objective(model, Min, exp(x[1]) - sqrt(x[2]))
exp(x[1]) - sqrt(x[2])
To modify a nonlinear objective, call @objective
again.
Add a nonlinear constraint
Use @constraint
to add a nonlinear constraint.
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> @constraint(model, exp(x[1]) <= 1)
exp(x[1]) - 1.0 ≤ 0
julia> @constraint(model, con[i = 1:2], 2^x[i] >= i)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.GreaterThan{Float64}}, ScalarShape}}:
con[1] : (2.0 ^ x[1]) - 1.0 ≥ 0
con[2] : (2.0 ^ x[2]) - 2.0 ≥ 0
Delete a nonlinear constraint using delete
:
julia> delete(model, con[1])
Add a parameter
Some solvers have explicit support for parameters, which are constants in the model that can be efficiently updated between solves.
JuMP implements parameters by a decision variable constrained on creation to the Parameter
set.
julia> model = Model();
julia> @variable(model, x);
julia> @variable(model, p[i = 1:2] in Parameter(i))
2-element Vector{VariableRef}:
p[1]
p[2]
julia> parameter_value(p[1])
1.0
julia> set_parameter_value(p[1], 3.5)
julia> @objective(model, Max, log(p[1] * x + p[2]))
log(p[1]*x + p[2])
See Parameters for more information on how to create and manage parameters.
Parameters are most useful when solving nonlinear models in a sequence:
julia> using JuMP, Ipopt
julia> model = Model(Ipopt.Optimizer);
julia> set_silent(model)
julia> @variable(model, x)
x
julia> @variable(model, p in Parameter(1.0))
p
julia> @objective(model, Min, (x - p)^2)
x² - 2 p*x + p²
julia> optimize!(model)
julia> value(x)
1.0
julia> set_parameter_value(p, 5.0)
julia> optimize!(model)
julia> value(x)
5.0
Using parameters can be faster than creating a new model from scratch with updated data because JuMP is able to avoid repeating a number of steps in processing the model before handing it off to the solver.
Create a nonlinear expression
Use @expression
to create nonlinear expression objects:
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> expr = @expression(model, exp(x[1]) + sqrt(x[2]))
exp(x[1]) + sqrt(x[2])
julia> my_anon_expr = @expression(model, [i = 1:2], sin(x[i]))
2-element Vector{NonlinearExpr}:
sin(x[1])
sin(x[2])
julia> @expression(model, my_expr[i = 1:2], sin(x[i]))
2-element Vector{NonlinearExpr}:
sin(x[1])
sin(x[2])
A NonlinearExpr
can be used in @objective
, @constraint
, and even nested in other @expression
s.
julia> @objective(model, Min, expr^2 + 1)
((exp(x[1]) + sqrt(x[2])) ^ 2.0) + 1.0
julia> @constraint(model, [i = 1:2], my_expr[i] <= i)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.LessThan{Float64}}, ScalarShape}}:
sin(x[1]) - 1.0 ≤ 0
sin(x[2]) - 2.0 ≤ 0
julia> @expression(model, nested[i = 1:2], sin(my_expr[i]))
2-element Vector{NonlinearExpr}:
sin(sin(x[1]))
sin(sin(x[2]))
Use value
to query the value of a nonlinear expression:
julia> set_start_value(x[1], 1.0)
julia> value(start_value, nested[1])
0.7456241416655579
julia> sin(sin(1.0))
0.7456241416655579
Automatic differentiation
JuMP computes first- and second-order derivatives using sparse reverse-mode automatic differentiation. For details, see ReverseAD.
For a tutorial on how to construct and query the derivatives, see Computing Hessians
Nonlinear expressions in detail
Nonlinear expressions in JuMP are represented by a NonlinearExpr
object.
Constructors
Nonlinear expressions can be created using the NonlinearExpr
constructors:
julia> model = Model();
julia> @variable(model, x);
julia> expr = NonlinearExpr(:sin, Any[x])
sin(x)
or via operator overloading:
julia> model = Model();
julia> @variable(model, x);
julia> expr = sin(x)
sin(x)
Supported arguments
Nonlinear expressions can contain a mix of numbers, AffExpr
, QuadExpr
, and other NonlinearExpr
:
julia> model = Model();
julia> @variable(model, x);
julia> aff = x + 1;
julia> quad = x^2 + x;
julia> expr = cos(x) * sin(quad) + aff
(cos(x) * sin(x² + x)) + (x + 1)
Supported operators
The list of supported operators may vary between solvers. Given an optimizer, query the list of supported operators using MOI.ListOfSupportedNonlinearOperators
:
julia> import Ipopt
julia> import MathOptInterface as MOI
julia> MOI.get(Ipopt.Optimizer(), MOI.ListOfSupportedNonlinearOperators())
85-element Vector{Symbol}:
:+
:-
:abs
:sqrt
:cbrt
:abs2
:inv
:log
:log10
:log2
⋮
:min
:max
:&&
:||
:<=
:(==)
:>=
:<
:>
In some univariate cases, the operator is defined in SpecialFunctions.jl
. To use these functions, you must explicitly import SpecialFunctions.jl
julia> import Ipopt
julia> op = MOI.get(Ipopt.Optimizer(), MOI.ListOfSupportedNonlinearOperators());
julia> :erfcx in op
true
julia> :dawson in op
true
julia> import SpecialFunctions
julia> model = Model();
julia> @variable(model, x)
x
julia> @expression(model, SpecialFunctions.erfcx(x))
erfcx(x)
julia> @expression(model, SpecialFunctions.dawson(x))
dawson(x)
Limitations
Some nonlinear expressions cannot be created via operator overloading. For example, to minimize the likelihood of bugs in user-code, we have not overloaded comparisons such as <
and >=
between JuMP objects:
julia> model = Model();
julia> @variable(model, x);
julia> x < 1
ERROR: Cannot evaluate `<` between a variable and a number.
[...]
Instead, wrap the expression in the @expression
macro:
julia> model = Model();
julia> @variable(model, x);
julia> expr = @expression(model, x < 1)
x < 1
For technical reasons, other operators that are not overloaded include ||
, &&
, and ifelse
.
julia> model = Model();
julia> @variable(model, x);
julia> expr = @expression(model, ifelse(x < -1 || x >= 1, x^2, 0.0))
ifelse((x < -1) || (x >= 1), x², 0.0)
As an alternative, use the JuMP.op_
functions, which fallback to the various comparison and logical operators:
julia> model = Model();
julia> @variable(model, x);
julia> expr = op_ifelse(
op_or(op_strictly_less_than(x, -1), op_greater_than_or_equal_to(x, 1)),
x^2,
0.0,
)
ifelse((x < -1) || (x >= 1), x², 0.0)
The available functions are:
JuMP function | Julia function |
---|---|
op_ifelse | ifelse |
op_and | && |
op_or | || |
op_greater_than_or_equal_to | >= |
op_less_than_or_equal_to | <= |
op_equal_to | == |
op_strictly_greater_than | > |
op_strictly_less_than | < |
Fields
Each NonlinearExpr
has two fields.
The .head
field is a Symbol
that represents the operator being called:
julia> expr.head
:sin
The .args
field is a Vector{Any}
containing the arguments to the operator:
julia> expr.args
1-element Vector{Any}:
x
Forcing nonlinear expressions
The JuMP macros and operator overloading will preferentially build affine (GenericAffExpr
) and quadratic (GenericQuadExpr
) expressions instead of GenericNonlinearExpr
. For example:
julia> model = Model();
julia> @variable(model, x);
julia> f = (x - 0.1)^2
x² - 0.2 x + 0.010000000000000002
julia> typeof(f)
QuadExpr (alias for GenericQuadExpr{Float64, GenericVariableRef{Float64}})
To override this behavior, use the @force_nonlinear
macro:
julia> g = @force_nonlinear((x - 0.1)^2)
(x - 0.1) ^ 2
julia> typeof(g)
NonlinearExpr (alias for GenericNonlinearExpr{GenericVariableRef{Float64}})
Use this macro only if necessary. See the docstring of @force_nonlinear
for more details on when you should use it.
Function tracing
Nonlinear expressions can be constructed using function tracing. Function tracing is when you call a regular Julia function with JuMP variables as arguments and the function builds a nonlinear expression via operator overloading. For example:
julia> using JuMP
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> f(x::Vector{VariableRef}) = 2 * sin(x[1]^2) + sqrt(x[2])
f (generic function with 1 method)
julia> y = f(x)
(2.0 * sin(x[1]²)) + sqrt(x[2])
julia> typeof(y)
NonlinearExpr (alias for GenericNonlinearExpr{GenericVariableRef{Float64}})
julia> @objective(model, Max, f(x))
(2.0 * sin(x[1]²)) + sqrt(x[2])
Function tracing supports functions which return vectors or arrays of NonlinearExpr
:
julia> using JuMP
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> f(x::Vector{VariableRef}) = sqrt.(x)
f (generic function with 1 method)
julia> y = f(x)
2-element Vector{NonlinearExpr}: sqrt(x[1]) sqrt(x[2])
julia> typeof(y)
Vector{NonlinearExpr} (alias for Array{GenericNonlinearExpr{GenericVariableRef{Float64}}, 1})
julia> @constraint(model, f(x) .<= 2)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.LessThan{Float64}}, ScalarShape}}: sqrt(x[1]) - 2.0 ≤ 0 sqrt(x[2]) - 2.0 ≤ 0
julia> @objective(model, Max, sum(f(x)))
0.0 + sqrt(x[2]) + sqrt(x[1])
Because function tracing uses operator overloading, there are many functions for which it will not work. For example:
julia> using JuMP
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> f(x::Vector{VariableRef}) = x[1] > 1 ? 0 : x[2]
f (generic function with 1 method)
julia> f(x)
ERROR: Cannot evaluate `>` between a variable and a number.
[...]
In these cases, you should define a User-defined operator using the @operator
macro.
User-defined operators
In addition to a standard list of univariate and multivariate operators recognized by the MOI.Nonlinear
submodule, JuMP supports user-defined operators, which let you represent nonlinear functions that cannot (or should not) be traced, for example, because they rely on non-Julia subroutines.
User-defined operators must return a scalar output. For a work-around, see User-defined operators with vector outputs.
Add an operator
Add a user-defined operator using the @operator
macro:
julia> using JuMP
julia> square(x) = x^2
square (generic function with 1 method)
julia> f(x, y) = (x - 1)^2 + (y - 2)^2
f (generic function with 1 method)
julia> model = Model();
julia> @operator(model, op_square, 1, square)
NonlinearOperator(square, :op_square)
julia> @operator(model, op_f, 2, f)
NonlinearOperator(f, :op_f)
julia> @variable(model, x[1:2]);
julia> @objective(model, Min, op_f(x[1], op_square(x[2])))
op_f(x[1], op_square(x[2]))
The arguments to @operator
are:
- The model to which the operator is added.
- A Julia symbol object which serves as the name of the user-defined operator in JuMP expressions. This name must not be the same as that of the function.
- The number of scalar input arguments that the function takes.
- A Julia method which computes the function.
User-defined operators cannot be deleted.
You can obtain a reference to the operator using the model[:key]
syntax:
julia> using JuMP
julia> square(x) = x^2
square (generic function with 1 method)
julia> model = Model();
julia> @operator(model, op_square, 1, square)
NonlinearOperator(square, :op_square)
julia> op_square_2 = model[:op_square]
NonlinearOperator(square, :op_square)
Automatic differentiation
JuMP computes first- and second-order derivatives of expressions using ReverseAD, which implements sparse reverse-mode automatic differentiation. However, because ReverseAD requires the algebraic expression as input, JuMP cannot use ReverseAD to differentiate user-defined operators.
Instead, unless Gradients and Hessians are explicitly provided, user-defined operators must support automatic differentiation by ForwardDiff.jl.
The use of FowardDiff.jl has two important implications:
- ForwardDiff.jl supports only a limited subset of Julia. If you encounter an error adding the operator, see Common mistakes when writing a user-defined operator.
- Differentiating operators with many arguments is slow. In general, you should try to keep the number of arguments to less than 100, and ideally, to less than 10.
Because of the use of ForwardDiff, in most cases, you should prefer to use function tracing instead of defining a user-defined operator.
Add an operator without macros
The @operator
macro is syntactic sugar for add_nonlinear_operator
. Thus, the non-macro version of the preceding example is:
julia> using JuMP
julia> square(x) = x^2
square (generic function with 1 method)
julia> f(x, y) = (x - 1)^2 + (y - 2)^2
f (generic function with 1 method)
julia> model = Model();
julia> op_square = add_nonlinear_operator(model, 1, square; name = :op_square)
NonlinearOperator(square, :op_square)
julia> model[:op_square] = op_square
NonlinearOperator(square, :op_square)
julia> op_f = add_nonlinear_operator(model, 2, f; name = :op_f)
NonlinearOperator(f, :op_f)
julia> model[:op_f] = op_f
NonlinearOperator(f, :op_f)
julia> @variable(model, x[1:2]);
julia> @objective(model, Min, op_f(x[1], op_square(x[2])))
op_f(x[1], op_square(x[2]))
Operators with the same name as an existing function
A common error encountered is the following:
julia> using JuMP
julia> model = Model();
julia> f(x) = x^2
f (generic function with 1 method)
julia> @operator(model, f, 1, f)
ERROR: Unable to add the nonlinear operator `:f` with the same name as
an existing function.
[...]
This error occurs because @operator(model, f, 1, f)
is equivalent to:
julia> f = add_nonlinear_operator(model, 1, f; name = :f)
but f
already exists as a Julia function.
If you evaluate the function without adding it as an operator, JuMP will trace the function using operator overloading:
julia> @variable(model, x);
julia> f(x)
x²
To force JuMP to treat f
as a user-defined operator and not trace it, add the operator using add_nonlinear_operator
and define a new method which manually creates a NonlinearExpr
:
julia> _ = add_nonlinear_operator(model, 1, f; name = :f)
NonlinearOperator(f, :f)
julia> f(x::AbstractJuMPScalar) = NonlinearExpr(:f, Any[x])
f (generic function with 2 methods)
julia> @expression(model, log(f(x)))
log(f(x))
Gradients and Hessians
By default, JuMP will use automatic differentiation to compute the gradient and Hessian of user-defined operators. If your function is not amenable to the default automatic differentiation, or you can compute analytic derivatives, you may pass additional arguments to @operator
to compute the first- and second-derivatives.
The tutorial Automatic differentiation of user-defined operators has examples of how to use third-party Julia packages to compute automatic derivatives.
Univariate functions
For univariate functions, a gradient function ∇f
returns a number that represents the first-order derivative. You may, in addition, pass a third function which returns a number representing the second-order derivative:
julia> using JuMP
julia> f(x) = x^2
f (generic function with 1 method)
julia> ∇f(x) = 2x
∇f (generic function with 1 method)
julia> ∇²f(x) = 2
∇²f (generic function with 1 method)
julia> model = Model();
julia> @operator(model, op_f, 1, f, ∇f, ∇²f) # Providing ∇²f is optional
NonlinearOperator(f, :op_f)
julia> @variable(model, x)
x
julia> @objective(model, Min, op_f(x))
op_f(x)
Multivariate functions
For multivariate functions, the gradient function ∇f
must take an AbstractVector
as the first argument that is filled in-place. The Hessian function, ∇²f
, must take an AbstractMatrix
as the first argument, the lower-triangular of which is filled in-place:
julia> using JuMP
julia> f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
f (generic function with 1 method)
julia> function ∇f(g::AbstractVector{T}, x::T...) where {T} g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2 g[2] = 200 * (x[2] - x[1]^2) return end
∇f (generic function with 1 method)
julia> function ∇²f(H::AbstractMatrix{T}, x::T...) where {T} H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2 # H[1, 2] = -400 * x[1] <-- Not needed. Fill the lower-triangular only. H[2, 1] = -400 * x[1] H[2, 2] = 200.0 return end
∇²f (generic function with 1 method)
julia> model = Model();
julia> @operator(model, rosenbrock, 2, f, ∇f, ∇²f) # Providing ∇²f is optional
NonlinearOperator(f, :rosenbrock)
julia> @variable(model, x[1:2])
2-element Vector{VariableRef}: x[1] x[2]
julia> @objective(model, Min, rosenbrock(x[1], x[2]))
rosenbrock(x[1], x[2])
You may assume the Hessian matrix H
is initialized with zeros, and because H
is symmetric, you need only to fill in the non-zero lower-triangular terms. The matrix type passed in as H
depends on the automatic differentiation system, so make sure the first argument to the Hessian function supports an AbstractMatrix
(it may be something other than Matrix{Float64}
). Moreover, you may assume only that H
supports size(H)
and setindex!
. Finally, the matrix is treated as dense, so the performance will be poor on functions with high-dimensional input.
User-defined operators with vector inputs
User-defined operators which take vectors as input arguments (for example, f(x::Vector)
) are not supported. Instead, use Julia's splatting syntax to create a function with scalar arguments. For example, instead of:
f(x::Vector) = sum(x[i]^i for i in 1:length(x))
define:
f(x...) = sum(x[i]^i for i in 1:length(x))
Another approach is to define the splatted function as an anonymous function:
julia> using JuMP
julia> model = Model();
julia> @variable(model, x[1:5])
5-element Vector{VariableRef}: x[1] x[2] x[3] x[4] x[5]
julia> f(x::Vector) = sum(x[i]^i for i in 1:length(x))
f (generic function with 1 method)
julia> @operator(model, op_f, 5, (x...) -> f(collect(x)))
NonlinearOperator(#6, :op_f)
julia> @objective(model, Min, op_f(x...))
op_f(x[1], x[2], x[3], x[4], x[5])
If the operator takes several vector inputs, write a function that takes the splatted arguments and reconstructs the required vector inputs:
julia> using JuMP
julia> model = Model();
julia> @variable(model, x[1:2]);
julia> @variable(model, y[1:2]);
julia> @variable(model, z);
julia> f(x::Vector, y::Vector, z) = sum((x[i] * y[i])^z for i in 1:2)
f (generic function with 1 method)
julia> f(x, y, z)
((x[1]*y[1]) ^ z) + ((x[2]*y[2]) ^ z)
julia> f_splat(args...) = f(collect(args[1:2]), collect(args[3:4]), args[5])
f_splat (generic function with 1 method)
julia> f_splat(x..., y..., z)
((x[1]*y[1]) ^ z) + ((x[2]*y[2]) ^ z)
julia> @operator(model, op_f, 5, f_splat)
NonlinearOperator(f_splat, :op_f)
julia> @objective(model, Min, op_f(x..., y..., z))
op_f(x[1], x[2], y[1], y[2], z)
Common mistakes when writing a user-defined operator
JuMP uses ForwardDiff.jl to compute the first-order derivatives of user-defined operators. ForwardDiff has a number of limitations that you should be aware of when writing user-defined operators.
The rest of this section provides debugging advice and explains some common mistakes.
Get an error like No method matching Float64(::ForwardDiff.Dual)
? Read this section.
Debugging
If you add an operator that does not support ForwardDiff, a long error message will be printed. You can review the stacktrace for more information, but it can often be hard to understand why and where your function is failing.
It may be helpful to debug the operator outside of JuMP as follows.
If the operator is univariate, do:
julia> import ForwardDiff
julia> my_operator(a) = a^2
my_operator (generic function with 1 method)
julia> ForwardDiff.derivative(my_operator, 1.0)
2.0
If the operator is multivariate, do:
julia> import ForwardDiff
julia> my_operator(a, b) = a^2 + b^2
my_operator (generic function with 1 method)
julia> ForwardDiff.gradient(x -> my_operator(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
Note that even though the operator takes the splatted arguments, ForwardDiff.gradient
requires a vector as input.
Operator calls something unsupported by ForwardDiff
ForwardDiff works by overloading many Julia functions for a special type ForwardDiff.Dual <: Real
. If your operator attempts to call a function for which an overload has not been defined, a MethodError
will be thrown.
For example, your operator cannot call external C functions, or be the optimal objective value of a JuMP model.
julia> import ForwardDiff
julia> my_operator_bad(x) = @ccall sqrt(x::Cdouble)::Cdouble
my_operator_bad (generic function with 1 method)
julia> ForwardDiff.derivative(my_operator_bad, 1.0)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(my_operator_bad), Float64}, Float64, 1})
[...]
Unfortunately, the list of calls supported by ForwardDiff is too large to enumerate what is an isn't allowed, so the best advice is to try and see if it works.
Operator does not accept splatted input
The operator takes f(x::Vector)
as input, instead of the splatted f(x...)
.
julia> import ForwardDiff
julia> my_operator_bad(x::Vector) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)
julia> my_operator_good(x...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)
julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]
julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
Operator assumes Float64
as input
The operator assumes Float64
will be passed as input, but it must work for any generic Real
type.
julia> import ForwardDiff
julia> my_operator_bad(x::Float64...) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)
julia> my_operator_good(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)
julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]
julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0
Operator allocates Float64
storage
The operator allocates temporary storage using zeros(3)
or similar. This defaults to Float64
, so use zeros(T, 3)
instead.
julia> import ForwardDiff
julia> function my_operator_bad(x::Real...)
# This line is problematic. zeros(n) is short for zeros(Float64, n)
y = zeros(length(x))
for i in eachindex(x)
y[i] = x[i]^2
end
return sum(y)
end
my_operator_bad (generic function with 1 method)
julia> function my_operator_good(x::T...) where {T<:Real}
y = zeros(T, length(x))
for i in eachindex(x)
y[i] = x[i]^2
end
return sum(y)
end
my_operator_good (generic function with 1 method)
julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#1#2", Float64}, Float64, 2})
[...]
julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
2.0
4.0