Nonlinear Modeling

JuMP has support for nonlinear (convex and nonconvex) optimization problems. JuMP is able to automatically provide exact, sparse second-order derivatives to solvers. This information can improve solver accuracy and performance.

Set a nonlinear objective

Use @objective to set a nonlinear objective.

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> @objective(model, Min, exp(x[1]) - sqrt(x[2]))
exp(x[1]) - sqrt(x[2])

To modify a nonlinear objective, call @objective again.
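
As a minimal sketch of this replacement behavior (the second objective is an arbitrary illustration, not taken from this guide), the following script sets an objective, overwrites it, and then queries the model to confirm that only the most recent objective and sense are kept:

```julia
using JuMP

model = Model()
@variable(model, x[1:2])

# First objective.
@objective(model, Min, exp(x[1]) - sqrt(x[2]))

# Calling @objective again replaces both the function and the sense.
@objective(model, Max, log(x[1]) + cos(x[2]))

sense = objective_sense(model)
current = objective_function(model)
```

Here `sense` should report the maximization sense and `current` should be the second expression.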

Add a nonlinear constraint

Use @constraint to add a nonlinear constraint.

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> @constraint(model, exp(x[1]) <= 1)
exp(x[1]) - 1.0 ≤ 0

julia> @constraint(model, con[i = 1:2], 2^x[i] >= i)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.GreaterThan{Float64}}, ScalarShape}}:
 con[1] : (2.0 ^ x[1]) - 1.0 ≥ 0
 con[2] : (2.0 ^ x[2]) - 2.0 ≥ 0

Delete a nonlinear constraint using delete:

julia> delete(model, con[1])
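
To confirm that a deletion took effect, one option is to query is_valid on the constraint references. The sketch below rebuilds the two constraints from the example above; the variable names deleted_is_valid and kept_is_valid are illustrative only:

```julia
using JuMP

model = Model()
@variable(model, x[1:2])
@constraint(model, con[i = 1:2], 2^x[i] >= i)

delete(model, con[1])

# The deleted reference is no longer valid; the other constraint is untouched.
deleted_is_valid = is_valid(model, con[1])
kept_is_valid = is_valid(model, con[2])
```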

Add a parameter

Some solvers have explicit support for parameters, which are constants in the model that can be efficiently updated between solves.

JuMP implements a parameter as a decision variable that is constrained, on creation, to the Parameter set.

julia> model = Model();

julia> @variable(model, x);

julia> @variable(model, p[i = 1:2] in Parameter(i))
2-element Vector{VariableRef}:
 p[1]
 p[2]

julia> parameter_value(p[1])
1.0

julia> set_parameter_value(p[1], 3.5)

julia> @objective(model, Max, log(p[1] * x + p[2]))
log(p[1]*x + p[2])

See Parameters for more information on how to create and manage parameters.

Parameters are most useful when solving nonlinear models in a sequence:

julia> using JuMP, Ipopt

julia> model = Model(Ipopt.Optimizer);

julia> set_silent(model)

julia> @variable(model, x)
x

julia> @variable(model, p in Parameter(1.0))
p

julia> @objective(model, Min, (x - p)^2)
x² - 2 p*x + p²

julia> optimize!(model)

julia> value(x)
1.0

julia> set_parameter_value(p, 5.0)

julia> optimize!(model)

julia> value(x)
5.0

Using parameters can be faster than creating a new model from scratch with updated data because JuMP is able to avoid repeating a number of steps in processing the model before handing it off to the solver.
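
A sequence of solves is naturally written as a loop. The following sketch assumes Ipopt is installed and reuses the model from the example above; the list of targets and the solutions vector are illustrative names, not part of JuMP:

```julia
using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, x)
@variable(model, p in Parameter(1.0))
@objective(model, Min, (x - p)^2)

# Re-solve the same model for several parameter values without rebuilding it.
targets = [1.0, 2.5, 5.0]
solutions = Float64[]
for target in targets
    set_parameter_value(p, target)
    optimize!(model)
    push!(solutions, value(x))
end
```

Each entry of solutions should be close to the corresponding entry of targets, because the minimizer of (x - p)^2 is x = p.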

Create a nonlinear expression

Use @expression to create nonlinear expression objects:

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> expr = @expression(model, exp(x[1]) + sqrt(x[2]))
exp(x[1]) + sqrt(x[2])

julia> my_anon_expr = @expression(model, [i = 1:2], sin(x[i]))
2-element Vector{NonlinearExpr}:
 sin(x[1])
 sin(x[2])

julia> @expression(model, my_expr[i = 1:2], sin(x[i]))
2-element Vector{NonlinearExpr}:
 sin(x[1])
 sin(x[2])

A NonlinearExpr can be used in @objective, @constraint, and even nested in other @expressions.

julia> @objective(model, Min, expr^2 + 1)
((exp(x[1]) + sqrt(x[2])) ^ 2.0) + 1.0

julia> @constraint(model, [i = 1:2], my_expr[i] <= i)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.LessThan{Float64}}, ScalarShape}}:
 sin(x[1]) - 1.0 ≤ 0
 sin(x[2]) - 2.0 ≤ 0

julia> @expression(model, nested[i = 1:2], sin(my_expr[i]))
2-element Vector{NonlinearExpr}:
 sin(sin(x[1]))
 sin(sin(x[2]))

Use value to query the value of a nonlinear expression:

julia> set_start_value(x[1], 1.0)

julia> value(start_value, nested[1])
0.7456241416655579

julia> sin(sin(1.0))
0.7456241416655579

Automatic differentiation

JuMP computes first- and second-order derivatives using sparse reverse-mode automatic differentiation. For details, see ReverseAD.

For a tutorial on how to construct and query the derivatives, see Computing Hessians.

Nonlinear expressions in detail

Nonlinear expressions in JuMP are represented by a NonlinearExpr object.

Constructors

Nonlinear expressions can be created using the NonlinearExpr constructors:

julia> model = Model();

julia> @variable(model, x);

julia> expr = NonlinearExpr(:sin, Any[x])
sin(x)

or via operator overloading:

julia> model = Model();

julia> @variable(model, x);

julia> expr = sin(x)
sin(x)

Supported arguments

Nonlinear expressions can contain a mix of numbers, AffExpr, QuadExpr, and other NonlinearExpr:

julia> model = Model();

julia> @variable(model, x);

julia> aff = x + 1;

julia> quad = x^2 + x;

julia> expr = cos(x) * sin(quad) + aff
(cos(x) * sin(x² + x)) + (x + 1)

Supported operators

The list of supported operators may vary between solvers. Given an optimizer, query the list of supported operators using MOI.ListOfSupportedNonlinearOperators:

julia> import Ipopt

julia> import MathOptInterface as MOI

julia> MOI.get(Ipopt.Optimizer(), MOI.ListOfSupportedNonlinearOperators())
85-element Vector{Symbol}:
 :+
 :-
 :abs
 :sqrt
 :cbrt
 :abs2
 :inv
 :log
 :log10
 :log2
 ⋮
 :min
 :max
 :&&
 :||
 :<=
 :(==)
 :>=
 :<
 :>

In some univariate cases, the operator is defined in SpecialFunctions.jl. To use these functions, you must explicitly import SpecialFunctions.jl:

julia> import Ipopt

julia> op = MOI.get(Ipopt.Optimizer(), MOI.ListOfSupportedNonlinearOperators());

julia> :erfcx in op
true

julia> :dawson in op
true

julia> import SpecialFunctions

julia> model = Model();

julia> @variable(model, x)
x

julia> @expression(model, SpecialFunctions.erfcx(x))
erfcx(x)

julia> @expression(model, SpecialFunctions.dawson(x))
dawson(x)

Limitations

Some nonlinear expressions cannot be created via operator overloading. For example, to minimize the likelihood of bugs in user code, we have not overloaded comparisons such as < and >= between JuMP objects:

julia> model = Model();

julia> @variable(model, x);

julia> x < 1
ERROR: Cannot evaluate `<` between a variable and a number.
[...]

Instead, wrap the expression in the @expression macro:

julia> model = Model();

julia> @variable(model, x);

julia> expr = @expression(model, x < 1)
x < 1

For technical reasons, other operators that are not overloaded include ||, &&, and ifelse.

julia> model = Model();

julia> @variable(model, x);

julia> expr = @expression(model, ifelse(x < -1 || x >= 1, x^2, 0.0))
ifelse((x < -1) || (x >= 1), x², 0.0)

As an alternative, use the JuMP.op_ functions, which fall back to the various comparison and logical operators:

julia> model = Model();

julia> @variable(model, x);

julia> expr = op_ifelse(
           op_or(op_strictly_less_than(x, -1), op_greater_than_or_equal_to(x, 1)),
           x^2,
           0.0,
       )
ifelse((x < -1) || (x >= 1), x², 0.0)

The available functions are:

JuMP function                  Julia function
op_ifelse                      ifelse
op_and                         &&
op_or                          ||
op_greater_than_or_equal_to    >=
op_less_than_or_equal_to       <=
op_equal_to                    ==
op_strictly_greater_than       >
op_strictly_less_than          <

Fields

Each NonlinearExpr has two fields.

The .head field is a Symbol that represents the operator being called:

julia> expr.head
:sin

The .args field is a Vector{Any} containing the arguments to the operator:

julia> expr.args
1-element Vector{Any}:
 x
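
Because the arguments can themselves be NonlinearExpr objects, these two fields are enough to walk an expression tree. The sketch below uses a hypothetical helper, operator_heads, which is not part of JuMP, to collect the operators that appear in an expression built by operator overloading:

```julia
using JuMP

# Hypothetical helper (not part of JuMP): recursively collect the `.head`
# of `expr` and of every nested `NonlinearExpr` found in `expr.args`.
function operator_heads(expr::NonlinearExpr)
    heads = Symbol[expr.head]
    for arg in expr.args
        if arg isa NonlinearExpr
            append!(heads, operator_heads(arg))
        end
    end
    return heads
end

model = Model()
@variable(model, x)
expr = sin(x)^2 + cos(x)
heads = operator_heads(expr)
```

For this expression, heads should contain :+, :^, :sin, and :cos. Arguments that are numbers or variables are skipped because they have no .head field.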

Function tracing

Nonlinear expressions can be constructed using function tracing. Function tracing is when you call a regular Julia function with JuMP variables as arguments and the function builds a nonlinear expression via operator overloading. For example:

julia> using JuMP

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> f(x::Vector{VariableRef}) = 2 * sin(x[1]^2) + sqrt(x[2])
f (generic function with 1 method)

julia> y = f(x)
(2.0 * sin(x[1]²)) + sqrt(x[2])

julia> typeof(y)
NonlinearExpr (alias for GenericNonlinearExpr{GenericVariableRef{Float64}})

julia> @objective(model, Max, f(x))
(2.0 * sin(x[1]²)) + sqrt(x[2])

Function tracing supports functions which return vectors or arrays of NonlinearExpr:

julia> using JuMP

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> f(x::Vector{VariableRef}) = sqrt.(x)
f (generic function with 1 method)

julia> y = f(x)
2-element Vector{NonlinearExpr}:
 sqrt(x[1])
 sqrt(x[2])

julia> typeof(y)
Vector{NonlinearExpr} (alias for Array{GenericNonlinearExpr{GenericVariableRef{Float64}}, 1})

julia> @constraint(model, f(x) .<= 2)
2-element Vector{ConstraintRef{Model, MathOptInterface.ConstraintIndex{MathOptInterface.ScalarNonlinearFunction, MathOptInterface.LessThan{Float64}}, ScalarShape}}:
 sqrt(x[1]) - 2.0 ≤ 0
 sqrt(x[2]) - 2.0 ≤ 0

julia> @objective(model, Max, sum(f(x)))
0.0 + sqrt(x[2]) + sqrt(x[1])

Because function tracing uses operator overloading, there are many functions for which it will not work. For example:

julia> using JuMP

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> f(x::Vector{VariableRef}) = x[1] > 1 ? 0 : x[2]
f (generic function with 1 method)

julia> f(x)
ERROR: Cannot evaluate `>` between a variable and a number.
[...]

In these cases, you should define a User-defined operator using the @operator macro.
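
As a rough sketch of that workaround, the branching function can be written with scalar arguments and registered under a new name. The names g and op_g are illustrative choices, and the use of zero(b) to keep the return type consistent is an assumption of this sketch rather than a JuMP requirement:

```julia
using JuMP

# A branching function: the `>` comparison works on numbers, but it cannot
# be traced with JuMP variables.
g(a, b) = a > 1 ? zero(b) : b

model = Model()
@variable(model, x[1:2])

# Register `g` as a user-defined operator under a different name.
@operator(model, op_g, 2, g)

# JuMP stores an opaque call to `op_g` instead of tracing through `g`.
expr = @expression(model, op_g(x[1], x[2]))
```

Note that g is discontinuous at a = 1, which may cause difficulty for derivative-based solvers; the sketch shows only the registration mechanics.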

User-defined operators

In addition to a standard list of univariate and multivariate operators recognized by the MOI.Nonlinear submodule, JuMP supports user-defined operators, which let you represent nonlinear functions that cannot (or should not) be traced, for example, because they rely on non-Julia subroutines.

Warning

User-defined operators must return a scalar output. For a work-around, see User-defined operators with vector outputs.

Add an operator

Add a user-defined operator using the @operator macro:

julia> using JuMP

julia> square(x) = x^2
square (generic function with 1 method)

julia> f(x, y) = (x - 1)^2 + (y - 2)^2
f (generic function with 1 method)

julia> model = Model();

julia> @operator(model, op_square, 1, square)
NonlinearOperator(square, :op_square)

julia> @operator(model, op_f, 2, f)
NonlinearOperator(f, :op_f)

julia> @variable(model, x[1:2]);

julia> @objective(model, Min, op_f(x[1], op_square(x[2])))
op_f(x[1], op_square(x[2]))

The arguments to @operator are:

  1. The model to which the operator is added.
  2. A Julia symbol object which serves as the name of the user-defined operator in JuMP expressions. This name must not be the same as that of the function.
  3. The number of scalar input arguments that the function takes.
  4. A Julia method which computes the function.

Warning

User-defined operators cannot be deleted.

You can obtain a reference to the operator using the model[:key] syntax:

julia> using JuMP

julia> square(x) = x^2
square (generic function with 1 method)

julia> model = Model();

julia> @operator(model, op_square, 1, square)
NonlinearOperator(square, :op_square)

julia> op_square_2 = model[:op_square]
NonlinearOperator(square, :op_square)

Automatic differentiation

JuMP computes first- and second-order derivatives of expressions using ReverseAD, which implements sparse reverse-mode automatic differentiation. However, because ReverseAD requires the algebraic expression as input, JuMP cannot use ReverseAD to differentiate user-defined operators.

Instead, unless Gradients and Hessians are explicitly provided, user-defined operators must support automatic differentiation by ForwardDiff.jl.

The use of ForwardDiff.jl has two important implications:

  1. ForwardDiff.jl supports only a limited subset of Julia. If you encounter an error adding the operator, see Common mistakes when writing a user-defined operator.
  2. Differentiating operators with many arguments is slow. In general, you should try to keep the number of arguments below 100, and ideally, below 10.

Because of the use of ForwardDiff, in most cases, you should prefer to use function tracing instead of defining a user-defined operator.
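
To make the difference concrete, the sketch below builds the same function both ways and inspects the resulting expressions. The function h and the operator name op_h are illustrative and do not appear elsewhere in this guide:

```julia
using JuMP

h(a, b) = sin(a) * exp(b)

model = Model()
@variable(model, x)
@variable(model, y)

# Function tracing: JuMP sees the full algebraic expression.
traced = h(x, y)

# User-defined operator: JuMP sees only an opaque call to `op_h`.
@operator(model, op_h, 2, h)
opaque = op_h(x, y)
```

The traced expression has :* as its head, with sin(x) and exp(y) as arguments, so its algebraic structure is available for differentiation. The opaque expression has :op_h as its head and only the two variables as arguments.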

Add an operator without macros

The @operator macro is syntactic sugar for add_nonlinear_operator. Thus, the non-macro version of the preceding example is:

julia> using JuMP

julia> square(x) = x^2
square (generic function with 1 method)

julia> f(x, y) = (x - 1)^2 + (y - 2)^2
f (generic function with 1 method)

julia> model = Model();

julia> op_square = add_nonlinear_operator(model, 1, square; name = :op_square)
NonlinearOperator(square, :op_square)

julia> model[:op_square] = op_square
NonlinearOperator(square, :op_square)

julia> op_f = add_nonlinear_operator(model, 2, f; name = :op_f)
NonlinearOperator(f, :op_f)

julia> model[:op_f] = op_f
NonlinearOperator(f, :op_f)

julia> @variable(model, x[1:2]);

julia> @objective(model, Min, op_f(x[1], op_square(x[2])))
op_f(x[1], op_square(x[2]))

Operators with the same name as an existing function

A common error encountered is the following:

julia> using JuMP

julia> model = Model();

julia> f(x) = x^2
f (generic function with 1 method)

julia> @operator(model, f, 1, f)
ERROR: Unable to add the nonlinear operator `:f` with the same name as
an existing function.
[...]

This error occurs because @operator(model, f, 1, f) is equivalent to:

julia> f = add_nonlinear_operator(model, 1, f; name = :f)

but f already exists as a Julia function.

If you evaluate the function without adding it as an operator, JuMP will trace the function using operator overloading:

julia> @variable(model, x);

julia> f(x)
x²

To force JuMP to treat f as a user-defined operator and not trace it, add the operator using add_nonlinear_operator and define a new method which manually creates a NonlinearExpr:

julia> _ = add_nonlinear_operator(model, 1, f; name = :f)
NonlinearOperator(f, :f)

julia> f(x::AbstractJuMPScalar) = NonlinearExpr(:f, Any[x])
f (generic function with 2 methods)

julia> @expression(model, log(f(x)))
log(f(x))

Gradients and Hessians

By default, JuMP will use automatic differentiation to compute the gradient and Hessian of user-defined operators. If your function is not amenable to the default automatic differentiation, or you can compute analytic derivatives, you may pass additional arguments to @operator to compute the first- and second-derivatives.

Tip

The tutorial Automatic differentiation of user-defined operators has examples of how to use third-party Julia packages to compute automatic derivatives.

Univariate functions

For univariate functions, a gradient function ∇f returns a number that represents the first-order derivative. You may, in addition, pass a third function which returns a number representing the second-order derivative:

julia> using JuMP

julia> f(x) = x^2
f (generic function with 1 method)

julia> ∇f(x) = 2x
∇f (generic function with 1 method)

julia> ∇²f(x) = 2
∇²f (generic function with 1 method)

julia> model = Model();

julia> @operator(model, op_f, 1, f, ∇f, ∇²f)  # Providing ∇²f is optional
NonlinearOperator(f, :op_f)

julia> @variable(model, x)
x

julia> @objective(model, Min, op_f(x))
op_f(x)

Multivariate functions

For multivariate functions, the gradient function ∇f must take an AbstractVector as the first argument, which is filled in-place. The Hessian function, ∇²f, must take an AbstractMatrix as the first argument, the lower-triangular part of which is filled in-place:

julia> using JuMP

julia> f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
f (generic function with 1 method)

julia> function ∇f(g::AbstractVector{T}, x::T...) where {T}
           g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
           g[2] = 200 * (x[2] - x[1]^2)
           return
       end
∇f (generic function with 1 method)

julia> function ∇²f(H::AbstractMatrix{T}, x::T...) where {T}
           H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2
           # H[1, 2] = -400 * x[1]  <-- Not needed. Fill the lower-triangular only.
           H[2, 1] = -400 * x[1]
           H[2, 2] = 200.0
           return
       end
∇²f (generic function with 1 method)

julia> model = Model();

julia> @operator(model, rosenbrock, 2, f, ∇f, ∇²f)  # Providing ∇²f is optional
NonlinearOperator(f, :rosenbrock)

julia> @variable(model, x[1:2])
2-element Vector{VariableRef}:
 x[1]
 x[2]

julia> @objective(model, Min, rosenbrock(x[1], x[2]))
rosenbrock(x[1], x[2])

You may assume the Hessian matrix H is initialized with zeros, and because H is symmetric, you need only to fill in the non-zero lower-triangular terms. The matrix type passed in as H depends on the automatic differentiation system, so make sure the first argument to the Hessian function supports an AbstractMatrix (it may be something other than Matrix{Float64}). Moreover, you may assume only that H supports size(H) and setindex!. Finally, the matrix is treated as dense, so the performance will be poor on functions with high-dimensional input.
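
Hand-written derivatives are easy to get wrong, so it can be worth checking them numerically before passing them to @operator. The following sketch compares the analytic Rosenbrock gradient from the example above against a central finite difference. The helper finite_difference_gradient, the test point x0, and the step size h are assumptions of this sketch, not part of JuMP:

```julia
f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function ∇f(g::AbstractVector{T}, x::T...) where {T}
    g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
    g[2] = 200 * (x[2] - x[1]^2)
    return
end

# Hypothetical helper: approximate the gradient of a splatted function `f`
# at `x` using central differences with step size `h`.
function finite_difference_gradient(f, x::Vector{Float64}; h = 1e-6)
    g = zeros(length(x))
    for i in eachindex(x)
        step = zeros(length(x))
        step[i] = h
        g[i] = (f((x .+ step)...) - f((x .- step)...)) / (2 * h)
    end
    return g
end

x0 = [0.5, -1.2]
analytic = zeros(2)
∇f(analytic, x0...)
numeric = finite_difference_gradient(f, x0)
max_error = maximum(abs.(analytic .- numeric))
```

A small max_error suggests the analytic gradient is consistent with f at x0; a large value usually points to a sign or indexing mistake. The same idea extends to the Hessian by differencing the gradient.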

User-defined operators with vector inputs

User-defined operators which take vectors as input arguments (for example, f(x::Vector)) are not supported. Instead, use Julia's splatting syntax to create a function with scalar arguments. For example, instead of:

f(x::Vector) = sum(x[i]^i for i in 1:length(x))

define:

f(x...) = sum(x[i]^i for i in 1:length(x))

Another approach is to define the splatted function as an anonymous function:

julia> using JuMP

julia> model = Model();

julia> @variable(model, x[1:5])
5-element Vector{VariableRef}:
 x[1]
 x[2]
 x[3]
 x[4]
 x[5]

julia> f(x::Vector) = sum(x[i]^i for i in 1:length(x))
f (generic function with 1 method)

julia> @operator(model, op_f, 5, (x...) -> f(collect(x)))
NonlinearOperator(#6, :op_f)

julia> @objective(model, Min, op_f(x...))
op_f(x[1], x[2], x[3], x[4], x[5])

If the operator takes several vector inputs, write a function that takes the splatted arguments and reconstructs the required vector inputs:

julia> using JuMP

julia> model = Model();

julia> @variable(model, x[1:2]);

julia> @variable(model, y[1:2]);

julia> @variable(model, z);

julia> f(x::Vector, y::Vector, z) = sum((x[i] * y[i])^z for i in 1:2)
f (generic function with 1 method)

julia> f(x, y, z)
((x[1]*y[1]) ^ z) + ((x[2]*y[2]) ^ z)

julia> f_splat(args...) = f(collect(args[1:2]), collect(args[3:4]), args[5])
f_splat (generic function with 1 method)

julia> f_splat(x..., y..., z)
((x[1]*y[1]) ^ z) + ((x[2]*y[2]) ^ z)

julia> @operator(model, op_f, 5, f_splat)
NonlinearOperator(f_splat, :op_f)

julia> @objective(model, Min, op_f(x..., y..., z))
op_f(x[1], x[2], y[1], y[2], z)

Common mistakes when writing a user-defined operator

JuMP uses ForwardDiff.jl to compute the first-order derivatives of user-defined operators. ForwardDiff has a number of limitations that you should be aware of when writing user-defined operators.

The rest of this section provides debugging advice and explains some common mistakes.

Warning

Get an error like No method matching Float64(::ForwardDiff.Dual)? Read this section.

Debugging

If you add an operator that does not support ForwardDiff, a long error message will be printed. You can review the stacktrace for more information, but it can often be hard to understand why and where your function is failing.

It may be helpful to debug the operator outside of JuMP as follows.

If the operator is univariate, do:

julia> import ForwardDiff

julia> my_operator(a) = a^2
my_operator (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator, 1.0)
2.0

If the operator is multivariate, do:

julia> import ForwardDiff

julia> my_operator(a, b) = a^2 + b^2
my_operator (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0

Note that even though the operator takes the splatted arguments, ForwardDiff.gradient requires a vector as input.

Operator calls something unsupported by ForwardDiff

ForwardDiff works by overloading many Julia functions for a special type ForwardDiff.Dual <: Real. If your operator attempts to call a function for which an overload has not been defined, a MethodError will be thrown.

For example, your operator cannot call external C functions, or be the optimal objective value of a JuMP model.

julia> import ForwardDiff

julia> my_operator_bad(x) = @ccall sqrt(x::Cdouble)::Cdouble
my_operator_bad (generic function with 1 method)

julia> ForwardDiff.derivative(my_operator_bad, 1.0)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(my_operator_bad), Float64}, Float64, 1})
[...]

Unfortunately, the list of calls supported by ForwardDiff is too large to enumerate what is and isn't allowed, so the best advice is to try it and see if it works.

Operator does not accept splatted input

The operator takes f(x::Vector) as input, instead of the splatted f(x...).

julia> import ForwardDiff

julia> my_operator_bad(x::Vector) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0

Operator assumes Float64 as input

The operator assumes Float64 will be passed as input, but it must work for any generic Real type.

julia> import ForwardDiff

julia> my_operator_bad(x::Float64...) = sum(x[i]^2 for i in eachindex(x))
my_operator_bad (generic function with 1 method)

julia> my_operator_good(x::Real...) = sum(x[i]^2 for i in eachindex(x))
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching my_operator_bad(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0

Operator allocates Float64 storage

The operator allocates temporary storage using zeros(3) or similar. This defaults to Float64, so use zeros(T, 3) instead.

julia> import ForwardDiff

julia> function my_operator_bad(x::Real...)
           # This line is problematic. zeros(n) is short for zeros(Float64, n)
           y = zeros(length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_bad (generic function with 1 method)

julia> function my_operator_good(x::T...) where {T<:Real}
           y = zeros(T, length(x))
           for i in eachindex(x)
               y[i] = x[i]^2
           end
           return sum(y)
       end
my_operator_good (generic function with 1 method)

julia> ForwardDiff.gradient(x -> my_operator_bad(x...), [1.0, 2.0])
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#1#2", Float64}, Float64, 2})
[...]

julia> ForwardDiff.gradient(x -> my_operator_good(x...), [1.0, 2.0])
2-element Vector{Float64}:
 2.0
 4.0