Tips and Tricks
This tutorial was generated using Literate.jl. Download the source as a .jl
file.
This tutorial was originally contributed by Arpit Bhatia.
This tutorial is aimed at providing a simple introduction to conic programming using JuMP.
It uses the following packages:
using JuMP
import SCS
import LinearAlgebra
This tutorial uses sets from MathOptInterface. By default, JuMP exports the MOI symbol as an alias for the MathOptInterface.jl package. We recommend making this more explicit in your code by adding the following lines:
import MathOptInterface as MOI
A good resource for learning more about functions which can be modeled using cones is the MOSEK Modeling Cookbook.
Background theory
A subset $C$ of a vector space $V$ is a cone if, for every $x \in C$ and every scalar $\lambda > 0$, the product $\lambda x \in C$.
A cone $C$ is a convex cone if $\lambda x + (1 - \lambda) y \in C$, for any $\lambda \in [0, 1]$, and any $x, y \in C$.
Conic programming problems are convex optimization problems in which a convex function is minimized over the intersection of an affine subspace and a convex cone. An example of a conic-form minimization problem, in primal form, is:
\[\begin{aligned} & \min_{x \in \mathbb{R}^n} & a_0^T x + b_0 \\ & \;\;\text{s.t.} & A_i x + b_i & \in \mathcal{C}_i & i = 1 \ldots m \end{aligned}\]
The corresponding dual problem is:
\[\begin{aligned} & \max_{y_1, \ldots, y_m} & -\sum_{i=1}^m b_i^T y_i + b_0 \\ & \;\;\text{s.t.} & a_0 - \sum_{i=1}^m A_i^T y_i & = 0 \\ & & y_i & \in \mathcal{C}_i^* & i = 1 \ldots m \end{aligned}\]
where each $\mathcal{C}_i$ is a closed convex cone and $\mathcal{C}_i^*$ is its dual cone.
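To make the primal-dual relationship concrete, here is a minimal sketch (an addition to the original tutorial) that solves a small conic program and queries the multipliers $y_i$ with dual. The constraint names affine_con and conic_con are illustrative, and the example previews the SecondOrderCone set introduced in the next section.
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x[1:3])
@variable(model, t)
@objective(model, Min, t)
# Each constraint plays the role of one A_i x + b_i ∈ C_i in the primal form:
@constraint(model, affine_con, sum(x) == 1)
@constraint(model, conic_con, [t; x] in SecondOrderCone())
optimize!(model)
# `dual` returns the multiplier associated with each constraint; for the
# conic constraint it is a vector lying in the dual cone.
dual(affine_con), dual(conic_con)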
Second-Order Cone
The SecondOrderCone (or Lorentz Cone) of dimension $n$ is a cone of the form:
\[K_{soc} = \{ (t, x) \in \mathbb{R}^n : t \ge ||x||_2 \}\]
It is most commonly used to represent the L2-norm of the vector $x$:
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x[1:3])
@variable(model, t)
@constraint(model, sum(x) == 1)
@constraint(model, [t; x] in SecondOrderCone())
@objective(model, Min, t)
optimize!(model)
value(t), value.(x)
(0.5773503148525415, [0.3333333333638126, 0.3333333333638124, 0.3333333333638124])
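As a quick sanity check (an addition to the original output), the optimal $t$ reported above should equal the 2-norm of the optimal $x$ up to solver tolerance:
# Compute ||x||_2 at the solution; it should be close to value(t).
LinearAlgebra.norm(value.(x))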
Rotated Second-Order Cone
A Second-Order Cone rotated by $\pi/4$ in the $(x_1,x_2)$ plane is called a RotatedSecondOrderCone. It is a cone of the form:
\[K_{rsoc} = \{ (t,u,x) \in \mathbb{R}^n : 2tu \ge ||x||_2^2, t,u \ge 0 \}\]
When $u = 0.5$, the defining inequality $2tu \ge ||x||_2^2$ reduces to $t \ge ||x||_2^2$, so the cone represents the sum of squares of a vector $x$:
data = [1.0, 2.0, 3.0, 4.0]
target = [0.45, 1.04, 1.51, 1.97]
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, θ)
@variable(model, t)
@expression(model, residuals, θ * data .- target)
@constraint(model, [t; 0.5; residuals] in RotatedSecondOrderCone())
@objective(model, Min, t)
optimize!(model)
value(θ), value(t)
(0.49800000000000544, 0.004979422159618532)
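As a quick sanity check (an addition to the original output), the reported $t$ should equal the sum of squared residuals at the reported $\theta$ up to solver tolerance:
# Sum of squared residuals at the solution; it should be close to value(t).
sum(value.(residuals) .^ 2)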
Exponential Cone
The MOI.ExponentialCone is a set of the form:
\[K_{exp} = \{ (x,y,z) \in \mathbb{R}^3 : y \exp (x/y) \le z, y > 0 \}\]
It can be used to model problems involving log and exp.
Exponential
To model $\exp(x) \le z$, use (x, 1, z); this is the definition above with $y$ fixed to $1$:
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x == 1.5)
@variable(model, z)
@objective(model, Min, z)
@constraint(model, [x, 1, z] in MOI.ExponentialCone())
optimize!(model)
value(z), exp(1.5)
(4.481654619603659, 4.4816890703380645)
Logarithm
To model $x \le \log(z)$, use (x, 1, z):
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x)
@variable(model, z == 1.5)
@objective(model, Max, x)
@constraint(model, [x, 1, z] in MOI.ExponentialCone())
optimize!(model)
value(x), log(1.5)
(0.4055956586655673, 0.4054651081081644)
Log-sum-exp
To model $t \ge \log\left(\sum_i e^{x_i}\right)$, introduce auxiliary variables $u_i$ with $\sum_i u_i \le 1$ and use the exponential cone to enforce $e^{x_i - t} \le u_i$; summing then gives $\sum_i e^{x_i - t} \le 1$, which is equivalent to $t \ge \log\left(\sum_i e^{x_i}\right)$:
N = 3
x0 = rand(N)
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x[i = 1:N] == x0[i])
@variable(model, t)
@objective(model, Min, t)
@variable(model, u[1:N])
@constraint(model, sum(u) <= 1)
@constraint(model, [i = 1:N], [x[i] - t, 1, u[i]] in MOI.ExponentialCone())
optimize!(model)
value(t), log(sum(exp.(x0)))
(1.4726849472138905, 1.472772274699489)
Entropy
The entropy maximization problem consists of maximizing the entropy function $H(x) = -x\log{x}$, subject to linear inequality constraints.
\[\begin{aligned} & \max & - \sum_{i=1}^n x_i \log x_i \\ & \;\;\text{s.t.} & \mathbf{1}^\top x = 1 \\ & & Ax \leq b \end{aligned}\]
We can model this problem using an exponential cone by using the following transformation:
\[t\leq -x\log{x} \iff t\leq x\log(1/x) \iff (t, x, 1) \in K_{exp}\]
Thus, our problem becomes,
\[\begin{aligned} & \max & 1^Tt \\ & \;\;\text{s.t.} & Ax \leq b \\ & & 1^T x = 1 \\ & & (t_i, x_i, 1) \in K_{exp} && \forall i = 1 \ldots n \\ \end{aligned}\]
m, n = 10, 15
A, b = randn(m, n), rand(m, 1)
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t[1:n])
@variable(model, x[1:n])
@objective(model, Max, sum(t))
@constraint(model, sum(x) == 1)
@constraint(model, A * x .<= b)
@constraint(model, [i = 1:n], [t[i], x[i], 1] in MOI.ExponentialCone())
optimize!(model)
objective_value(model)
2.6632061721520914
The MOI.ExponentialCone has a dual, the MOI.DualExponentialCone, which offers an alternative formulation that can be more efficient for some problems.
There is also the MOI.RelativeEntropyCone for explicitly encoding the relative entropy function:
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, x[1:n])
@objective(model, Max, -t)
@constraint(model, sum(x) == 1)
@constraint(model, A * x .<= b)
@constraint(model, [t; ones(n); x] in MOI.RelativeEntropyCone(2n + 1))
optimize!(model)
objective_value(model)
2.663173643175568
PowerCone
The MOI.PowerCone is a three-dimensional set parameterized by a scalar value $\alpha$. It has the form:
\[K_{p} = \{ (x,y,z) \in \mathbb{R}^3 : x^{\alpha} y^{1-\alpha} \ge |z|, x \ge 0, y \ge 0 \}\]
The power cone permits a number of reformulations. For example, when $p > 1$, we can model $t \ge x^p$ for $x \ge 0$ using the power cone $(t, 1, x)$ with $\alpha = 1 / p$. Thus, to model $t \ge x^3$ with $x \ge 1.5$:
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, x >= 1.5)
@constraint(model, [t, 1, x] in MOI.PowerCone(1 / 3))
@objective(model, Min, t)
optimize!(model)
value(t), value(x)
(3.3747548357745134, 1.4999552263154623)
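As a quick sanity check (an addition to the original output), cubing the reported $x$ should give a value close to the reported $t$:
# x^3 at the solution; it should be close to value(t).
value(x)^3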
The MOI.PowerCone has a dual, the MOI.DualPowerCone, which offers an alternative formulation that can be more efficient for some problems.
P-Norm
The p-norm $||x||_p = \left(\sum\limits_{i} |x_i|^p\right)^{\frac{1}{p}}$ can be modeled using MOI.PowerCones. See the Mosek Modeling Cookbook for the derivation.
function p_norm(x::Vector, p)
    N = length(x)
    model = Model(SCS.Optimizer)
    set_silent(model)
    @variable(model, r[1:N])
    @variable(model, t)
    @constraint(model, [i = 1:N], [r[i], t, x[i]] in MOI.PowerCone(1 / p))
    @constraint(model, sum(r) == t)
    @objective(model, Min, t)
    optimize!(model)
    return value(t)
end
x = rand(5);
LinearAlgebra.norm(x, 4), p_norm(x, 4)
(0.9316922467512209, 0.9316875994951573)
Positive Semidefinite Cone
The set of symmetric positive semidefinite (PSD) matrices of dimension $n$ forms a convex cone, which we write mathematically as:
\[\mathcal{S}_{+}^n = \{ X \in \mathcal{S}^n \mid z^T X z \geq 0, \: \forall z\in \mathbb{R}^n \}.\]
A PSD cone is represented in JuMP using the MOI sets PositiveSemidefiniteConeTriangle (for the upper triangle of a PSD matrix) and PositiveSemidefiniteConeSquare (for a complete PSD matrix). However, it is preferable to use the PSDCone shortcut as illustrated below.
Example: largest eigenvalue of a symmetric matrix
Suppose $A$ has eigenvalues $\lambda_{1} \geq \lambda_{2} \ldots \geq \lambda_{n}$. Then the matrix $t I-A$ has eigenvalues $t-\lambda_{1}, t-\lambda_{2}, \ldots, t-\lambda_{n}$. Note that $t I-A$ is PSD exactly when all these eigenvalues are non-negative, and this happens for values $t \geq \lambda_{1}$. Thus, we can model the problem of finding the largest eigenvalue of a symmetric matrix as:
\[\begin{aligned} \lambda_{1} = \min t \\ \text { s.t. } t I-A \succeq 0 \end{aligned}\]
A = [3 2 4; 2 0 2; 4 2 3]
I = Matrix{Float64}(LinearAlgebra.I, 3, 3)
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@objective(model, Min, t)
@constraint(model, t .* I - A in PSDCone())
optimize!(model)
objective_value(model)
8.000003377698668
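As a cross-check (an addition to the original output), a direct eigenvalue computation gives the same answer; the largest eigenvalue of $A$ is $8$:
# Largest eigenvalue of A computed directly; it should match the SDP objective.
LinearAlgebra.eigmax(A)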
GeometricMeanCone
The MOI.GeometricMeanCone is a cone of the form:
\[K_{geo} = \{ (t, x) \in \mathbb{R}^n : x \ge 0, t \le \sqrt[n-1]{x_1 x_2 \cdots x_{n-1}} \}\]
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, x[1:4])
@variable(model, t)
@constraint(model, sum(x) == 1)
@constraint(model, [t; x] in MOI.GeometricMeanCone(5))
optimize!(model)
value(t), value.(x)
(0.0, [0.25000000110304893, 0.250000001103052, 0.25000000110305065, 0.2500000011030498])
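The model above has no objective, so it is a feasibility problem, which is why the solver returns $t = 0$. To compute the geometric mean of $x$ under the constraint that $x$ sums to one, you could maximize $t$; a minimal sketch, reusing the model above (by the AM-GM inequality, the optimum is $x_i = 0.25$ and $t = 0.25$, up to solver tolerance):
# Maximize t, which the cone bounds above by the geometric mean of x.
@objective(model, Max, t)
optimize!(model)
value(t), value.(x)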
RootDetCone
The MOI.RootDetConeSquare is a cone of the form:
\[K = \{ (t, X) \in \mathbb{R}^{1+d^2} : t \le \det(X)^{\frac{1}{d}} \}\]
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, X[1:2, 1:2])
@objective(model, Max, t)
@constraint(model, [t; vec(X)] in MOI.RootDetConeSquare(2))
@constraint(model, X .== [2 1; 1 3])
optimize!(model)
value(t), sqrt(LinearAlgebra.det(value.(X)))
(2.236116364743911, 2.2360679574027222)
If X is symmetric, then you can use MOI.RootDetConeTriangle instead. This can be more efficient because the solver does not need to add additional constraints to ensure X is symmetric.
When forming the constraint function, use triangle_vec to obtain the column-wise upper triangle of the matrix as a vector in the order that JuMP requires.
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, X[1:2, 1:2], Symmetric)
@objective(model, Max, t)
@constraint(model, [t; triangle_vec(X)] in MOI.RootDetConeTriangle(2))
@constraint(model, X .== [2 1; 1 3])
optimize!(model)
value(t), sqrt(LinearAlgebra.det(value.(X)))
(2.2361792479821463, 2.2360679863316992)
LogDetCone
The MOI.LogDetConeSquare is a cone of the form:
\[K = \{ (t, u, X) \in \mathbb{R}^{2+d^2} : t \le u \log(\det(X / u)) \}\]
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, u)
@variable(model, X[1:2, 1:2])
@objective(model, Max, t)
@constraint(model, [t; u; vec(X)] in MOI.LogDetConeSquare(2))
@constraint(model, X .== [2 1; 1 3])
@constraint(model, u == 0.5)
optimize!(model)
value(t), 0.5 * log(LinearAlgebra.det(value.(X) ./ 0.5))
(1.497920445822069, 1.4978661006407137)
If X is symmetric, then you can use MOI.LogDetConeTriangle instead. This can be more efficient because the solver does not need to add additional constraints to ensure X is symmetric.
When forming the constraint function, use triangle_vec to obtain the column-wise upper triangle of the matrix as a vector in the order that JuMP requires.
model = Model(SCS.Optimizer)
set_silent(model)
@variable(model, t)
@variable(model, u)
@variable(model, X[1:2, 1:2], Symmetric)
@objective(model, Max, t)
@constraint(model, [t; u; triangle_vec(X)] in MOI.LogDetConeTriangle(2))
@constraint(model, X .== [2 1; 1 3])
@constraint(model, u == 0.5)
optimize!(model)
value(t), 0.5 * log(LinearAlgebra.det(value.(X) ./ 0.5))
(1.4979204481049417, 1.497866101098124)
Other Cones and Functions
For other cones supported by JuMP, check out the MathOptInterface Manual.