Dualization

This tutorial was generated using Literate.jl. Download the source as a .jl file.

The purpose of this tutorial is to explain how to use Dualization.jl to improve the performance of some conic optimization models.

There are two important takeaways:

JuMP reformulates problems to meet the input requirements of the solver, potentially increasing the problem size by adding slack variables and constraints.
Solving the dual of a conic model can be more efficient than solving the primal.

Dualization.jl is a package which fixes these problems, allowing you to solve the dual instead of the primal with a one-line change to your code.

Required packages

This tutorial uses the following packages:

using JuMP
import Dualization
import LinearAlgebra
import SCS

Background

Conic optimization solvers typically accept one of two input formulations.

The first is the standard conic form:

\[\begin{align} \min_{x \in \mathbb{R}^n} \; & c^\top x \\ \;\;\text{s.t.} \; & A x = b \\ & x \in \mathcal{K} \end{align}\]

in which we have a set of linear equality constraints $Ax = b$ and the variables belong to a cone $\mathcal{K}$.

The second is the geometric conic form:

\[\begin{align} \min_{x \in \mathbb{R}^n} \; & c^\top x \\ \;\;\text{s.t.} \; & A x - b \in \mathcal{K} \end{align}\]

in which an affine function $Ax - b$ belongs to a cone $\mathcal{K}$ and the variables are free.

It is trivial to convert between these two representations, for example, to go from the geometric conic form to the standard conic form we introduce slack variables $y$:

\[\begin{align} \min_{x \in \mathbb{R}^n} \; & c^\top x \\ \;\;\text{s.t.} \; & \begin{bmatrix}A & -I\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} = b \\ & \begin{bmatrix}x\\y\end{bmatrix} \in \mathbb{R}^n \times \mathcal{K} \end{align}\]

and to go from the standard conic form to the geometric conic form, we can rewrite the equality constraint as a function belonging to the {0} cone:

\[\begin{align} \min_{x \in \mathbb{R}^n} \; & c^\top x \\ \;\;\text{s.t.} & \begin{bmatrix}A\\I\end{bmatrix} x - \begin{bmatrix}b\\0\end{bmatrix} \in \{0\} \times \mathcal{K} \end{align}\]

From a theoretical perspective, the two formulations are equivalent, and if you implement a model in the standard conic form and pass it to a geometric conic form solver (or vice versa), then JuMP will automatically reformulate the problem into the correct formulation.

From a practical perspective though, the reformulations are problematic because the additional slack variables and constraints can make the problem much larger and therefore harder to solve.

You should also note many problems contain a mix of conic constraints and variables, and so they do not neatly fall into one of the two formulations. In these cases, JuMP reformulates only the variables and constraints as necessary to convert the problem into the desired form.

Primal and dual formulations

Duality plays a large role in conic optimization. For a detailed description of conic duality, see Duality.

A useful observation is that if the primal problem is in standard conic form, then the dual problem is in geometric conic form, and vice versa. Moreover, the primal and dual may have a different number of variables and constraints, although which one is smaller depends on the problem. Therefore, instead of reformulating the problem from one form to the other, it can be more efficient to solve the dual instead of the primal.

To demonstrate, we use a variation of the Maximum cut via SDP example.

The primal formulation (in standard conic form) is:

model_primal = Model()
@variable(model_primal, X[1:2, 1:2], PSD)
@objective(model_primal, Max, sum([1 -1; -1 1] .* X))
@constraint(model_primal, primal_c[i = 1:2], 1 - X[i, i] == 0)
print(model_primal)

Max X[1,1] - 2 X[1,2] + X[2,2]
Subject to
 primal_c[1] : -X[1,1] = -1
 primal_c[2] : -X[2,2] = -1
 [X[1,1]  X[1,2]
  ⋯       X[2,2]] ∈ PSDCone()

This problem has three scalar decision variables (the matrix X is symmetric), two scalar equality constraints, and a constraint that X is positive semidefinite.

The dual of model_primal is:

model_dual = Model()
@variable(model_dual, y[1:2])
@objective(model_dual, Min, sum(y))
@constraint(
    model_dual,
    dual_c,
    LinearAlgebra.Symmetric([y[1]-1 1; 1 y[2]-1]) in PSDCone(),
)
print(model_dual)

Min y[1] + y[2]
Subject to
 dual_c : [y[1] - 1  1
  ⋯         y[2] - 1] ∈ PSDCone()

This problem has two scalar decision variables, and a 2x2 positive semidefinite matrix constraint.

Tip

If you haven't seen conic duality before, try deriving the dual problem based on the description in Duality. You'll need to know that the dual cone of PSDCone is the PSDCone.

When we solve model_primal with SCS.Optimizer, SCS reports three variables (variables n: 3), five rows in the constraint matrix (constraints m: 5), and five non-zeros in the matrix (nnz(A): 5):

set_optimizer(model_primal, SCS.Optimizer)
optimize!(model_primal)
assert_is_solved_and_feasible(model_primal; dual = true)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 3, constraints m: 5
cones: 	  z: primal zero / dual free vars: 2
	  s: psd vars: 3, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 5, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.65e+01  1.60e-01  5.09e+01 -2.91e+01  1.00e-01  1.04e-04
    50| 1.74e-08  2.70e-10  4.88e-08 -4.00e+00  1.00e-01  1.84e-04
------------------------------------------------------------------
status:  solved
timings: total: 1.85e-04s = setup: 4.48e-05s + solve: 1.40e-04s
	 lin-sys: 1.17e-05s, cones: 5.99e-05s, accel: 2.63e-06s
------------------------------------------------------------------
objective = -4.000000
------------------------------------------------------------------

(There are five rows in the constraint matrix because SCS expects problems in geometric conic form, and so JuMP has reformulated the X, PSD variable constraint into the affine constraint X .+ 0 in PSDCone().)

The solution we obtain is:

value(X)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
  1.0  -1.0
 -1.0   1.0

dual(VariableInSetRef(X))

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0  1.0
 1.0  1.0

dual.(primal_c)

2-element Vector{Float64}:
 1.9999999997299085
 1.9999999997299085

objective_value(model_primal)

3.9999999506359716

When we solve model_dual with SCS.Optimizer, SCS reports two variables (variables n: 2), three rows in the constraint matrix (constraints m: 3), and two non-zeros in the matrix (nnz(A): 2):

set_optimizer(model_dual, SCS.Optimizer)
optimize!(model_dual)
assert_is_solved_and_feasible(model_dual; dual = true)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 3
cones: 	  s: psd vars: 3, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 2, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.23e+01  1.00e+00  2.73e+01 -9.03e+00  1.00e-01  9.55e-05
    50| 1.13e-07  1.05e-09  3.23e-07  4.00e+00  1.00e-01  1.75e-04
------------------------------------------------------------------
status:  solved
timings: total: 1.76e-04s = setup: 4.60e-05s + solve: 1.30e-04s
	 lin-sys: 9.90e-06s, cones: 5.74e-05s, accel: 2.64e-06s
------------------------------------------------------------------
objective = 4.000000
------------------------------------------------------------------

and the solution we obtain is:

dual(dual_c)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
  1.0  -1.0
 -1.0   1.0

value(dual_c)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0  1.0
 1.0  1.0

value.(y)

2-element Vector{Float64}:
 2.000000159272681
 2.000000159272681

objective_value(model_dual)

4.000000318545362

This particular problem is small enough that it isn't meaningful to compare the solve times, but in general, we should expect model_dual to solve faster than model_primal because it contains fewer variables and constraints. The difference is particularly noticeable on large-scale optimization problems.

`dual_optimizer`

Manually deriving the conic dual is difficult and error-prone. The package Dualization.jl provides the Dualization.dual_optimizer meta-solver, which wraps any MathOptInterface-compatible solver in an interface that automatically formulates and solves the dual of an input problem.

To demonstrate, we use Dualization.dual_optimizer to solve model_primal:

set_optimizer(model_primal, Dualization.dual_optimizer(SCS.Optimizer))
optimize!(model_primal)
assert_is_solved_and_feasible(model_primal; dual = true)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 2, constraints m: 3
cones: 	  s: psd vars: 3, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 2, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.23e+01  1.00e+00  2.73e+01 -9.03e+00  1.00e-01  1.19e-04
    50| 1.13e-07  1.05e-09  3.23e-07  4.00e+00  1.00e-01  1.99e-04
------------------------------------------------------------------
status:  solved
timings: total: 1.99e-04s = setup: 4.26e-05s + solve: 1.57e-04s
	 lin-sys: 9.64e-06s, cones: 5.84e-05s, accel: 2.69e-06s
------------------------------------------------------------------
objective = 4.000000
------------------------------------------------------------------

The performance is the same as if we solved model_dual, and the correct solution is returned to X:

value(X)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
  1.0  -1.0
 -1.0   1.0

dual(VariableInSetRef(X))

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0  1.0
 1.0  1.0

dual.(primal_c)

2-element Vector{Float64}:
 2.000000159272681
 2.000000159272681

Moreover, if we use dual_optimizer on model_dual, then we get the same performance as if we had solved model_primal:

set_optimizer(model_dual, Dualization.dual_optimizer(SCS.Optimizer))
optimize!(model_dual)
assert_is_solved_and_feasible(model_dual; dual = true)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 3, constraints m: 5
cones: 	  z: primal zero / dual free vars: 2
	  s: psd vars: 3, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 5, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.65e+01  1.60e-01  5.09e+01 -2.91e+01  1.00e-01  2.40e-03
    50| 1.74e-08  2.70e-10  4.88e-08 -4.00e+00  1.00e-01  1.44e-02
------------------------------------------------------------------
status:  solved
timings: total: 1.44e-02s = setup: 4.49e-05s + solve: 1.44e-02s
	 lin-sys: 1.18e-05s, cones: 6.19e-05s, accel: 2.61e-06s
------------------------------------------------------------------
objective = -4.000000
------------------------------------------------------------------

dual(dual_c)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
  1.0  -1.0
 -1.0   1.0

value(dual_c)

2×2 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0  1.0
 1.0  1.0

value.(y)

2-element Vector{Float64}:
 1.9999999997299085
 1.9999999997299085

A mixed example

The Maximum cut via SDP example is nicely defined because the primal is in standard conic form and the dual is in geometric conic form. However, many practical models contain a mix of the two formulations. One example is The minimum distortion problem:

D = [0 1 1 1; 1 0 2 2; 1 2 0 2; 1 2 2 0]
model = Model()
@variable(model, c²)
@variable(model, Q[1:4, 1:4], PSD)
@objective(model, Min, c²)
for i in 1:4, j in (i+1):4
    @constraint(model, D[i, j]^2 <= Q[i, i] + Q[j, j] - 2 * Q[i, j])
    @constraint(model, Q[i, i] + Q[j, j] - 2 * Q[i, j] <= c² * D[i, j]^2)
end
@constraint(model, Q[1, 1] == 0)
@constraint(model, c² >= 1)

\[ c² \geq 1 \]

In this formulation, the Q variable is of the form $x\in\mathcal{K}$, but there is also a free variable, c², a linear equality constraint, Q[1, 1] == 0, and some linear inequality constraints. Rather than attempting to derive the formulation that JuMP would pass to SCS and its dual, the simplest solution is to try solving the problem with and without dual_optimizer to see which formulation is most efficient.

set_optimizer(model, SCS.Optimizer)
optimize!(model)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 11, constraints m: 24
cones: 	  z: primal zero / dual free vars: 1
	  l: linear vars: 13
	  s: psd vars: 10, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 54, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 4.73e+00  1.00e+00  2.92e+00  1.23e+00  1.00e-01  1.36e-04
   150| 1.01e-04  3.07e-05  6.08e-05  1.33e+00  1.00e-01  7.35e-04
------------------------------------------------------------------
status:  solved
timings: total: 7.36e-04s = setup: 6.76e-05s + solve: 6.69e-04s
	 lin-sys: 1.03e-04s, cones: 3.88e-04s, accel: 6.18e-05s
------------------------------------------------------------------
objective = 1.333363
------------------------------------------------------------------

set_optimizer(model, Dualization.dual_optimizer(SCS.Optimizer))
optimize!(model)

------------------------------------------------------------------
	       SCS v3.2.7 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 14, constraints m: 24
cones: 	  z: primal zero / dual free vars: 1
	  l: linear vars: 13
	  s: psd vars: 10, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 57, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 3.71e+01  1.48e+00  2.23e+02 -1.13e+02  1.00e-01  1.33e-04
   150| 1.57e-04  2.28e-05  2.08e-04 -1.33e+00  1.00e-01  7.47e-04
------------------------------------------------------------------
status:  solved
timings: total: 7.48e-04s = setup: 6.53e-05s + solve: 6.83e-04s
	 lin-sys: 1.09e-04s, cones: 4.31e-04s, accel: 2.66e-05s
------------------------------------------------------------------
objective = -1.333460
------------------------------------------------------------------

For this problem, SCS reports that the primal has variables n: 11, constraints m: 24 and that the dual has variables n: 14, constraints m: 24. Therefore, we should probably use the primal formulation because it has fewer variables and the same number of constraints.

When to use `dual_optimizer`

Because it can make the problem larger or smaller, depending on the problem and the choice of solver, there is no definitive rule on when you should use dual_optimizer. However, you should try dual_optimizer if your conic optimization problem takes a long time to solve, or if you need to repeatedly solve similarly structured problems with different data. In some cases solving the dual instead of the primal can make a large difference.