Writing a solver interface

This tutorial was generated using Literate.jl. Download the source as a .jl file.

The purpose of this tutorial is to demonstrate how to implement a basic solver interface to MathOptInterface. As a motivating example, we implement the Primal Dual Hybrid Gradient (PDHG) method. PDHG is a first-order method that can solve convex optimization problems.

Google has a good introduction to the math behind PDLP, which is a variant of PDHG specialized for linear programs.

Required packages

This tutorial requires the following packages:

using JuMP
import LinearAlgebra
import MathOptInterface as MOI
import Printf
import SparseArrays

Primal Dual Hybrid Gradient

The following function is a pedagogical implementation of PDHG that solves the linear program:

\[\begin{aligned} \text{min} \ & c^\top x \\ \text{subject to} \ & Ax = b \\ & x \ge 0. \end{aligned}\]

Note that this implementation is intentionally kept simple. It is neither robust nor efficient, and it does not incorporate the theoretical improvements in the PDLP paper. It does, however, use two workspace vectors so that the body of the iteration loop is non-allocating.
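
For reference, the iteration implemented in the loop below is the standard PDHG update (this restatement is ours, read directly from the code):

\[\begin{aligned} x^{k+1} &= \max(0, \; x^k - \eta (A^\top y^k + c)) \\ y^{k+1} &= y^k + \tau (A(2x^{k+1} - x^k) - b), \end{aligned}\]

where $\eta$ and $\tau$ are the primal and dual step sizes.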

function solve_pdhg(
    A::SparseArrays.SparseMatrixCSC{Float64,Int},
    b::Vector{Float64},
    c::Vector{Float64};
    maximum_iterations::Int = 100_000,
    tol::Float64 = 1e-4,
    verbose::Bool = true,
    log_frequency::Int = 1_000,
)
    printf(x::Float64) = Printf.@sprintf("% 1.6e", x)
    printf(x::Int) = Printf.@sprintf("%6d", x)
    m, n = size(A)
    η = τ = 1 / LinearAlgebra.norm(A) - 1e-6
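    # Note: PDHG converges if η * τ * ‖A‖₂² < 1. `LinearAlgebra.norm(A)`
    # computes the Frobenius norm, which upper bounds the spectral norm ‖A‖₂,
    # so this choice of step sizes is conservative.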
    x, x_next, y, k, status = zeros(n), zeros(n), zeros(m), 0, MOI.OTHER_ERROR
    m_workspace, n_workspace = zeros(m), zeros(n)
    if verbose
        println(
            "  iter      pobj          dobj         pfeas         dfeas       objfeas",
        )
    end
    while status == MOI.OTHER_ERROR
        k += 1
        # =====================================================================
        # This block computes x_next = max.(0.0, x - η * (A' * y + c))
        LinearAlgebra.mul!(x_next, A', y)
        LinearAlgebra.axpby!(-η, c, -η, x_next)
        x_next .+= x
        x_next .= max.(0.0, x_next)
        # =====================================================================
        # This block computes y += τ * (A * (2 * x_next - x) - b)
        copy!(n_workspace, x_next)
        LinearAlgebra.axpby!(-1.0, x, 2.0, n_workspace)
        LinearAlgebra.mul!(y, A, n_workspace, τ, 1.0)
        LinearAlgebra.axpy!(-τ, b, y)
        # =====================================================================
        copy!(x, x_next)
        # =====================================================================
        # This block computes pfeas = LinearAlgebra.norm(A * x - b)
        LinearAlgebra.mul!(m_workspace, A, x)
        m_workspace .-= b
        pfeas = LinearAlgebra.norm(m_workspace)
        # =====================================================================
        # This block computes dfeas = LinearAlgebra.norm(min.(0.0, A' * y + c))
        LinearAlgebra.mul!(n_workspace, A', y)
        n_workspace .+= c
        n_workspace .= min.(0.0, n_workspace)
        dfeas = LinearAlgebra.norm(n_workspace)
        # =====================================================================
        objfeas = abs(LinearAlgebra.dot(c, x) + LinearAlgebra.dot(b, y))
        if pfeas <= tol && dfeas <= tol && objfeas <= tol
            status = MOI.OPTIMAL
        elseif k == maximum_iterations
            status = MOI.ITERATION_LIMIT
        end
        if verbose && (mod(k, log_frequency) == 0 || status != MOI.OTHER_ERROR)
            logs = printf.((k, c' * x, -b' * y, pfeas, dfeas, objfeas))
            println(join(logs, " "))
        end
    end
    return status, k, x, y
end
solve_pdhg (generic function with 1 method)

Here's an example:

A = [0.0 -1.0 -1.0 0.0 0.0; 6.0 8.0 0.0 -1.0 0.0; 7.0 12.0 0.0 0.0 -1.0]
b = [-3.0, 100.0, 120.0]
c = [12.0, 20.0, 0.0, 0.0, 0.0]
status, k, x, y = solve_pdhg(SparseArrays.sparse(A), b, c);
  iter      pobj          dobj         pfeas         dfeas       objfeas
  1000  2.050187e+02  2.044002e+02  2.006420e-01  2.674295e-02  6.185366e-01
  2000  2.049895e+02  2.051241e+02  1.705136e-02  2.746207e-02  1.346232e-01
  3000  2.050050e+02  2.050805e+02  8.907061e-03  8.405470e-03  7.550812e-02
  4000  2.050024e+02  2.049755e+02  4.046623e-03  9.374982e-04  2.689079e-02
  5000  2.049995e+02  2.049831e+02  8.635908e-04  6.483234e-04  1.633722e-02
  6000  2.049995e+02  2.050016e+02  7.833794e-04  1.676266e-04  2.095135e-03
  7000  2.050000e+02  2.050030e+02  2.811341e-05  3.065459e-04  2.964863e-03
  8000  2.050001e+02  2.050002e+02  1.316682e-04  1.674879e-05  8.453982e-05
  8365  2.049999e+02  2.050000e+02  9.473351e-05  2.381905e-06  7.955573e-05

The termination status is:

status
OPTIMAL::TerminationStatusCode = 1

The solve took the following number of iterations:

k
8365

The primal solution is:

x
5-element Vector{Float64}:
 15.000070735249105
  1.2499547393201815
  1.750098059200116
  0.0
  0.0

The dual multipliers are:

y
3-element Vector{Float64}:
  1.931930909998547e-6
 -0.2500022801722057
 -1.4999982446009117
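
As a quick sanity check, we can verify that the returned point approximately satisfies the three termination criteria used inside solve_pdhg; each of the following should be at most tol = 1e-4:

LinearAlgebra.norm(A * x - b)              # primal feasibility (pfeas)
LinearAlgebra.norm(min.(0.0, A' * y + c))  # dual feasibility (dfeas)
abs(c' * x + b' * y)                       # duality gap (objfeas)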

The MOI interface

Converting a linear program from the modeler's form into the A, b, and c matrices of the standard form required by our implementation of PDHG is tedious and error-prone. This section walks through how to implement a basic interface to MathOptInterface, so that we can use our algorithm from JuMP.

For a more comprehensive guide, see Implementing a solver interface.

The Optimizer type

Create an optimizer by subtyping MOI.AbstractOptimizer. By convention, the name of this type is Optimizer, and most optimizers are available as PackageName.Optimizer.

The fields inside the optimizer are arbitrary. Store whatever is useful.

"""
    Optimizer()

Create a new optimizer for PDHG.
"""
mutable struct Optimizer <: MOI.AbstractOptimizer
    # A mapping from variable to column
    x_to_col::Dict{MOI.VariableIndex,Int}
    # A mapping from constraint to rows
    ci_to_rows::Dict{
        MOI.ConstraintIndex{MOI.VectorAffineFunction{Float64},MOI.Zeros},
        Vector{Int},
    }
    # Information from solve_pdhg
    status::MOI.TerminationStatusCode
    iterations::Int
    x::Vector{Float64}
    y::Vector{Float64}
    # Other useful quantities
    solve_time::Float64
    obj_value::Float64

    function Optimizer()
        F = MOI.VectorAffineFunction{Float64}
        return new(
            Dict{MOI.VariableIndex,Int}(),
            Dict{MOI.ConstraintIndex{F,MOI.Zeros},Vector{Int}}(),
            MOI.OPTIMIZE_NOT_CALLED,
            0,
            Float64[],
            Float64[],
            0.0,
            0.0,
        )
    end
end
Main.Optimizer

Now that we have an Optimizer, we need to implement two methods: MOI.is_empty and MOI.empty!. These are called whenever MOI needs to ensure that the optimizer is in a clean state.

function MOI.is_empty(model::Optimizer)
    # You might want to check every field, not just a few
    return isempty(model.x_to_col) && model.status == MOI.OPTIMIZE_NOT_CALLED
end

function MOI.empty!(model::Optimizer)
    empty!(model.x_to_col)
    empty!(model.ci_to_rows)
    model.status = MOI.OPTIMIZE_NOT_CALLED
    model.iterations = 0
    model.solve_time = 0.0
    model.obj_value = 0.0
    empty!(model.x)
    empty!(model.y)
    return
end
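
As a small usage sketch (not part of the required interface), a new optimizer starts empty, and MOI.empty! restores that state:

model = Optimizer()
MOI.is_empty(model)  # -> true
model.status = MOI.OPTIMAL
MOI.is_empty(model)  # -> false
MOI.empty!(model)
MOI.is_empty(model)  # -> true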

Next, we need to define what constraints the optimizer supports. Since our standard form is $Ax = b$, we support only $Ax + b \in \{0\}$, which is a MOI.VectorAffineFunction in MOI.Zeros constraint. Note that you might have expected $Ax - b \in \{0\}$. We'll address this difference in the sign of b in a few places later on.

function MOI.supports_constraint(
    ::Optimizer,
    ::Type{MOI.VectorAffineFunction{Float64}},
    ::Type{MOI.Zeros},
)
    return true
end
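
To make the sign convention concrete, here is a hedged sketch of how the single equality constraint $x_1 + x_2 = 1$ is represented as a function-in-set pair that our optimizer accepts. The right-hand side appears negated in the function's constant vector:

x = MOI.VariableIndex.(1:2)
f = MOI.VectorAffineFunction(
    [
        MOI.VectorAffineTerm(1, MOI.ScalarAffineTerm(1.0, x[1])),
        MOI.VectorAffineTerm(1, MOI.ScalarAffineTerm(1.0, x[2])),
    ],
    [-1.0],  # the constant `b`: the negated right-hand side of x₁ + x₂ = 1
)
MOI.supports_constraint(Optimizer(), typeof(f), MOI.Zeros)  # -> true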

By default, MOI assumes that it can add free variables. This isn't true for our standard form, because we support only $x \ge 0$. Let's tell MOI that:

MOI.supports_add_constrained_variables(::Optimizer, ::Type{MOI.Reals}) = false

function MOI.supports_add_constrained_variables(
    ::Optimizer,
    ::Type{MOI.Nonnegatives},
)
    return true
end
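
For example:

MOI.supports_add_constrained_variables(Optimizer(), MOI.Reals)         # -> false
MOI.supports_add_constrained_variables(Optimizer(), MOI.Nonnegatives)  # -> true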

The objective function that we support is MOI.ScalarAffineFunction:

function MOI.supports(
    ::Optimizer,
    ::MOI.ObjectiveFunction{MOI.ScalarAffineFunction{Float64}},
)
    return true
end

Finally, we'll implement MOI.SolverName so that MOI knows how to print the name of our optimizer:

MOI.get(::Optimizer, ::MOI.SolverName) = "PDHG"
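
For example:

MOI.get(Optimizer(), MOI.SolverName())  # -> "PDHG"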

GenericModel

The simplest way to solve a problem with your optimizer is to implement the method MOI.optimize!(dest::Optimizer, src::MOI.ModelLike), where src is an input model and dest is your empty optimizer.

To implement this method, you would need to query the variables and constraints in src and then convert these into the matrix data expected by solve_pdhg. Since matrix input is a common requirement of solvers, MOI includes utilities to simplify the process.

The downside of the utilities is that they involve a highly parameterized type with a large number of possible configurations. The upside is that, once set up, they require only a few lines of code to extract the problem matrices.

First, we need to define the set of sets that our standard form supports. For PDHG, we support only $Ax + b \in \{0\}$:

MOI.Utilities.@product_of_sets(SetOfZeros, MOI.Zeros)

Then, we define a MOI.Utilities.GenericModel. This is the highly parameterized type that can be customized.

const CacheModel = MOI.Utilities.GenericModel{
    # The coefficient type is Float64
    Float64,
    # We use the default objective container
    MOI.Utilities.ObjectiveContainer{Float64},
    # We use the default variable container
    MOI.Utilities.VariablesContainer{Float64},
    # We use a Matrix of Constraints to represent `A * x + b in K`
    MOI.Utilities.MatrixOfConstraints{
        # The number type is Float64
        Float64,
        # The matrix type `A` is a sparse matrix
        MOI.Utilities.MutableSparseMatrixCSC{
            # ... with Float64 coefficients
            Float64,
            # ... Int64 row and column indices
            Int,
            # ... and it uses one-based indexing
            MOI.Utilities.OneBasedIndexing,
        },
        # The vector type `b` is a Julia `Vector`
        Vector{Float64},
        # The set type `K` is the SetOfZeros type we defined above
        SetOfZeros{Float64},
    },
}
MathOptInterface.Utilities.GenericModel{Float64, MathOptInterface.Utilities.ObjectiveContainer{Float64}, MathOptInterface.Utilities.VariablesContainer{Float64}, MathOptInterface.Utilities.MatrixOfConstraints{Float64, MathOptInterface.Utilities.MutableSparseMatrixCSC{Float64, Int64, MathOptInterface.Utilities.OneBasedIndexing}, Vector{Float64}, Main.SetOfZeros{Float64}}}

As one example of an alternative configuration: if you were interfacing with a solver written in C that expected zero-based indices, you might instead use:

MOI.Utilities.MutableSparseMatrixCSC{
    Cdouble,
    Cint,
    MOI.Utilities.ZeroBasedIndexing,
}
MathOptInterface.Utilities.MutableSparseMatrixCSC{Float64, Int32, MathOptInterface.Utilities.ZeroBasedIndexing}
Tip

The best way to see how to configure GenericModel is to look at an existing solver with the same input standard form as the one you require.

We need to make one modification to CacheModel to tell MOI that constraining variables to $x \in \mathbb{R}_+$ on creation is equivalent to adding free variables and then constraining them with MOI.GreaterThan:

function MOI.add_constrained_variables(model::CacheModel, set::MOI.Nonnegatives)
    x = MOI.add_variables(model, MOI.dimension(set))
    MOI.add_constraint.(model, x, MOI.GreaterThan(0.0))
    ci = MOI.ConstraintIndex{MOI.VectorOfVariables,MOI.Nonnegatives}(x[1].value)
    return x, ci
end
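
As a small sketch of what this method does (hypothetical usage, for illustration only):

cache = CacheModel()
x, ci = MOI.add_constrained_variables(cache, MOI.Nonnegatives(3))
length(x)  # -> 3; each variable is bounded below by `MOI.GreaterThan(0.0)`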

The optimize method

Now we define the most important method for our optimizer.

function MOI.optimize!(dest::Optimizer, src::MOI.ModelLike)
    # In addition to the values returned by `solve_pdhg`, it may be useful to
    # record other attributes, such as the solve time.
    start_time = time()
    # Construct a cache to store our problem data:
    cache = CacheModel()
    # MOI includes a utility to copy an arbitrary `src` model into `cache`. The
    # return, `index_map`, is a mapping from indices in `src` to indices in
    # `dest`.
    index_map = MOI.copy_to(cache, src)
    # Now we can access the `A` matrix:
    A = convert(
        SparseArrays.SparseMatrixCSC{Float64,Int},
        cache.constraints.coefficients,
    )
    # and the b vector (note that MOI models Ax = b as Ax + b in {0}, so b
    # differs by -):
    b = -cache.constraints.constants
    # The `c` vector is more involved, because we need to account for the
    # objective sense:
    sense = ifelse(cache.objective.sense == MOI.MAX_SENSE, -1, 1)
    F = MOI.ScalarAffineFunction{Float64}
    obj = MOI.get(src, MOI.ObjectiveFunction{F}())
    c = zeros(size(A, 2))
    for term in obj.terms
        c[term.variable.value] += sense * term.coefficient
    end
    # Now we can solve the problem with PDHG and record the solution:
    dest.status, dest.iterations, dest.x, dest.y = solve_pdhg(A, b, c)
    # To help assign the values of the x and y vectors to the appropriate
    # variables and constraints, we need a map of the constraint indices to their
    # row in the `dest` matrix and a map of the variable indices to their
    # column in the `dest` matrix:
    F, S = MOI.VectorAffineFunction{Float64}, MOI.Zeros
    for src_ci in MOI.get(src, MOI.ListOfConstraintIndices{F,S}())
        dest.ci_to_rows[index_map[src_ci]] =
            MOI.Utilities.rows(cache.constraints.sets, index_map[src_ci])
    end
    for (i, src_x) in enumerate(MOI.get(src, MOI.ListOfVariableIndices()))
        dest.x_to_col[index_map[src_x]] = i
    end
    # We can now record two derived quantities: the primal objective value and
    # the solve time.
    dest.obj_value = obj.constant + sense * c' * dest.x
    dest.solve_time = time() - start_time
    # We need to return the index map, and `false`, to indicate to MOI that we
    # do not support incremental modification of the model.
    return index_map, false
end

Solutions

Now that we know how to solve a model, let's implement the required solution attributes.

First, we need to tell MOI how many solutions we found via MOI.ResultCount:

function MOI.get(model::Optimizer, ::MOI.ResultCount)
    return model.status == MOI.OPTIMAL ? 1 : 0
end

and implement MOI.RawStatusString to provide a user-readable string that describes what happened:

function MOI.get(model::Optimizer, ::MOI.RawStatusString)
    if model.status == MOI.OPTIMAL
        return "found a primal-dual optimal solution (subject to tolerances)"
    end
    return "failed to solve"
end

Then, we need to implement the three types of problem status: MOI.TerminationStatus, MOI.PrimalStatus and MOI.DualStatus:

MOI.get(model::Optimizer, ::MOI.TerminationStatus) = model.status

function MOI.get(model::Optimizer, attr::Union{MOI.PrimalStatus,MOI.DualStatus})
    if attr.result_index == 1 && model.status == MOI.OPTIMAL
        return MOI.FEASIBLE_POINT
    end
    return MOI.NO_SOLUTION
end

Now we can implement MOI.ObjectiveValue, MOI.VariablePrimal, and MOI.ConstraintDual:

function MOI.get(model::Optimizer, attr::MOI.ObjectiveValue)
    MOI.check_result_index_bounds(model, attr)
    return model.obj_value
end

function MOI.get(
    model::Optimizer,
    attr::MOI.VariablePrimal,
    x::MOI.VariableIndex,
)
    MOI.check_result_index_bounds(model, attr)
    return model.x[model.x_to_col[x]]
end

function MOI.get(
    model::Optimizer,
    attr::MOI.ConstraintDual,
    ci::MOI.ConstraintIndex{MOI.VectorAffineFunction{Float64},MOI.Zeros},
)
    MOI.check_result_index_bounds(model, attr)
    # MOI models Ax = b as Ax + b in {0}, so the dual differs by -
    return -model.y[model.ci_to_rows[ci]]
end

Some other useful result quantities are MOI.SolveTimeSec and MOI.BarrierIterations:

MOI.get(model::Optimizer, ::MOI.SolveTimeSec) = model.solve_time
MOI.get(model::Optimizer, ::MOI.BarrierIterations) = model.iterations

A JuMP example

Now we can solve an arbitrary linear program with JuMP. Here's the same standard form as before:

model = Model(Optimizer)
@variable(model, x[1:5] >= 0)
@objective(model, Min, c' * x)
@constraint(model, c3, A * x == b)
optimize!(model)
  iter      pobj          dobj         pfeas         dfeas       objfeas
  1000  2.050187e+02  2.044002e+02  2.006420e-01  2.674295e-02  6.185366e-01
  2000  2.049895e+02  2.051241e+02  1.705136e-02  2.746207e-02  1.346232e-01
  3000  2.050050e+02  2.050805e+02  8.907061e-03  8.405470e-03  7.550812e-02
  4000  2.050024e+02  2.049755e+02  4.046623e-03  9.374982e-04  2.689079e-02
  5000  2.049995e+02  2.049831e+02  8.635908e-04  6.483234e-04  1.633722e-02
  6000  2.049995e+02  2.050016e+02  7.833794e-04  1.676266e-04  2.095135e-03
  7000  2.050000e+02  2.050030e+02  2.811341e-05  3.065459e-04  2.964863e-03
  8000  2.050001e+02  2.050002e+02  1.316682e-04  1.674879e-05  8.453982e-05
  8365  2.049999e+02  2.050000e+02  9.473351e-05  2.381905e-06  7.955573e-05
solution_summary(model; verbose = true)
* Solver : PDHG

* Status
  Result count       : 1
  Termination status : OPTIMAL
  Message from the solver:
  "found a primal-dual optimal solution (subject to tolerances)"

* Candidate solution (result #1)
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Objective value    : 2.05000e+02
  Dual objective value : 2.05000e+02
  Primal solution :
    x[1] : 1.50001e+01
    x[2] : 1.24995e+00
    x[3] : 1.75010e+00
    x[4] : 0.00000e+00
    x[5] : 0.00000e+00
  Dual solution :
    c3 : [-1.93193e-06,2.50002e-01,1.50000e+00]

* Work counters
  Solve time (sec)   : 2.38002e-01
  Barrier iterations : 8365

But we could also have written:

model = Model(Optimizer)
@variable(model, x >= 0)
@variable(model, 0 <= y <= 3)
@objective(model, Min, 12x + 20y)
@constraint(model, c1, 6x + 8y >= 100)
@constraint(model, c2, 7x + 12y >= 120)
optimize!(model)
  iter      pobj          dobj         pfeas         dfeas       objfeas
  1000  2.050187e+02  2.044002e+02  2.006420e-01  2.674295e-02  6.185366e-01
  2000  2.049895e+02  2.051241e+02  1.705136e-02  2.746207e-02  1.346232e-01
  3000  2.050050e+02  2.050805e+02  8.907061e-03  8.405470e-03  7.550812e-02
  4000  2.050024e+02  2.049755e+02  4.046623e-03  9.374982e-04  2.689079e-02
  5000  2.049995e+02  2.049831e+02  8.635908e-04  6.483234e-04  1.633722e-02
  6000  2.049995e+02  2.050016e+02  7.833794e-04  1.676266e-04  2.095135e-03
  7000  2.050000e+02  2.050030e+02  2.811341e-05  3.065459e-04  2.964863e-03
  8000  2.050001e+02  2.050002e+02  1.316682e-04  1.674879e-05  8.453982e-05
  8365  2.049999e+02  2.050000e+02  9.473351e-05  2.381905e-06  7.955573e-05
solution_summary(model; verbose = true)
* Solver : PDHG

* Status
  Result count       : 1
  Termination status : OPTIMAL
  Message from the solver:
  "found a primal-dual optimal solution (subject to tolerances)"

* Candidate solution (result #1)
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Objective value    : 2.05000e+02
  Dual objective value : 2.05000e+02
  Primal solution :
    x : 1.50001e+01
    y : 1.24995e+00
  Dual solution :
    c1 : 2.50002e-01
    c2 : 1.50000e+00

* Work counters
  Solve time (sec)   : 1.86300e-03
  Barrier iterations : 8365

Other variations are also possible:

model = Model(Optimizer)
@variable(model, x[1:5] >= 0)
@objective(model, Max, -c' * x)
@constraint(model, c4, A * x .== b)
optimize!(model)
  iter      pobj          dobj         pfeas         dfeas       objfeas
  1000  2.050187e+02  2.044002e+02  2.006420e-01  2.674295e-02  6.185366e-01
  2000  2.049895e+02  2.051241e+02  1.705136e-02  2.746207e-02  1.346232e-01
  3000  2.050050e+02  2.050805e+02  8.907061e-03  8.405470e-03  7.550812e-02
  4000  2.050024e+02  2.049755e+02  4.046623e-03  9.374982e-04  2.689079e-02
  5000  2.049995e+02  2.049831e+02  8.635908e-04  6.483234e-04  1.633722e-02
  6000  2.049995e+02  2.050016e+02  7.833794e-04  1.676266e-04  2.095135e-03
  7000  2.050000e+02  2.050030e+02  2.811341e-05  3.065459e-04  2.964863e-03
  8000  2.050001e+02  2.050002e+02  1.316682e-04  1.674879e-05  8.453982e-05
  8365  2.049999e+02  2.050000e+02  9.473351e-05  2.381905e-06  7.955573e-05
solution_summary(model; verbose = true)
* Solver : PDHG

* Status
  Result count       : 1
  Termination status : OPTIMAL
  Message from the solver:
  "found a primal-dual optimal solution (subject to tolerances)"

* Candidate solution (result #1)
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Objective value    : -2.05000e+02
  Dual objective value : -2.05000e+02
  Primal solution :
    x[1] : 1.50001e+01
    x[2] : 1.24995e+00
    x[3] : 1.75010e+00
    x[4] : 0.00000e+00
    x[5] : 0.00000e+00
  Dual solution :
    c4 : multiple constraints with the same name

* Work counters
  Solve time (sec)   : 1.85108e-03
  Barrier iterations : 8365

Behind the scenes, JuMP and MathOptInterface reformulate the problem from the modeler's form into the standard form defined by our Optimizer.