JuMP, GAMS, and the IJKLM model


Authors: Miles Lubin, Oscar Dowson, Joaquim Dias Garcia, Joey Huchette, Benoît Legat

A recent blog post by GAMS demonstrated a significant performance difference between JuMP and GAMS on a model they call IJKLM. We respond to this blog post by explaining the difference in performance and presenting an alternative JuMP implementation with asymptotically better performance. We also identify that differences in the input data format, not anything intrinsic to the respective libraries, explain the performance difference between the Pyomo model and the JuMP model. Finally, we respond to general claims about domain-specific languages and the trade-offs of developing optimization models within a fully-featured programming language like Julia or Python.

The model and input data

Read the GAMS post for background on the model, as well as its mathematical formulation.

Input data for the model is provided in three JSON files, labeled IJK, JKL, and KLM.

import JSON
const DATA_DIR = joinpath(@__DIR__, "gams");
struct OriginalData
    I::Vector{String}
    IJK::Vector{Vector{Any}}
    JKL::Vector{Vector{Any}}
    KLM::Vector{Vector{Any}}
    function OriginalData(n::Int)
        return new(
            ["i$i" for i in 1:n],
            JSON.parsefile(joinpath(DATA_DIR, "data_IJK_$n.json")),
            JSON.parsefile(joinpath(DATA_DIR, "data_JKL.json")),
            JSON.parsefile(joinpath(DATA_DIR, "data_KLM.json")),
        )
    end
end

The GAMS blog post generated data for a range of different sizes, but we’ll just use the n = 2200 case for now and test more sizes below. You can use the code in GAMS’s GitHub repository to generate other sizes.

original_data = OriginalData(2_200);

Each element of the IJK, JKL, and KLM vectors is a three-element vector of String values corresponding to the I-J-K, J-K-L, or K-L-M indices. Because the data comes from JSON.parsefile, the element type is Any rather than String:

original_data.IJK[1]
3-element Vector{Any}:
 "i1"
 "j4"
 "k19"

The “intuitive” formulation

The “intuitive” formulation used by GAMS in their blog post is:

using JuMP
import Gurobi
function intuitive_formulation(data)
    x_list = [
        (i, j, k, l, m)
        for (i, j, k) in data.IJK
        for (jj, kk, l) in data.JKL if jj == j && kk == k
        for (kkk, ll, m) in data.KLM if kkk == k && ll == l
    ]
    model = Model(Gurobi.Optimizer)
    set_silent(model)
    @variable(model, x[x_list] >= 0)
    @constraint(
        model,
        [i in data.I],
        sum(
            x[(i, j, k, l, m)]
            for (ii, j, k) in data.IJK if ii == i
            for (jj, kk, l) in data.JKL if jj == j && kk == k
            for (kkk, ll, m) in data.KLM if kkk == k && ll == l
        ) >= 0
    )
    optimize!(model)
    return model
end

On my machine it takes around 4 seconds to run:

@time intuitive_formulation(original_data);
@time intuitive_formulation(original_data);
  4.683633 seconds (12.99 M allocations: 485.201 MiB, 3.82% gc time, 10.05% compilation time)
  3.953564 seconds (12.64 M allocations: 462.332 MiB, 3.76% gc time)

Tweaking the input data

A typical reason for poor performance in Julia code is type instability, that is, code for which the Julia compiler cannot infer a concrete type for a variable.

You can check a function for type stability using @code_warntype intuitive_formulation(original_data) and looking for red inference failures. In our example, there are a lot of issues, all stemming from the use of Vector{Any} in our data structure.
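To see the effect in isolation, here is a minimal sketch (a hypothetical helper, not part of the benchmark) comparing an untyped row with a concretely typed one:

# With a Vector{Any} row, the compiler only knows that row[1] is of type Any,
# so the comparison below must be dynamically dispatched at run time.
is_row_i1(row::Vector{Any}) = row[1] == "i1"
# With an NTuple{3,String} row, row[1] is known to be a String, and the
# comparison compiles to a direct, statically dispatched call.
is_row_i1(row::NTuple{3,String}) = row[1] == "i1"
# Compare @code_warntype is_row_i1(original_data.IJK[1]) against the tuple
# version once the data is re-parsed as tuples below.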

Since we know that each element is actually a list of three String elements, we can improve things by parsing the data into a tuple instead of a vector:

function parsefile_tuple(filename::String)
    return [tuple(x...) for x in JSON.parsefile(filename)]
end
struct TupleData
    I::Vector{String}
    IJK::Vector{NTuple{3,String}}
    JKL::Vector{NTuple{3,String}}
    KLM::Vector{NTuple{3,String}}
    function TupleData(n::Int)
        return new(
            ["i$i" for i in 1:n],
            parsefile_tuple(joinpath(DATA_DIR, "data_IJK_$n.json")),
            parsefile_tuple(joinpath(DATA_DIR, "data_JKL.json")),
            parsefile_tuple(joinpath(DATA_DIR, "data_KLM.json")),
        )
    end
end
tuple_data = TupleData(2_200);

Now if we run the “intuitive” formulation with these performance modifications, it takes around 2 seconds:

@time intuitive_formulation(tuple_data);
@time intuitive_formulation(tuple_data);
  2.308791 seconds (33.92 M allocations: 1.545 GiB, 20.79% gc time, 14.57% compilation time)
  1.951684 seconds (33.68 M allocations: 1.529 GiB, 22.23% gc time)

The “fast” formulation

Next, GAMS looked at ways to improve the JuMP code using feedback from our community forum.

The resulting JuMP code was:

function fast_formulation(data)
    x_list = [
        (i, j, k, l, m)
        for (i, j, k) in data.IJK
        for (jj, kk, l) in data.JKL if jj == j && kk == k
        for (kkk, ll, m) in data.KLM if kkk == k && ll == l
    ]
    model = direct_model(Gurobi.Optimizer())
    set_silent(model)
    set_string_names_on_creation(model, false)
    @variable(model, x[1:length(x_list)] >= 0)
    x_expr = Dict(i => AffExpr(0.0) for i in data.I)
    for (i, index) in enumerate(x_list)
        add_to_expression!(x_expr[index[1]], x[i])
    end
    for expr in values(x_expr)
        @constraint(model, expr in MOI.GreaterThan(0.0))
    end
    optimize!(model)
    return model
end

There are a few things to notice here: the model is built with direct_model, which avoids JuMP storing an extra cached copy of the problem between the macros and Gurobi; string names are disabled with set_string_names_on_creation; and the constraint expressions are built incrementally with add_to_expression! instead of a large nested summation.

These improvements bring the time down to around 1 second.

@time fast_formulation(tuple_data);
@time fast_formulation(tuple_data);
  1.386456 seconds (33.37 M allocations: 1.512 GiB, 31.20% gc time, 23.27% compilation time)
  1.006818 seconds (33.12 M allocations: 1.495 GiB, 39.34% gc time)

Why, then, is GAMS so much faster in their benchmark?

The answer is that the nested for-loops used to create the list of indices take nearly all of the total time:

function x_list_only(data)
    return [
        (i, j, k, l, m)
        for (i, j, k) in data.IJK
        for (jj, kk, l) in data.JKL if jj == j && kk == k
        for (kkk, ll, m) in data.KLM if kkk == k && ll == l
    ]
end
@time x_list_only(tuple_data);
@time x_list_only(tuple_data);
  1.134260 seconds (33.10 M allocations: 1.487 GiB, 34.83% gc time, 19.77% compilation time)
  0.869116 seconds (32.93 M allocations: 1.476 GiB, 41.44% gc time)

With a little effort, we can see that the nested for-loops are equivalent to treating each of the IJK, JKL, and KLM lists as a table in a database and performing an inner join across the shared indices.
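As a sketch of the idea (hypothetical code, not part of the benchmark), the same x_list can be built in roughly linear time by first indexing JKL and KLM in dictionaries keyed on the shared columns, instead of re-scanning both lists for every row of IJK:

function x_list_hash_join(data::TupleData)
    # Index JKL by (j, k) and KLM by (k, l) so that each lookup is a hash
    # probe rather than a scan of the full list.
    jk_to_l = Dict{NTuple{2,String},Vector{String}}()
    for (j, k, l) in data.JKL
        push!(get!(jk_to_l, (j, k), String[]), l)
    end
    kl_to_m = Dict{NTuple{2,String},Vector{String}}()
    for (k, l, m) in data.KLM
        push!(get!(kl_to_m, (k, l), String[]), m)
    end
    return [
        (i, j, k, l, m)
        for (i, j, k) in data.IJK
        for l in get(jk_to_l, (j, k), String[])
        for m in get(kl_to_m, (k, l), String[])
    ]
end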

The blog post hints at this, saying “The reason for GAMS superior performance in this example is the use of relational algebra.” But relational algebra, while not built into JuMP, is not unique to the GAMS modeling language. In Julia, we can use the DataFrames library.

The DataFrames formulation

The first step is to load the data as a dataframe instead of a list of tuples:

import DataFrames
function parsefile_dataframe(filename::String, indices)
    list = parsefile_tuple(filename)
    return DataFrames.DataFrame(
        [index => getindex.(list, i) for (i, index) in enumerate(indices)]...
    )
end
struct DataFrameData
    I::Vector{String}
    IJK::DataFrames.DataFrame
    JKL::DataFrames.DataFrame
    KLM::DataFrames.DataFrame
    function DataFrameData(n::Int)
        return new(
            ["i$i" for i in 1:n],
            parsefile_dataframe(joinpath(DATA_DIR, "data_IJK_$n.json"), (:i, :j, :k)),
            parsefile_dataframe(joinpath(DATA_DIR, "data_JKL.json"), (:j, :k, :l)),
            parsefile_dataframe(joinpath(DATA_DIR, "data_KLM.json"), (:k, :l, :m)),
        )
    end
end
dataframe_data = DataFrameData(2_200);

Using the dataframe data structure, we can compactly write an equivalent JuMP formulation:

function dataframe_formulation(data::DataFrameData)
    ijklm = DataFrames.innerjoin(
        DataFrames.innerjoin(data.IJK, data.JKL; on = [:j, :k]),
        data.KLM;
        on = [:k, :l],
    )
    model = Model(Gurobi.Optimizer)
    set_silent(model)
    ijklm[!, :x] = @variable(model, x[1:size(ijklm, 1)] >= 0)
    for df in DataFrames.groupby(ijklm, :i)
        @constraint(model, sum(df.x) >= 0)
    end
    optimize!(model)
    return model
end

This formulation doesn’t look like the nested summation with which GAMS originally formulated the model, but it is arguably just as readable, particularly if the IJKLM columns were meaningfully related to the business logic.

This is much faster, taking just over 0.1 seconds:

@time dataframe_formulation(dataframe_data);
@time dataframe_formulation(dataframe_data);
  0.156004 seconds (461.83 k allocations: 51.187 MiB, 14.82% gc time, 20.81% compilation time)
  0.125839 seconds (435.75 k allocations: 49.386 MiB, 21.57% gc time)

Scaling

Let’s now compare the four different formulations over a range of n values:

import Plots
function timings()
    N = [100, 200, 400, 700, 1_100, 1_600, 2_200, 2_900]
    time_original = Float64[]
    time_intuitive = Float64[]
    time_fast = Float64[]
    time_dataframe = Float64[]
    for n in N
        # Original model
        original_data = OriginalData(n)
        start = time()
        intuitive_formulation(original_data)
        push!(time_original, time() - start)
        # Tuple models
        tuple_data = TupleData(n)
        start = time()
        intuitive_formulation(tuple_data)
        push!(time_intuitive, time() - start)
        start = time()
        fast_formulation(tuple_data)
        push!(time_fast, time() - start)
        # DataFrame model
        dataframe_data = DataFrameData(n)
        start = time()
        dataframe_formulation(dataframe_data)
        push!(time_dataframe, time() - start)
    end
    return Plots.plot(
        N,
        [time_original time_intuitive time_fast time_dataframe];
        labels = hcat(
            "\"Intuitive\" (not type stable)",
            "\"Intuitive\" (type stable)",
            "\"Fast\"",
            "DataFrame",
        ),
        xlabel = "N",
        ylabel = "Solution time (s)",
    )
end
timings()

The dataframe formulation is asymptotically faster. Once one understands that the bottleneck in this benchmark is equivalent to an inner join, it is not difficult to address it, given that general-purpose languages like Julia and Python have libraries specialized for this task. Pyomo and gurobipy would likely benefit from a similar optimization.

Other comments

There are a few points in the blog post that deserve some rebuttal.

First, the quantitative results in Figure 4 are misleading because they do not time equivalent implementations across the modeling systems. For example, as outlined above, the “Fast JuMP” implementation receives a vector of tuples as input for JKL and KLM, but the Pyomo implementation receives a dictionary mapping the first two indices to a vector of the third index. Since the join is the bottleneck, this difference has a material impact on the total solve time.
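To make the difference concrete, here is a hypothetical sketch of the two input formats (the names are illustrative, not the benchmark’s actual identifiers):

# The "Fast JuMP" implementation receives a flat list of tuples, which must be
# scanned in full for every (j, k) pair during the join.
jkl_as_list = [("j1", "k1", "l1"), ("j1", "k1", "l2"), ("j2", "k3", "l1")]
# The Pyomo implementation receives the join key already grouped, so the
# matching l values for a given (j, k) are a single dictionary lookup.
jkl_as_dict = Dict(("j1", "k1") => ["l1", "l2"], ("j2", "k3") => ["l1"])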

The GAMS post says: “One of the key differences between GAMS and the other modeling frameworks we’ve mentioned is that GAMS is a domain-specific language”.

JuMP is, in fact, a domain-specific language, one that is embedded in a programming language. Indeed, a key feature of JuMP is that the code users write inside the modeling macros (the identifiers beginning with @) is not what gets executed. Instead, JuMP parses the syntax that the user writes and compiles it into a different form that is more efficient.
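As a small sketch (not from either blog post), you can inspect this rewriting yourself with Julia’s @macroexpand; the expanded code builds the constraint through lower-level function calls rather than evaluating 2x + y <= 1 as ordinary Julia arithmetic:

using JuMP
model = Model()
@variable(model, x)
@variable(model, y)
# Print the code that the @constraint macro generates: a series of lower-level
# JuMP and MutableArithmetics calls, not the literal expression the user wrote.
@macroexpand @constraint(model, 2x + y <= 1)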

The post goes on: “While it’s true that general-purpose programming languages offer more flexibility and control, it’s important to consider the trade-offs. With general-purpose languages like Python and Julia, a straightforward implementation closely aligned with the mathematical formulation is often self-evident and easier to implement, read, and maintain, but suffers from inadequate performance.”

This point is perhaps a matter of taste and personal experience. In our opinion, the GAMS syntax of

ei(i).. sum((IJK(i,j,k),JKL(j,k,l),KLM(k,l,m)), x(i,j,k,l,m)) =g= 0;

is not more readable or easier to maintain than the performant dataframe_formulation implementation. In this case, viewing the problem as an optimization over the three sets IJK, JKL, and KLM is more cumbersome than viewing it as a single joined IJKLM table with one variable per row and one grouped constraint per I index.

The post continues: “Flexibility is also a double-edged sword. While it offers many different ways to accomplish a task, there is also the risk of implementing a solution that is not efficient. And determining the optimal approach is a challenging task in itself. All of the discussed modeling frameworks allow a more or less and depending on personal taste intuitive implementation of our example’s model. However, intuitive solutions do not always turn out to be efficient. With additional research and effort, it is possible to find alternative implementations that outperform the intuitive approach, as Figure 2 presents for JuMP.”

This is a fair point. It’s obviously true that the added flexibility of a full programming language increases the risk of implementing a solution that is not efficient. But this is true of any computational problem. Indeed, the bottleneck in this example is an inner join between tables, which would also arise if the user were exploring summary statistics or implementing a heuristic to solve this problem.

A feature of Julia is that you can smoothly transition from the unoptimized code to the much more efficient code while staying in the same language.

Conclusion

Comparing the trade-offs of different modeling systems is a useful endeavor. We (the JuMP developers) learnt a lot by investigating the IJKLM model, but our biggest lesson is that such comparisons are difficult to engineer in a fair way without open review and feedback from all sides prior to publication. If anyone reading this is working on similar comparisons, the best place to reach out is by starting a thread on our community forum.