The network multi-commodity flow problem

This tutorial was generated using Literate.jl. Download the source as a .jl file.

This tutorial is a variation of The multi-commodity flow problem where the graph is a network instead of a bipartite graph.

The purpose of this tutorial is to demonstrate a style of modeling that uses relational algebra.

Required packages

This tutorial uses the following packages:

using JuMP
import DataFrames
import HiGHS
import SQLite
import SQLite.DBInterface
import Test

Formulation

The network multi-commondity flow problem is an extension of the The multi-commodity flow problem, where instead of having a bipartite graph of supply and demand nodes, the graph can contains a set of nodes, $i \in \mathcal{N}$ , which each have a (potentially zero) supply capacity, $u^s_{i,p}$, and (potentially zero) a demand, $d_{i,p}$ for each commodity $p \in P$. The nodes are connected by a set of edges $(i, j) \in \mathcal{E}$, which have a shipment cost $c^x_{i,j,p}$ and a total flow capacity of $u^x_{i,j}$.

Our take is to choose an optimal supply for each node $s_{i,p}$, as well as the optimal transshipment $x_{i,j,p}$ that minimizes the total cost.

The mathematical formulation is:

\[\begin{aligned} \min \;\; & \sum_{(i,j)\in\mathcal{E}, p \in P} c^x_{i,j,p} x_{i,j,p} + \sum_{i\in\mathcal{N}, p \in P} c^s_{i,p} s_{i,p} \\ s.t. \;\; & s_{i,p} + \sum_{(j, i) \in \mathcal{E}} x_{j,i,p} - \sum_{(i,j) \in \mathcal{E}} x_{i,j,p} = d_{i,p} & \forall i \in \mathcal{N}, p \in P \\ & x_{i,j,p} \ge 0 & \forall (i, j) \in \mathcal{E}, p \in P \\ & \sum_{p \in P} x_{i,j,p} \le u^x_{i,j} & \forall (i, j) \in \mathcal{E} \\ & 0 \le s_{i,p} \le u^s_{i,p} & \forall i \in \mathcal{N}, p \in P. \end{aligned}\]

The purpose of this tutorial is to demonstrate how this model can be built using relational algebra instead of a direct math-to-code translation of the summations.

Data

For the purpose of this tutorial, the JuMP repository contains an example database called commodity_nz.db:

filename = joinpath(@__DIR__, "commodity_nz.db");

To run locally, download commodity_nz.db and update filename appropriately.

Load the database using SQLite.DB:

db = SQLite.DB(filename)
SQLite.DB("/home/runner/work/JuMP.jl/JuMP.jl/docs/build/tutorials/linear/commodity_nz.db")

A quick way to see the schema of the database is via SQLite.tables:

SQLite.tables(db)
4-element Vector{SQLite.DBTable}:
 SQLite.DBTable("products", Tables.Schema:
 :product      Union{Missing, String}
 :cost_per_km  Union{Missing, Float64})
 SQLite.DBTable("shipping", Tables.Schema:
 :origin       Union{Missing, String}
 :destination  Union{Missing, String}
 :product      Union{Missing, String}
 :distance_km  Union{Missing, Float64})
 SQLite.DBTable("supply", Tables.Schema:
 :origin    Union{Missing, String}
 :product   Union{Missing, String}
 :capacity  Union{Missing, Float64}
 :cost      Union{Missing, Float64})
 SQLite.DBTable("demand", Tables.Schema:
 :destination  Union{Missing, String}
 :product      Union{Missing, String}
 :demand       Union{Missing, Float64})

We interact with the database by executing queries and then loading the results into a DataFrame:

function get_table(db, table)
    query = DBInterface.execute(db, "SELECT * FROM $table")
    return DataFrames.DataFrame(query)
end
get_table (generic function with 1 method)

The shipping table contains the set of arcs and their distances:

df_shipping = get_table(db, "shipping")
16×4 DataFrame
Roworigindestinationproductdistance_km
StringStringStringFloat64
1aucklandwaikatomilk112.0
2aucklandtaurangamilk225.0
3aucklandchristchurchmilk1070.0
4waikatoaucklandmilk112.0
5waikatotaurangamilk107.0
6waikatowellingtonmilk392.0
7taurangaaucklandmilk225.0
8taurangawaikatomilk107.0
9christchurchaucklandmilk1070.0
10aucklandwaikatokiwifruit112.0
11aucklandchristchurchkiwifruit1070.0
12waikatoaucklandkiwifruit112.0
13waikatowellingtonkiwifruit392.0
14taurangaaucklandkiwifruit225.0
15taurangawaikatokiwifruit107.0
16christchurchaucklandkiwifruit1070.0

The products table contains the shipping cost per kilometer of each product:

df_products = get_table(db, "products")
2×2 DataFrame
Rowproductcost_per_km
StringFloat64
1milk0.001
2kiwifruit0.01

The supply table contains the supply capacity of each node, as well as the cost:

df_supply = get_table(db, "supply")
4×4 DataFrame
Roworiginproductcapacitycost
StringStringFloat64Float64
1waikatomilk10.00.5
2taurangamilk6.01.0
3taurangakiwifruit26.01.0
4christchurchmilk10.00.6

The demand table contains the demand of each node:

df_demand = get_table(db, "demand")
8×3 DataFrame
Rowdestinationproductdemand
StringStringFloat64
1aucklandmilk16.0
2aucklandkiwifruit16.0
3taurangamilk2.0
4taurangakiwifruit2.0
5wellingtonmilk2.0
6wellingtonkiwifruit2.0
7christchurchmilk4.0
8christchurchkiwifruit4.0

JuMP formulation

We start by creating a model and our decision variables:

model = Model(HiGHS.Optimizer)
set_silent(model)

For the shipping decisions, we create a new column in df_shipping called x_flow, which has one non-negative decision variable for each row:

df_shipping.x_flow = @variable(model, x[1:size(df_shipping, 1)] >= 0)
df_shipping
16×5 DataFrame
Roworigindestinationproductdistance_kmx_flow
StringStringStringFloat64GenericV…
1aucklandwaikatomilk112.0x[1]
2aucklandtaurangamilk225.0x[2]
3aucklandchristchurchmilk1070.0x[3]
4waikatoaucklandmilk112.0x[4]
5waikatotaurangamilk107.0x[5]
6waikatowellingtonmilk392.0x[6]
7taurangaaucklandmilk225.0x[7]
8taurangawaikatomilk107.0x[8]
9christchurchaucklandmilk1070.0x[9]
10aucklandwaikatokiwifruit112.0x[10]
11aucklandchristchurchkiwifruit1070.0x[11]
12waikatoaucklandkiwifruit112.0x[12]
13waikatowellingtonkiwifruit392.0x[13]
14taurangaaucklandkiwifruit225.0x[14]
15taurangawaikatokiwifruit107.0x[15]
16christchurchaucklandkiwifruit1070.0x[16]

For the supply, we add a variable to each row, and then set the upper bound to the capacity of each supply node:

df_supply.x_supply = @variable(model, s[1:size(df_supply, 1)] >= 0)
set_upper_bound.(df_supply.x_supply, df_supply.capacity)
df_supply
4×5 DataFrame
Roworiginproductcapacitycostx_supply
StringStringFloat64Float64GenericV…
1waikatomilk10.00.5s[1]
2taurangamilk6.01.0s[2]
3taurangakiwifruit26.01.0s[3]
4christchurchmilk10.00.6s[4]

Our objective is to minimize the shipping cost plus the supply cost. To compute the flow cost, we need to join the shipping table, which contains distance_km with the products table, which contains cost_per_km:

df_cost = DataFrames.leftjoin(df_shipping, df_products; on = [:product])
df_cost.flow_cost = df_cost.cost_per_km .* df_cost.distance_km
df_cost
16×7 DataFrame
Roworigindestinationproductdistance_kmx_flowcost_per_kmflow_cost
StringStringStringFloat64GenericV…Float64?Float64
1aucklandwaikatomilk112.0x[1]0.0010.112
2aucklandtaurangamilk225.0x[2]0.0010.225
3aucklandchristchurchmilk1070.0x[3]0.0011.07
4waikatoaucklandmilk112.0x[4]0.0010.112
5waikatotaurangamilk107.0x[5]0.0010.107
6waikatowellingtonmilk392.0x[6]0.0010.392
7taurangaaucklandmilk225.0x[7]0.0010.225
8taurangawaikatomilk107.0x[8]0.0010.107
9christchurchaucklandmilk1070.0x[9]0.0011.07
10aucklandwaikatokiwifruit112.0x[10]0.011.12
11aucklandchristchurchkiwifruit1070.0x[11]0.0110.7
12waikatoaucklandkiwifruit112.0x[12]0.011.12
13waikatowellingtonkiwifruit392.0x[13]0.013.92
14taurangaaucklandkiwifruit225.0x[14]0.012.25
15taurangawaikatokiwifruit107.0x[15]0.011.07
16christchurchaucklandkiwifruit1070.0x[16]0.0110.7

Then we can use linear algebra to compute the inner product between two columns:

@objective(
    model,
    Min,
    df_cost.flow_cost' * df_shipping.x_flow +
    df_supply.cost' * df_supply.x_supply
);

For the flow capacities on each arc, we use DataFrames.groupby to partition the flow variables based on :origin and :destination, and then we constrain their sum to be less than a fixed capacity.

capacity = 30
for df in DataFrames.groupby(df_shipping, [:origin, :destination])
    @constraint(model, sum(df.x_flow) <= capacity)
end

For each node in the graph, we need to compute a mass balance constraint which says that for each product, the supply, plus the flow into the node, and less the flow out of the node is equal to the demand.

We can compute an expression for the flow out of each node using DataFrames.groupby on the origin and product columns of the df_shipping table:

df_flow_out = DataFrames.DataFrame(
    (node = i.origin, product = i.product, x_flow_out = sum(df.x_flow)) for
    (i, df) in pairs(DataFrames.groupby(df_shipping, [:origin, :product]))
)
8×3 DataFrame
Rownodeproductx_flow_out
StringStringAffExpr
1aucklandmilkx[1] + x[2] + x[3]
2waikatomilkx[4] + x[5] + x[6]
3taurangamilkx[7] + x[8]
4christchurchmilkx[9]
5aucklandkiwifruitx[10] + x[11]
6waikatokiwifruitx[12] + x[13]
7taurangakiwifruitx[14] + x[15]
8christchurchkiwifruitx[16]

We can compute an expression for the flow into each node using DataFrames.groupby on the destination and product columns of the df_shipping table:

df_flow_in = DataFrames.DataFrame(
    (node = i.destination, product = i.product, x_flow_in = sum(df.x_flow))
    for (i, df) in
    pairs(DataFrames.groupby(df_shipping, [:destination, :product]))
)
9×3 DataFrame
Rownodeproductx_flow_in
StringStringAffExpr
1waikatomilkx[1] + x[8]
2taurangamilkx[2] + x[5]
3christchurchmilkx[3]
4aucklandmilkx[4] + x[7] + x[9]
5wellingtonmilkx[6]
6waikatokiwifruitx[10] + x[15]
7christchurchkiwifruitx[11]
8aucklandkiwifruitx[12] + x[14] + x[16]
9wellingtonkiwifruitx[13]

We can join the two tables together using DataFrames.outerjoin. We need to use outerjoin here because there might be missing rows.

df = DataFrames.outerjoin(df_flow_in, df_flow_out; on = [:node, :product])
10×4 DataFrame
Rownodeproductx_flow_inx_flow_out
StringStringAffExpr?AffExpr?
1waikatomilkx[1] + x[8]x[4] + x[5] + x[6]
2taurangamilkx[2] + x[5]x[7] + x[8]
3christchurchmilkx[3]x[9]
4aucklandmilkx[4] + x[7] + x[9]x[1] + x[2] + x[3]
5waikatokiwifruitx[10] + x[15]x[12] + x[13]
6christchurchkiwifruitx[11]x[16]
7aucklandkiwifruitx[12] + x[14] + x[16]x[10] + x[11]
8wellingtonmilkx[6]missing
9wellingtonkiwifruitx[13]missing
10taurangakiwifruitmissingx[14] + x[15]

Next, we need to join the supply column:

df = DataFrames.leftjoin(
    df,
    DataFrames.select(df_supply, [:origin, :product, :x_supply]);
    on = [:node => :origin, :product],
)
10×5 DataFrame
Rownodeproductx_flow_inx_flow_outx_supply
StringStringAffExpr?AffExpr?GenericV…?
1waikatomilkx[1] + x[8]x[4] + x[5] + x[6]s[1]
2taurangamilkx[2] + x[5]x[7] + x[8]s[2]
3christchurchmilkx[3]x[9]s[4]
4taurangakiwifruitmissingx[14] + x[15]s[3]
5aucklandmilkx[4] + x[7] + x[9]x[1] + x[2] + x[3]missing
6waikatokiwifruitx[10] + x[15]x[12] + x[13]missing
7christchurchkiwifruitx[11]x[16]missing
8aucklandkiwifruitx[12] + x[14] + x[16]x[10] + x[11]missing
9wellingtonmilkx[6]missingmissing
10wellingtonkiwifruitx[13]missingmissing

and then the demand column

df = DataFrames.leftjoin(
    df,
    DataFrames.select(df_demand, [:destination, :product, :demand]);
    on = [:node => :destination, :product],
)
10×6 DataFrame
Rownodeproductx_flow_inx_flow_outx_supplydemand
StringStringAffExpr?AffExpr?GenericV…?Float64?
1taurangamilkx[2] + x[5]x[7] + x[8]s[2]2.0
2christchurchmilkx[3]x[9]s[4]4.0
3taurangakiwifruitmissingx[14] + x[15]s[3]2.0
4aucklandmilkx[4] + x[7] + x[9]x[1] + x[2] + x[3]missing16.0
5christchurchkiwifruitx[11]x[16]missing4.0
6aucklandkiwifruitx[12] + x[14] + x[16]x[10] + x[11]missing16.0
7wellingtonmilkx[6]missingmissing2.0
8wellingtonkiwifruitx[13]missingmissing2.0
9waikatomilkx[1] + x[8]x[4] + x[5] + x[6]s[1]missing
10waikatokiwifruitx[10] + x[15]x[12] + x[13]missingmissing

Now we're ready to add our mass balance constraint. Because some rows contain missing values, we need to use coalesce to convert any missing into a numeric value:

@constraint(
    model,
    [r in eachrow(df)],
    coalesce(r.x_supply, 0.0) + coalesce(r.x_flow_in, 0.0) -
    coalesce(r.x_flow_out, 0.0) == coalesce(r.demand, 0.0),
);

Solution

Finally, we can optimize the model:

optimize!(model)
Test.@test is_solved_and_feasible(model)
solution_summary(model)
* Solver : HiGHS

* Status
  Result count       : 1
  Termination status : OPTIMAL
  Message from the solver:
  "kHighsModelStatusOptimal"

* Candidate solution (result #1)
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Objective value    : 1.43228e+02
  Objective bound    : 1.43228e+02
  Relative gap       : 3.96874e-16
  Dual objective value : 1.43228e+02

* Work counters
  Solve time (sec)   : 3.10421e-04
  Simplex iterations : 8
  Barrier iterations : 0
  Node count         : -1

update the solution in the DataFrames:

df_shipping.x_flow = value.(df_shipping.x_flow)
df_supply.x_supply = value.(df_supply.x_supply);

and display the optimal solution for flows:

DataFrames.select(
    filter!(row -> row.x_flow > 0.0, df_shipping),
    [:origin, :destination, :product, :x_flow],
)
9×4 DataFrame
Roworigindestinationproductx_flow
StringStringStringFloat64
1waikatoaucklandmilk10.0
2waikatowellingtonmilk2.0
3taurangaaucklandmilk2.0
4taurangawaikatomilk2.0
5christchurchaucklandmilk4.0
6aucklandchristchurchkiwifruit4.0
7waikatoaucklandkiwifruit20.0
8waikatowellingtonkiwifruit2.0
9taurangawaikatokiwifruit22.0