The network multi-commodity flow problem
This tutorial was generated using Literate.jl. Download the source as a .jl
file.
This tutorial is a variation of The multi-commodity flow problem where the graph is a network instead of a bipartite graph.
The purpose of this tutorial is to demonstrate a style of modeling that uses relational algebra.
Required packages
This tutorial uses the following packages:
using JuMP
import DataFrames
import HiGHS
import SQLite
import SQLite.DBInterface
import Test
Formulation
The network multi-commondity flow problem is an extension of the The multi-commodity flow problem, where instead of having a bipartite graph of supply and demand nodes, the graph can contains a set of nodes, $i \in \mathcal{N}$ , which each have a (potentially zero) supply capacity, $u^s_{i,p}$, and (potentially zero) a demand, $d_{i,p}$ for each commodity $p \in P$. The nodes are connected by a set of edges $(i, j) \in \mathcal{E}$, which have a shipment cost $c^x_{i,j,p}$ and a total flow capacity of $u^x_{i,j}$.
Our take is to choose an optimal supply for each node $s_{i,p}$, as well as the optimal transshipment $x_{i,j,p}$ that minimizes the total cost.
The mathematical formulation is:
\[\begin{aligned} \min \;\; & \sum_{(i,j)\in\mathcal{E}, p \in P} c^x_{i,j,p} x_{i,j,p} + \sum_{i\in\mathcal{N}, p \in P} c^s_{i,p} s_{i,p} \\ s.t. \;\; & s_{i,p} + \sum_{(j, i) \in \mathcal{E}} x_{j,i,p} - \sum_{(i,j) \in \mathcal{E}} x_{i,j,p} = d_{i,p} & \forall i \in \mathcal{N}, p \in P \\ & x_{i,j,p} \ge 0 & \forall (i, j) \in \mathcal{E}, p \in P \\ & \sum_{p \in P} x_{i,j,p} \le u^x_{i,j} & \forall (i, j) \in \mathcal{E} \\ & 0 \le s_{i,p} \le u^s_{i,p} & \forall i \in \mathcal{N}, p \in P. \end{aligned}\]
The purpose of this tutorial is to demonstrate how this model can be built using relational algebra instead of a direct math-to-code translation of the summations.
Data
For the purpose of this tutorial, the JuMP repository contains an example database called commodity_nz.db
:
filename = joinpath(@__DIR__, "commodity_nz.db");
To run locally, download commodity_nz.db
and update filename
appropriately.
Load the database using SQLite.DB
:
db = SQLite.DB(filename)
SQLite.DB("/home/runner/work/JuMP.jl/JuMP.jl/docs/build/tutorials/linear/commodity_nz.db")
A quick way to see the schema of the database is via SQLite.tables
:
SQLite.tables(db)
4-element Vector{SQLite.DBTable}:
SQLite.DBTable("products", Tables.Schema:
:product Union{Missing, String}
:cost_per_km Union{Missing, Float64})
SQLite.DBTable("shipping", Tables.Schema:
:origin Union{Missing, String}
:destination Union{Missing, String}
:product Union{Missing, String}
:distance_km Union{Missing, Float64})
SQLite.DBTable("supply", Tables.Schema:
:origin Union{Missing, String}
:product Union{Missing, String}
:capacity Union{Missing, Float64}
:cost Union{Missing, Float64})
SQLite.DBTable("demand", Tables.Schema:
:destination Union{Missing, String}
:product Union{Missing, String}
:demand Union{Missing, Float64})
We interact with the database by executing queries and then loading the results into a DataFrame:
function get_table(db, table)
query = DBInterface.execute(db, "SELECT * FROM $table")
return DataFrames.DataFrame(query)
end
get_table (generic function with 1 method)
The shipping
table contains the set of arcs and their distances:
df_shipping = get_table(db, "shipping")
Row | origin | destination | product | distance_km |
---|---|---|---|---|
String | String | String | Float64 | |
1 | auckland | waikato | milk | 112.0 |
2 | auckland | tauranga | milk | 225.0 |
3 | auckland | christchurch | milk | 1070.0 |
4 | waikato | auckland | milk | 112.0 |
5 | waikato | tauranga | milk | 107.0 |
6 | waikato | wellington | milk | 392.0 |
7 | tauranga | auckland | milk | 225.0 |
8 | tauranga | waikato | milk | 107.0 |
9 | christchurch | auckland | milk | 1070.0 |
10 | auckland | waikato | kiwifruit | 112.0 |
11 | auckland | christchurch | kiwifruit | 1070.0 |
12 | waikato | auckland | kiwifruit | 112.0 |
13 | waikato | wellington | kiwifruit | 392.0 |
14 | tauranga | auckland | kiwifruit | 225.0 |
15 | tauranga | waikato | kiwifruit | 107.0 |
16 | christchurch | auckland | kiwifruit | 1070.0 |
The products
table contains the shipping cost per kilometer of each product:
df_products = get_table(db, "products")
Row | product | cost_per_km |
---|---|---|
String | Float64 | |
1 | milk | 0.001 |
2 | kiwifruit | 0.01 |
The supply
table contains the supply capacity of each node, as well as the cost:
df_supply = get_table(db, "supply")
Row | origin | product | capacity | cost |
---|---|---|---|---|
String | String | Float64 | Float64 | |
1 | waikato | milk | 10.0 | 0.5 |
2 | tauranga | milk | 6.0 | 1.0 |
3 | tauranga | kiwifruit | 26.0 | 1.0 |
4 | christchurch | milk | 10.0 | 0.6 |
The demand
table contains the demand of each node:
df_demand = get_table(db, "demand")
Row | destination | product | demand |
---|---|---|---|
String | String | Float64 | |
1 | auckland | milk | 16.0 |
2 | auckland | kiwifruit | 16.0 |
3 | tauranga | milk | 2.0 |
4 | tauranga | kiwifruit | 2.0 |
5 | wellington | milk | 2.0 |
6 | wellington | kiwifruit | 2.0 |
7 | christchurch | milk | 4.0 |
8 | christchurch | kiwifruit | 4.0 |
JuMP formulation
We start by creating a model and our decision variables:
model = Model(HiGHS.Optimizer)
set_silent(model)
For the shipping decisions, we create a new column in df_shipping
called x_flow
, which has one non-negative decision variable for each row:
df_shipping.x_flow = @variable(model, x[1:size(df_shipping, 1)] >= 0)
df_shipping
Row | origin | destination | product | distance_km | x_flow |
---|---|---|---|---|---|
String | String | String | Float64 | GenericV… | |
1 | auckland | waikato | milk | 112.0 | x[1] |
2 | auckland | tauranga | milk | 225.0 | x[2] |
3 | auckland | christchurch | milk | 1070.0 | x[3] |
4 | waikato | auckland | milk | 112.0 | x[4] |
5 | waikato | tauranga | milk | 107.0 | x[5] |
6 | waikato | wellington | milk | 392.0 | x[6] |
7 | tauranga | auckland | milk | 225.0 | x[7] |
8 | tauranga | waikato | milk | 107.0 | x[8] |
9 | christchurch | auckland | milk | 1070.0 | x[9] |
10 | auckland | waikato | kiwifruit | 112.0 | x[10] |
11 | auckland | christchurch | kiwifruit | 1070.0 | x[11] |
12 | waikato | auckland | kiwifruit | 112.0 | x[12] |
13 | waikato | wellington | kiwifruit | 392.0 | x[13] |
14 | tauranga | auckland | kiwifruit | 225.0 | x[14] |
15 | tauranga | waikato | kiwifruit | 107.0 | x[15] |
16 | christchurch | auckland | kiwifruit | 1070.0 | x[16] |
For the supply, we add a variable to each row, and then set the upper bound to the capacity of each supply node:
df_supply.x_supply = @variable(model, s[1:size(df_supply, 1)] >= 0)
set_upper_bound.(df_supply.x_supply, df_supply.capacity)
df_supply
Row | origin | product | capacity | cost | x_supply |
---|---|---|---|---|---|
String | String | Float64 | Float64 | GenericV… | |
1 | waikato | milk | 10.0 | 0.5 | s[1] |
2 | tauranga | milk | 6.0 | 1.0 | s[2] |
3 | tauranga | kiwifruit | 26.0 | 1.0 | s[3] |
4 | christchurch | milk | 10.0 | 0.6 | s[4] |
Our objective is to minimize the shipping cost plus the supply cost. To compute the flow cost, we need to join the shipping table, which contains distance_km
with the products table, which contains cost_per_km
:
df_cost = DataFrames.leftjoin(df_shipping, df_products; on = [:product])
df_cost.flow_cost = df_cost.cost_per_km .* df_cost.distance_km
df_cost
Row | origin | destination | product | distance_km | x_flow | cost_per_km | flow_cost |
---|---|---|---|---|---|---|---|
String | String | String | Float64 | GenericV… | Float64? | Float64 | |
1 | auckland | waikato | milk | 112.0 | x[1] | 0.001 | 0.112 |
2 | auckland | tauranga | milk | 225.0 | x[2] | 0.001 | 0.225 |
3 | auckland | christchurch | milk | 1070.0 | x[3] | 0.001 | 1.07 |
4 | waikato | auckland | milk | 112.0 | x[4] | 0.001 | 0.112 |
5 | waikato | tauranga | milk | 107.0 | x[5] | 0.001 | 0.107 |
6 | waikato | wellington | milk | 392.0 | x[6] | 0.001 | 0.392 |
7 | tauranga | auckland | milk | 225.0 | x[7] | 0.001 | 0.225 |
8 | tauranga | waikato | milk | 107.0 | x[8] | 0.001 | 0.107 |
9 | christchurch | auckland | milk | 1070.0 | x[9] | 0.001 | 1.07 |
10 | auckland | waikato | kiwifruit | 112.0 | x[10] | 0.01 | 1.12 |
11 | auckland | christchurch | kiwifruit | 1070.0 | x[11] | 0.01 | 10.7 |
12 | waikato | auckland | kiwifruit | 112.0 | x[12] | 0.01 | 1.12 |
13 | waikato | wellington | kiwifruit | 392.0 | x[13] | 0.01 | 3.92 |
14 | tauranga | auckland | kiwifruit | 225.0 | x[14] | 0.01 | 2.25 |
15 | tauranga | waikato | kiwifruit | 107.0 | x[15] | 0.01 | 1.07 |
16 | christchurch | auckland | kiwifruit | 1070.0 | x[16] | 0.01 | 10.7 |
Then we can use linear algebra to compute the inner product between two columns:
@objective(
model,
Min,
df_cost.flow_cost' * df_shipping.x_flow +
df_supply.cost' * df_supply.x_supply
);
For the flow capacities on each arc, we use DataFrames.groupby
to partition the flow variables based on :origin
and :destination
, and then we constrain their sum to be less than a fixed capacity.
capacity = 30
for df in DataFrames.groupby(df_shipping, [:origin, :destination])
@constraint(model, sum(df.x_flow) <= capacity)
end
For each node in the graph, we need to compute a mass balance constraint which says that for each product, the supply, plus the flow into the node, and less the flow out of the node is equal to the demand.
We can compute an expression for the flow out of each node using DataFrames.groupby
on the origin
and product
columns of the df_shipping
table:
df_flow_out = DataFrames.DataFrame(
(node = i.origin, product = i.product, x_flow_out = sum(df.x_flow)) for
(i, df) in pairs(DataFrames.groupby(df_shipping, [:origin, :product]))
)
Row | node | product | x_flow_out |
---|---|---|---|
String | String | AffExpr | |
1 | auckland | milk | x[1] + x[2] + x[3] |
2 | waikato | milk | x[4] + x[5] + x[6] |
3 | tauranga | milk | x[7] + x[8] |
4 | christchurch | milk | x[9] |
5 | auckland | kiwifruit | x[10] + x[11] |
6 | waikato | kiwifruit | x[12] + x[13] |
7 | tauranga | kiwifruit | x[14] + x[15] |
8 | christchurch | kiwifruit | x[16] |
We can compute an expression for the flow into each node using DataFrames.groupby
on the destination
and product
columns of the df_shipping
table:
df_flow_in = DataFrames.DataFrame(
(node = i.destination, product = i.product, x_flow_in = sum(df.x_flow))
for (i, df) in
pairs(DataFrames.groupby(df_shipping, [:destination, :product]))
)
Row | node | product | x_flow_in |
---|---|---|---|
String | String | AffExpr | |
1 | waikato | milk | x[1] + x[8] |
2 | tauranga | milk | x[2] + x[5] |
3 | christchurch | milk | x[3] |
4 | auckland | milk | x[4] + x[7] + x[9] |
5 | wellington | milk | x[6] |
6 | waikato | kiwifruit | x[10] + x[15] |
7 | christchurch | kiwifruit | x[11] |
8 | auckland | kiwifruit | x[12] + x[14] + x[16] |
9 | wellington | kiwifruit | x[13] |
We can join the two tables together using DataFrames.outerjoin
. We need to use outerjoin
here because there might be missing rows.
df = DataFrames.outerjoin(df_flow_in, df_flow_out; on = [:node, :product])
Row | node | product | x_flow_in | x_flow_out |
---|---|---|---|---|
String | String | AffExpr? | AffExpr? | |
1 | waikato | milk | x[1] + x[8] | x[4] + x[5] + x[6] |
2 | tauranga | milk | x[2] + x[5] | x[7] + x[8] |
3 | christchurch | milk | x[3] | x[9] |
4 | auckland | milk | x[4] + x[7] + x[9] | x[1] + x[2] + x[3] |
5 | waikato | kiwifruit | x[10] + x[15] | x[12] + x[13] |
6 | christchurch | kiwifruit | x[11] | x[16] |
7 | auckland | kiwifruit | x[12] + x[14] + x[16] | x[10] + x[11] |
8 | wellington | milk | x[6] | missing |
9 | wellington | kiwifruit | x[13] | missing |
10 | tauranga | kiwifruit | missing | x[14] + x[15] |
Next, we need to join the supply column:
df = DataFrames.leftjoin(
df,
DataFrames.select(df_supply, [:origin, :product, :x_supply]);
on = [:node => :origin, :product],
)
Row | node | product | x_flow_in | x_flow_out | x_supply |
---|---|---|---|---|---|
String | String | AffExpr? | AffExpr? | GenericV…? | |
1 | waikato | milk | x[1] + x[8] | x[4] + x[5] + x[6] | s[1] |
2 | tauranga | milk | x[2] + x[5] | x[7] + x[8] | s[2] |
3 | christchurch | milk | x[3] | x[9] | s[4] |
4 | tauranga | kiwifruit | missing | x[14] + x[15] | s[3] |
5 | auckland | milk | x[4] + x[7] + x[9] | x[1] + x[2] + x[3] | missing |
6 | waikato | kiwifruit | x[10] + x[15] | x[12] + x[13] | missing |
7 | christchurch | kiwifruit | x[11] | x[16] | missing |
8 | auckland | kiwifruit | x[12] + x[14] + x[16] | x[10] + x[11] | missing |
9 | wellington | milk | x[6] | missing | missing |
10 | wellington | kiwifruit | x[13] | missing | missing |
and then the demand column
df = DataFrames.leftjoin(
df,
DataFrames.select(df_demand, [:destination, :product, :demand]);
on = [:node => :destination, :product],
)
Row | node | product | x_flow_in | x_flow_out | x_supply | demand |
---|---|---|---|---|---|---|
String | String | AffExpr? | AffExpr? | GenericV…? | Float64? | |
1 | tauranga | milk | x[2] + x[5] | x[7] + x[8] | s[2] | 2.0 |
2 | christchurch | milk | x[3] | x[9] | s[4] | 4.0 |
3 | tauranga | kiwifruit | missing | x[14] + x[15] | s[3] | 2.0 |
4 | auckland | milk | x[4] + x[7] + x[9] | x[1] + x[2] + x[3] | missing | 16.0 |
5 | christchurch | kiwifruit | x[11] | x[16] | missing | 4.0 |
6 | auckland | kiwifruit | x[12] + x[14] + x[16] | x[10] + x[11] | missing | 16.0 |
7 | wellington | milk | x[6] | missing | missing | 2.0 |
8 | wellington | kiwifruit | x[13] | missing | missing | 2.0 |
9 | waikato | milk | x[1] + x[8] | x[4] + x[5] + x[6] | s[1] | missing |
10 | waikato | kiwifruit | x[10] + x[15] | x[12] + x[13] | missing | missing |
Now we're ready to add our mass balance constraint. Because some rows contain missing
values, we need to use coalesce
to convert any missing
into a numeric value:
@constraint(
model,
[r in eachrow(df)],
coalesce(r.x_supply, 0.0) + coalesce(r.x_flow_in, 0.0) -
coalesce(r.x_flow_out, 0.0) == coalesce(r.demand, 0.0),
);
Solution
Finally, we can optimize the model:
optimize!(model)
Test.@test is_solved_and_feasible(model)
solution_summary(model)
* Solver : HiGHS
* Status
Result count : 1
Termination status : OPTIMAL
Message from the solver:
"kHighsModelStatusOptimal"
* Candidate solution (result #1)
Primal status : FEASIBLE_POINT
Dual status : FEASIBLE_POINT
Objective value : 1.43228e+02
Objective bound : 1.43228e+02
Relative gap : 3.96874e-16
Dual objective value : 1.43228e+02
* Work counters
Solve time (sec) : 3.08990e-04
Simplex iterations : 8
Barrier iterations : 0
Node count : -1
update the solution in the DataFrames:
df_shipping.x_flow = value.(df_shipping.x_flow)
df_supply.x_supply = value.(df_supply.x_supply);
and display the optimal solution for flows:
DataFrames.select(
filter!(row -> row.x_flow > 0.0, df_shipping),
[:origin, :destination, :product, :x_flow],
)
Row | origin | destination | product | x_flow |
---|---|---|---|---|
String | String | String | Float64 | |
1 | waikato | auckland | milk | 10.0 |
2 | waikato | wellington | milk | 2.0 |
3 | tauranga | auckland | milk | 2.0 |
4 | tauranga | waikato | milk | 2.0 |
5 | christchurch | auckland | milk | 4.0 |
6 | auckland | christchurch | kiwifruit | 4.0 |
7 | waikato | auckland | kiwifruit | 20.0 |
8 | waikato | wellington | kiwifruit | 2.0 |
9 | tauranga | waikato | kiwifruit | 22.0 |