Progressive Hedging
This tutorial was generated using Literate.jl. Download the source as a .jl file.
The purpose of this tutorial is to demonstrate the Progressive Hedging algorithm. It may be helpful to read Two-stage stochastic programs first.
Required packages
This tutorial requires the following packages:
using JuMP
import Distributions
import Ipopt
import Printf
Background
Progressive Hedging (PH) is a popular decomposition algorithm for stochastic programming. It decomposes a stochastic problem into scenario subproblems that are solved iteratively, with penalty terms driving solutions toward consensus.
In Progressive Hedging, each scenario subproblem includes a quadratic penalty term:
\[\min\limits_{x_s}: f_s(x_s) + \frac{\rho}{2} ||x_s - \bar{x}||^2 + w_s^\top x_s\]
where:
- $x_s$ is the primal variable in scenario $s$
- $f_s(x)$ is the original scenario objective
- $\rho$ is the penalty parameter
- $\bar{x}$ is the current consensus (average) solution
- $w_s$ is the dual price (Lagrangian multiplier) in scenario $s$
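To make the penalized objective concrete, here is a minimal one-dimensional sketch. The scenario objective `f_s` and the values of `ρ`, `x̄`, and `w` below are hypothetical stand-ins, not the ones from this tutorial's model; note that at the consensus point `x = x̄` the quadratic penalty term vanishes:

```julia
# Hypothetical one-dimensional scenario objective (not the tutorial's model)
f_s(x) = 2x
# Hypothetical values for the penalty parameter, consensus, and dual price
ρ, x̄, w = 1.0, 3.0, 0.5
# The penalized objective: f_s(x) + ρ/2 (x - x̄)^2 + w * x
penalized(x) = f_s(x) + ρ / 2 * (x - x̄)^2 + w * x
penalized(3.0)  # at x = x̄ the penalty term is zero: 6.0 + 0.0 + 1.5 = 7.5
```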
Progressive Hedging is an iterative algorithm. In each iteration, it solves all the penalized scenario subproblems, then it applies two updates:
- $\bar{x} = \mathbb{E}_s[x_s]$
- $w_s = w_s + \rho (x_s - \bar{x})$
The algorithm terminates when $|\bar{x} - x_s| \le \varepsilon$ for all scenarios (the primal residual) and $\bar{x}$ has stopped changing between iterations (the dual residual).
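The two updates and the residuals can be sketched in a few lines of plain Julia. This toy example assumes three equally likely scenarios with hypothetical solutions `x`; it is not part of the tutorial's model:

```julia
# Hypothetical scenario solutions x_s for three equally likely scenarios
x = [1.0, 2.0, 3.0]
w = zeros(3)            # dual prices w_s, initially zero
ρ = 1.0                 # penalty parameter
x̄_old = 0.0            # consensus from the previous iteration
# Consensus update: x̄ = E[x_s] (equal probabilities here)
x̄ = sum(x) / length(x)
# Dual update: w_s = w_s + ρ (x_s - x̄)
w .+= ρ .* (x .- x̄)
# Residuals used in the termination test
primal_residual = maximum(abs, x .- x̄)
dual_residual = ρ * abs(x̄ - x̄_old)
```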
$\rho$ can optionally be updated between iterations. How best to do so is an open question, and there is a large literature on different update strategies.
In this tutorial we use parameters for $\rho$, $w$, and $\bar{x}$ to efficiently modify each scenario's subproblem between PH iterations.
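As a reminder of how JuMP parameters work, here is a minimal standalone sketch (separate from the tutorial's model): a parameter is declared with `@variable(model, p in Parameter(value))` and cheaply updated in place with `set_parameter_value`, which avoids rebuilding the model between iterations:

```julia
using JuMP

# A throwaway model just to demonstrate the Parameter API
model = Model()
@variable(model, ρ in Parameter(1.0))  # declare ρ as a parameter with value 1.0
set_parameter_value(ρ, 2.5)            # update it in place between iterations
parameter_value(ρ)                     # query the current value: 2.5
```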
Building a single scenario
The building block of Progressive Hedging is a separate JuMP model for each scenario. Here's an example, using the problem from Two-stage stochastic programs:
function build_subproblem(; demand::Float64)
model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, x >= 0)
@variable(model, 0 <= y <= demand)
@constraint(model, y <= x)
@variable(model, ρ in Parameter(1))
@variable(model, x̄ in Parameter(0))
@variable(model, w in Parameter(0))
@expression(model, f_s, 2 * x - 5 * y + 0.1 * (x - y))
@objective(model, Min, f_s + ρ / 2 * (x - x̄)^2 + w * x)
return model
end
build_subproblem (generic function with 1 method)
Using the build_subproblem function, we can create one JuMP model for each scenario:
N = 10
demands = rand(Distributions.TriangularDist(150.0, 250.0, 200.0), N);
subproblems = map(demands) do demand
return (; model = build_subproblem(; demand), probability = 1 / N)
end;
The Progressive Hedging loop
We're almost ready for our optimization loop, but first, here's a helpful function for logging:
function print_iteration(iter, args...)
if mod(iter, 10) == 0
f(x) = Printf.@sprintf("%15.4e", x)
println(lpad(iter, 9), " ", join(f.(args), " "))
end
return
end
print_iteration (generic function with 1 method)
Now we can implement our algorithm:
function solve_progressive_hedging(
subproblems;
iteration_limit::Int = 400,
atol::Float64 = 1e-4,
ρ::Float64 = 1.0,
)
x̄_old, x̄ = 0.0, 0.0
x, w = zeros(length(subproblems)), zeros(length(subproblems))
println("iteration primal_residual dual_residual")
# For each iteration...
for iter in 1:iteration_limit
# For each subproblem...
for (i, data) in enumerate(subproblems)
# Update the parameters
set_parameter_value(data.model[:ρ], ρ)
set_parameter_value(data.model[:x̄], x̄)
set_parameter_value(data.model[:w], w[i])
# Solve the subproblem
optimize!(data.model)
assert_is_solved_and_feasible(data.model)
# Store the primal solution
x[i] = value(data.model[:x])
end
# Compute the consensus solution for the first-stage variables
x̄ = sum(s.probability * x_s for (s, x_s) in zip(subproblems, x))
# Compute the primal and dual residuals
primal_residual = maximum(abs, x_s - x̄ for x_s in x)
dual_residual = ρ * abs(x̄ - x̄_old)
print_iteration(iter, primal_residual, dual_residual)
# Check for convergence
if primal_residual < atol && dual_residual < atol
break
end
# Update
x̄_old = x̄
w .+= ρ .* (x .- x̄)
end
return x̄
end
x̄ = solve_progressive_hedging(subproblems);
iteration primal_residual dual_residual
10 8.1712e-14 3.0000e+00
20 1.7053e-13 3.0000e+00
30 3.6948e-13 3.0000e+00
40 1.6342e-12 3.0000e+00
50 1.2193e-11 3.0000e+00
60 2.4386e+00 1.9800e+00
70 2.7180e-01 9.6000e-01
80 3.8742e-01 4.0695e-01
90 5.4834e-10 6.0000e-02
100 2.0456e-08 6.0000e-02
110 3.7655e-02 3.5354e-02
120 1.7421e-02 2.1378e-02
130 7.3852e-03 1.2847e-02
140 2.6223e-03 7.6746e-03
150 5.1293e-04 4.5579e-03
160 3.1033e-04 2.6913e-03
170 5.4429e-04 1.5800e-03
180 5.3274e-04 9.2214e-04
190 4.3756e-04 5.3500e-04
200 3.2951e-04 3.0847e-04
210 2.3545e-04 1.7670e-04
220 1.6237e-04 1.0053e-04
The consensus first-stage decision is:
x̄
199.0645508643155
Progressive Hedging with an adaptive penalty parameter
You can also make the penalty parameter $\rho$ adaptive. How best to do so is an open question, and there is a large literature on different update strategies. One approach is to increase $\rho$ if the primal residual is much larger than the dual residual, and to decrease $\rho$ if the dual residual is much larger than the primal residual.
function solve_adaptive_progressive_hedging(
subproblems;
iteration_limit::Int = 400,
atol::Float64 = 1e-4,
ρ::Float64 = 1.0,
τ::Float64 = 1.3,
μ::Float64 = 15.0,
)
x̄_old, x̄ = 0.0, 0.0
x, w = zeros(length(subproblems)), zeros(length(subproblems))
println("iteration primal_residual dual_residual")
for iter in 1:iteration_limit
for (i, data) in enumerate(subproblems)
set_parameter_value(data.model[:ρ], ρ)
set_parameter_value(data.model[:x̄], x̄)
set_parameter_value(data.model[:w], w[i])
optimize!(data.model)
assert_is_solved_and_feasible(data.model)
x[i] = value(data.model[:x])
end
x̄ = sum(s.probability * x_s for (s, x_s) in zip(subproblems, x))
primal_residual = maximum(abs, x_s - x̄ for x_s in x)
dual_residual = ρ * abs(x̄ - x̄_old)
print_iteration(iter, primal_residual, dual_residual)
if primal_residual < atol && dual_residual < atol
break
end
w .+= ρ .* (x .- x̄)
x̄_old = x̄
# Adaptive ρ update
if primal_residual > μ * dual_residual
ρ *= τ
elseif dual_residual > μ * primal_residual
ρ /= τ
end
end
return x̄
end
x̄ = solve_adaptive_progressive_hedging(subproblems);
iteration primal_residual dual_residual
10 1.0348e-10 3.0000e+00
20 6.7467e-01 6.0000e-02
30 1.1656e+00 1.3010e-02
40 6.5131e-01 1.6583e-02
50 5.1318e-01 1.5611e-02
60 3.8045e-01 1.6417e-02
70 2.3628e-01 1.2217e-02
80 1.4027e-01 7.1245e-03
90 8.3327e-02 1.4727e-03
100 6.3937e-02 1.1929e-03
110 3.7550e-02 2.3153e-03
120 2.1968e-02 2.5947e-03
130 1.2693e-02 3.1577e-03
140 9.5116e-03 3.1448e-03
150 7.1630e-03 2.7306e-03
160 4.0891e-03 1.9131e-03
170 2.3176e-03 1.3008e-03
180 1.2957e-03 1.1368e-03
190 9.4812e-04 9.4751e-04
200 6.9116e-04 5.9868e-04
210 3.7879e-04 3.8196e-04
220 2.0506e-04 2.4104e-04
230 1.4085e-04 1.9668e-04
240 1.0011e-04 1.5558e-04
The consensus first-stage decision is:
x̄
199.06455693884953
Try tuning the values of τ and μ. Can you get the algorithm to converge in fewer iterations?