Progressive Hedging
This tutorial was generated using Literate.jl. Download the source as a .jl file.
The purpose of this tutorial is to demonstrate the Progressive Hedging algorithm. It may be helpful to read Two-stage stochastic programs first.
Required packages
This tutorial requires the following packages:
using JuMP
import Distributions
import Ipopt
import Printf
Background
Progressive Hedging (PH) is a popular decomposition algorithm for stochastic programming. It decomposes a stochastic problem into scenario subproblems that are solved iteratively, with penalty terms driving solutions toward consensus.
In Progressive Hedging, each scenario subproblem includes a quadratic penalty term:
\[\min\limits_{x_s}: f_s(x_s) + \frac{\rho}{2} ||x_s - \bar{x}||^2 + w_s^\top x_s\]
where:
- $x_s$ is the primal variable in scenario $s$
- $f_s(x)$ is the original scenario objective
- $\rho$ is the penalty parameter
- $\bar{x}$ is the current consensus (average) solution
- $w_s$ is the dual price (Lagrangian multiplier) in scenario $s$
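To make the penalized objective concrete, here is a scalar toy evaluation (the function `penalized` and the stand-in cost $f_s(x) = 2x$ are our own illustrative choices, not part of the tutorial's model):

```julia
# Hypothetical scalar example of the penalized scenario objective.
# f_s(x) = 2x is a stand-in for the true scenario cost f_s.
penalized(x, x̄, w; ρ = 1.0) = 2x + ρ / 2 * (x - x̄)^2 + w * x

penalized(1.0, 0.0, 0.0)           # no penalty terms active yet: 2.0 + 0.5 = 2.5
penalized(2.0, 1.0, 0.5; ρ = 2.0)  # 4.0 + 1.0 + 1.0 = 6.0
```

The quadratic term vanishes when $x = \bar{x}$, so at consensus only the original cost and the dual price remain.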
Progressive Hedging is an iterative algorithm. In each iteration, it solves all the penalized scenario subproblems, then it applies two updates:
- $\bar{x} = \mathbb{E}_s[x_s]$
- $w_s = w_s + \rho (x_s - \bar{x})$
The algorithm terminates when $|x_s - \bar{x}| \le \varepsilon$ for all scenarios (the primal residual) and $\bar{x}$ has not changed by more than $\varepsilon$ between iterations (the dual residual).
$\rho$ can optionally be updated between iterations. How best to do so is an open question, and there is a large literature on different update strategies.
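On toy scalar data, one round of the two updates above looks like this (the scenario values are illustrative, not from the tutorial's model):

```julia
# Toy illustration of the PH updates for three equally likely scenarios.
ρ = 1.0
x = [1.0, 2.0, 3.0]   # scenario solutions x_s
p = fill(1 / 3, 3)    # scenario probabilities
w = zeros(3)          # dual prices

x̄ = sum(p .* x)       # consensus update: E[x_s] = 2.0
w .+= ρ .* (x .- x̄)   # dual update: w = [-1.0, 0.0, 1.0]
```

Note that the probability-weighted dual prices sum to zero after the update; this invariant holds at every PH iteration.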
In this tutorial we use parameters for $\rho$, $w$, and $\bar{x}$ to efficiently modify each scenario's subproblem between PH iterations.
Building a single scenario
The building block of Progressive Hedging is a separate JuMP model for each scenario. Here's an example, using the problem from Two-stage stochastic programs:
function build_subproblem(; demand::Float64)
model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, x >= 0)
@variable(model, 0 <= y <= demand)
@constraint(model, y <= x)
@variable(model, ρ in Parameter(1))
@variable(model, x̄ in Parameter(0))
@variable(model, w in Parameter(0))
@expression(model, f_s, 2 * x - 5 * y + 0.1 * (x - y))
@objective(model, Min, f_s + ρ / 2 * (x - x̄)^2 + w * x)
return model
end
Using the build_subproblem function, we can create one JuMP model for each scenario:
N = 10
demands = rand(Distributions.TriangularDist(150.0, 250.0, 200.0), N);
subproblems = map(demands) do demand
return (; model = build_subproblem(; demand), probability = 1 / N)
end;
The Progressive Hedging loop
We're almost ready for our optimization loop, but first, here's a helpful function for logging:
function print_iteration(iter, args...)
if mod(iter, 10) == 0
f(x) = Printf.@sprintf("%15.4e", x)
println(lpad(iter, 9), " ", join(f.(args), " "))
end
return
end
Now we can implement our algorithm:
function solve_progressive_hedging(
subproblems;
iteration_limit::Int = 400,
atol::Float64 = 1e-4,
ρ::Float64 = 1.0,
)
x̄_old, x̄ = 0.0, 0.0
x, w = zeros(length(subproblems)), zeros(length(subproblems))
println("iteration primal_residual dual_residual")
# For each iteration...
for iter in 1:iteration_limit
# For each subproblem...
for (i, data) in enumerate(subproblems)
# Update the parameters
set_parameter_value(data.model[:ρ], ρ)
set_parameter_value(data.model[:x̄], x̄)
set_parameter_value(data.model[:w], w[i])
# Solve the subproblem
optimize!(data.model)
assert_is_solved_and_feasible(data.model)
# Store the primal solution
x[i] = value(data.model[:x])
end
# Compute the consensus solution for the first-stage variables
x̄ = sum(s.probability * x_s for (s, x_s) in zip(subproblems, x))
# Compute the primal and dual residuals
primal_residual = maximum(abs, x_s - x̄ for x_s in x)
dual_residual = ρ * abs(x̄ - x̄_old)
print_iteration(iter, primal_residual, dual_residual)
# Check for convergence
if primal_residual < atol && dual_residual < atol
break
end
# Update
x̄_old = x̄
w .+= ρ .* (x .- x̄)
end
return x̄
end
x̄ = solve_progressive_hedging(subproblems);
iteration primal_residual dual_residual
10 6.0396e-14 3.0000e+00
20 1.1369e-13 3.0000e+00
30 2.1316e-13 3.0000e+00
40 8.3844e-13 3.0000e+00
50 3.4959e-12 3.0000e+00
60 5.6701e-11 3.0000e+00
70 1.3299e+00 1.9475e+00
80 1.8384e+00 4.5000e-01
90 3.2751e-01 6.5166e-02
100 9.1352e-02 4.5079e-02
110 6.9887e-02 9.2377e-03
120 4.8401e-02 4.3812e-03
130 4.4359e-02 1.2097e-03
140 9.1980e-03 9.2719e-03
150 6.6613e-03 5.3219e-03
160 4.6371e-03 3.0341e-03
170 3.1379e-03 1.7172e-03
180 2.0783e-03 9.6423e-04
190 1.3532e-03 5.3670e-04
200 8.6885e-04 2.9580e-04
210 5.5131e-04 1.6119e-04
      220      3.4626e-04      8.6680e-05
The consensus first-stage decision is:
x̄
217.74810386692195
Progressive Hedging with an adaptive penalty parameter
You can also make the penalty parameter $\rho$ adaptive. How best to do so is an open question, and there is a large literature on different update strategies. One approach is to increase $\rho$ if the primal residual is much larger than the dual residual, and to decrease $\rho$ if the dual residual is much larger than the primal residual.
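The residual-balancing rule just described can be isolated as a small pure function before embedding it in the full loop (the name `update_ρ` and the default values of τ and μ here are our own choices):

```julia
# Residual-balancing update: grow ρ when the primal residual dominates,
# shrink it when the dual residual dominates, otherwise leave it unchanged.
function update_ρ(ρ, primal_residual, dual_residual; τ = 1.3, μ = 15.0)
    if primal_residual > μ * dual_residual
        return ρ * τ
    elseif dual_residual > μ * primal_residual
        return ρ / τ
    end
    return ρ
end

update_ρ(1.0, 1.0, 1e-3)  # primal residual dominates → 1.3
update_ρ(1.0, 1e-3, 1.0)  # dual residual dominates   → 1 / 1.3 ≈ 0.769
update_ρ(1.0, 1.0, 1.0)   # balanced                  → 1.0
```

Keeping the rule pure like this makes it easy to unit test and to swap for other strategies from the literature.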
function solve_adaptive_progressive_hedging(
subproblems;
iteration_limit::Int = 400,
atol::Float64 = 1e-4,
ρ::Float64 = 1.0,
τ::Float64 = 1.3,
μ::Float64 = 15.0,
)
x̄_old, x̄ = 0.0, 0.0
x, w = zeros(length(subproblems)), zeros(length(subproblems))
println("iteration primal_residual dual_residual")
for iter in 1:iteration_limit
for (i, data) in enumerate(subproblems)
set_parameter_value(data.model[:ρ], ρ)
set_parameter_value(data.model[:x̄], x̄)
set_parameter_value(data.model[:w], w[i])
optimize!(data.model)
assert_is_solved_and_feasible(data.model)
x[i] = value(data.model[:x])
end
x̄ = sum(s.probability * x_s for (s, x_s) in zip(subproblems, x))
primal_residual = maximum(abs, x_s - x̄ for x_s in x)
dual_residual = ρ * abs(x̄ - x̄_old)
print_iteration(iter, primal_residual, dual_residual)
if primal_residual < atol && dual_residual < atol
break
end
w .+= ρ .* (x .- x̄)
x̄_old = x̄
# Adaptive ρ update
if primal_residual > μ * dual_residual
ρ *= τ
elseif dual_residual > μ * primal_residual
ρ /= τ
end
end
return x̄
end
x̄ = solve_adaptive_progressive_hedging(subproblems);
iteration primal_residual dual_residual
10 5.1202e-11 3.0000e+00
20 5.8061e-01 7.8260e-02
30 2.2247e-01 5.0624e-02
40 9.3982e-02 2.4035e-02
50 6.3957e-02 9.4033e-03
60 4.5687e-02 8.1923e-03
70 1.0570e-03 3.6890e-02
80 1.0565e-03 2.8298e-02
90 1.9532e-03 1.6617e-02
100 1.9316e-03 9.7003e-03
110 2.1736e-03 5.6042e-03
120 2.0751e-03 4.2051e-03
130 1.4299e-03 3.1477e-03
140 9.9005e-04 1.7924e-03
150 6.6724e-04 1.0131e-03
160 5.8005e-04 7.3284e-04
170 4.7402e-04 5.3957e-04
180 3.0447e-04 2.9748e-04
190 1.9326e-04 1.6217e-04
200 1.5894e-04 8.6144e-05
      210      1.2752e-04      6.0165e-05
The consensus first-stage decision is:
x̄
217.74799866650977
Try tuning the values of τ and μ. Can you get the algorithm to converge in fewer iterations?