Sensitivity Analysis of Ridge Regression
This example illustrates the sensitivity analysis of data points in a Ridge Regression problem. The general form of the problem is given below:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & \sum_{i=1}^{N} (y_{i} - w x_{i} - b)^2 + \alpha w^2 \\ \end{array} \end{split}\]
where
- `w`, `b` are the slope and intercept of the regression line,
- `x`, `y` are the `N` data points,
- `α` is the regularization constant.
which is equivalent to:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & e^{\top}e + \alpha (w^2) \\ \mbox{s.t.} & e_{i} = y_{i} - w x_{i} - b \quad \quad i=1,\ldots,N \\ \end{array} \end{split}\]
This tutorial uses the following packages:
using JuMP
import DiffOpt
import Random
import Ipopt
import Plots
using LinearAlgebra: dot
Define and solve the problem
Construct a set of noisy (Gaussian) data points around a line.
Random.seed!(42)
N = 150
w_orig = 2 * abs(randn())
b = rand()
X = randn(N)
Y = w_orig * X .+ b + 0.8 * randn(N);
The helper function build_fit_ridge
defines and solves the corresponding model. The ridge regression is modeled as a quadratic program (quadratic objective and linear constraints) and solved with the generic methods of Ipopt. This is not the standard way of solving a ridge regression problem; it is done here for didactic purposes.
function build_fit_ridge(X_data, Y_data, alpha = 0.1)
N = length(Y_data)
# Initialize a JuMP Model with Ipopt solver
# model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer)) # TODO: this is not auto detecting scalar nonlinear function (alpha * w * w)
model = DiffOpt.nonlinear_diff_model(Ipopt.Optimizer)
set_silent(model)
@variable(model, w) # angular coefficient
@variable(model, b) # linear coefficient
@variable(model, α in Parameter(alpha)) # regularization parameter
@variable(model, X[1:N] in Parameter.(X_data))
@variable(model, Y[1:N] in Parameter.(Y_data))
# expression defining approximation error
@expression(model, e[i=1:N], Y[i] - w * X[i] - b)
# objective minimizing squared error and ridge penalty
@objective(model, Min, 1 / N * dot(e, e) + α * (w^2))
optimize!(model)
return model
end
function optimize_fit_ridge!(model, alpha)
set_parameter_value(model[:α], alpha)
optimize!(model)
return model[:w], model[:b]
end
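Because the problem is a small quadratic program, the fit can also be cross-checked against a closed-form solution of the normal equations. The sketch below (the helper name `ridge_closed_form` is ours, not part of the tutorial) uses only LinearAlgebra and matches the `1 / N`-scaled objective used in `build_fit_ridge`:

```julia
using LinearAlgebra

# Closed-form ridge fit from the first-order optimality conditions of
#   minimize (1/N) * e'e + alpha * w^2,  with  e_i = y_i - w * x_i - b.
# Setting the gradient with respect to (w, b) to zero yields a 2x2 system.
function ridge_closed_form(x, y, alpha)
    N = length(y)
    A = [dot(x, x) / N + alpha  sum(x) / N
         sum(x) / N             1.0]
    rhs = [dot(x, y) / N, sum(y) / N]
    w, b = A \ rhs
    return w, b
end
```

For noiseless data on a line and `alpha = 0` this recovers the exact slope and intercept, which makes it a convenient sanity check for the JuMP model.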
optimize_fit_ridge! (generic function with 1 method)
Plot the data points and the fitted line for different alpha values
p = Plots.scatter(X, Y; label = nothing, legend = :topleft)
mi, ma = minimum(X), maximum(X)
Plots.title!("Fitted lines and points")
model = build_fit_ridge(X, Y)
for alpha in 0.5:0.5:1.5
local w, b = optimize_fit_ridge!(model, alpha)
ŵ = value(w)
b̂ = value(b)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
label = "alpha=$alpha",
width = 2,
)
end
p
Differentiate
Now that we've solved the problem, we can compute the sensitivity of the optimal slope w
with respect to perturbations in the data points (x, y).
alpha = 0.4
w, b = optimize_fit_ridge!(model, alpha)
ŵ = value(w)
b̂ = value(b)
0.34840120767054544
Sensitivity with respect to x and y
∇y = zero(X)
∇x = zero(X)
for i in 1:N
# X[i] sensitivity
DiffOpt.empty_input_sensitivities!(model)
DiffOpt.set_forward_parameter(model, model[:X][i], 1.0)
DiffOpt.forward_differentiate!(model)
∇x[i] = DiffOpt.get_forward_variable(model, w)
# Y[i] sensitivity
DiffOpt.empty_input_sensitivities!(model)
DiffOpt.set_forward_parameter(model, model[:Y][i], 1.0)
DiffOpt.forward_differentiate!(model)
∇y[i] = DiffOpt.get_forward_variable(model, w)
end
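The forward sensitivities computed by DiffOpt can be validated analytically: differentiating the 2x2 normal-equation system `A * [w; b] = rhs(y)` of the quadratic program with respect to `y[i]` gives `A * [∂w; ∂b] = [x[i] / N, 1 / N]`. A sketch of this check (the helper name `slope_sensitivity_y` is ours):

```julia
using LinearAlgebra

# Analytical dw*/dy_i, obtained by differentiating the normal equations of
# the (1/N)-scaled ridge objective with respect to y_i.
function slope_sensitivity_y(x, alpha)
    N = length(x)
    A = [dot(x, x) / N + alpha  sum(x) / N
         sum(x) / N             1.0]
    return [(A \ [x[i] / N, 1.0 / N])[1] for i in 1:N]
end
```

These values should agree with `∇y` above up to solver tolerance; they simplify to `(x[i] - mean(x)) / (N * (var + alpha))` with the population variance, which explains the pattern seen in the plots below.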
Visualize the sensitivity of the fitted slope with respect to each data point.
p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇x],
markersize = [5 * abs(dw) + 1.2 for dw in ∇x],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to x")
p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇y],
markersize = [5 * abs(dw) + 1.2 for dw in ∇y],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to y")
Note that points whose x
values are farther from the center induce a greater sensitivity of the slope with respect to y.
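This observation has a closed-form explanation (a derivation sketch, assuming the 1 / N-scaled objective used in the code): differentiating the normal equations of the quadratic program with respect to y_i gives

\[ \frac{\partial w^*}{\partial y_i} = \frac{x_i - \bar{x}}{N \left( \operatorname{Var}(x) + \alpha \right)} \]

so the slope sensitivity grows linearly with the distance of x_i from the mean, and the regularization constant α uniformly damps it.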
This page was generated using Literate.jl.