Sensitivity Analysis of Ridge Regression
This example illustrates the sensitivity analysis of data points in a Ridge Regression problem. The general form of the problem is given below:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & \sum_{i=1}^{N} (y_{i} - w x_{i} - b)^2 + \alpha w^2 \\ \end{array} \end{split}\]
where
- $w$, $b$ are the slope and intercept of the regression line,
- $x$, $y$ are the $N$ data points,
- $\alpha$ is the regularization constant.
which is equivalent to:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & e^{\top}e + \alpha (w^2) \\ \mbox{s.t.} & e_{i} = y_{i} - w x_{i} - b \quad \quad i=1..N \\ \end{array} \end{split}\]
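For a single regressor, this problem also has a closed-form minimizer, which is useful as a sanity check on the solver's answer. Setting the gradient of the $1/N$-scaled objective used in the code below to zero (a standard derivation, with $\bar{x}$, $\bar{y}$ denoting the sample means) yields:

\[\begin{split} \begin{array} {ll} \hat{w} = \dfrac{\frac{1}{N}\sum_{i} x_i y_i - \bar{x} \bar{y}}{\frac{1}{N}\sum_{i} x_i^2 - \bar{x}^2 + \alpha}, \quad \quad \hat{b} = \bar{y} - \hat{w} \bar{x} \\ \end{array} \end{split}\]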
This tutorial uses the following packages
using JuMP
import DiffOpt
import Random
import Ipopt
import Plots
using LinearAlgebra: dot
Define and solve the problem
Construct a set of noisy (Gaussian) data points around a line.
Random.seed!(42)
N = 150
w = 2 * abs(randn())
b = rand()
X = randn(N)
Y = w * X .+ b + 0.8 * randn(N);
The helper method fit_ridge
defines and solves the corresponding model. The ridge regression is modeled as a quadratic program (quadratic objective and linear constraints) and solved with the generic methods of Ipopt. This is not the standard way of solving a ridge regression problem; it is done here for didactic purposes.
function fit_ridge(X, Y, alpha = 0.1)
N = length(Y)
# Initialize a JuMP Model with Ipopt solver
model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
set_silent(model)
@variable(model, w) # angular coefficient
@variable(model, b) # linear coefficient
# expression defining approximation error
@expression(model, e[i = 1:N], Y[i] - w * X[i] - b)
# objective minimizing squared error and ridge penalty
@objective(model, Min, 1 / N * dot(e, e) + alpha * (w^2))
optimize!(model)
return model, w, b # return model & variables
end
fit_ridge (generic function with 2 methods)
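As a sanity check, the solver's answer can be compared against the closed-form minimizer of the same $1/N$-scaled objective. The following sketch is not part of the original example (the helper name `ridge_closed_form` is illustrative) and needs only the Statistics standard library:

```julia
using Statistics: mean

# Closed-form minimizer of (1/N) * sum((Y .- w .* X .- b) .^ 2) + alpha * w^2,
# obtained by setting the gradient with respect to (w, b) to zero.
function ridge_closed_form(X, Y, alpha = 0.1)
    x̄, ȳ = mean(X), mean(Y)
    ŵ = (mean(X .* Y) - x̄ * ȳ) / (mean(X .^ 2) - x̄^2 + alpha)
    b̂ = ȳ - ŵ * x̄
    return ŵ, b̂
end

# With alpha = 0 and noiseless data, the exact line is recovered;
# a positive alpha shrinks the slope toward zero.
ridge_closed_form([1.0, 2.0, 3.0], [3.0, 5.0, 7.0], 0.0)
```

`value(w)` and `value(b)` from fit_ridge should agree with this up to solver tolerance.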
Plot the data points and the fitted line for different alpha values
p = Plots.scatter(X, Y; label = nothing, legend = :topleft)
mi, ma = minimum(X), maximum(X)
Plots.title!("Fitted lines and points")
for alpha in 0.5:0.5:1.5
local model, w, b = fit_ridge(X, Y, alpha)
ŵ = value(w)
b̂ = value(b)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
label = "alpha=$alpha",
width = 2,
)
end
p
Differentiate
Now that we've solved the problem, we can compute the sensitivity of the optimal slope $w$ with respect to perturbations in the data points $(x_i, y_i)$.
alpha = 0.4
model, w, b = fit_ridge(X, Y, alpha)
ŵ = value(w)
b̂ = value(b)
0.34840120767054344
We first compute the sensitivity of the slope with respect to a perturbation of the independent variable $x$.

Recall that each point $(x_i, y_i)$ enters the objective function through the term $(y_i - b - w x_i)^2$, so the DiffOpt.ForwardObjectiveFunction
attribute must be set accordingly. When considering the perturbation of a parameter $\theta$, DiffOpt.ForwardObjectiveFunction()
takes the derivative of the objective with respect to $\theta$: if $\theta$ appears in quadratic and linear form as $\theta^2 a x + \theta b y$, then the expression to pass to ForwardObjectiveFunction
is $2\theta a x + b y$.
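Applying this rule to the term $(y_i - b - w x_i)^2$ and grouping by powers of the perturbed parameter gives the two expressions used in the loop below (a routine expansion; terms constant in the parameter differentiate to zero):

\[\begin{split} \begin{array} {ll} \theta = x_i: & (y_i - b - w x_i)^2 = w^2 x_i^2 - 2 w (y_i - b) x_i + (y_i - b)^2 \;\Rightarrow\; 2 w^2 x_i + 2 b w - 2 w y_i, \\ \theta = y_i: & (y_i - b - w x_i)^2 = y_i^2 - 2 (b + w x_i) y_i + (b + w x_i)^2 \;\Rightarrow\; 2 y_i - 2 b - 2 w x_i. \\ \end{array} \end{split}\]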
Sensitivity with respect to x and y
∇y = zero(X)
∇x = zero(X)
for i in 1:N
    # derivative of the objective with respect to a perturbation of x_i
    MOI.set(
        model,
        DiffOpt.ForwardObjectiveFunction(),
        2w^2 * X[i] + 2b * w - 2 * w * Y[i],
    )
    DiffOpt.forward_differentiate!(model)
    ∇x[i] = MOI.get(model, DiffOpt.ForwardVariablePrimal(), w)
    # derivative of the objective with respect to a perturbation of y_i
    MOI.set(model, DiffOpt.ForwardObjectiveFunction(), (2Y[i] - 2b - 2w * X[i]))
    DiffOpt.forward_differentiate!(model)
    ∇y[i] = MOI.get(model, DiffOpt.ForwardVariablePrimal(), w)
end
Visualize the sensitivity of the slope with respect to each data point.
p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇x],
markersize = [5 * abs(dw) + 1.2 for dw in ∇x],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to x")
p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇y],
markersize = [5 * abs(dw) + 1.2 for dw in ∇y],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to y")
Note that points whose $x$ values lie farther from the center induce a greater sensitivity of the slope with respect to $y$.
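This pattern can also be derived analytically: differentiating the closed-form ridge slope $\hat{w} = (\frac{1}{N}\sum_j x_j y_j - \bar{x}\bar{y}) / (\frac{1}{N}\sum_j x_j^2 - \bar{x}^2 + \alpha)$ with respect to $y_i$ gives $(x_i - \bar{x}) / \big(N (\frac{1}{N}\sum_j x_j^2 - \bar{x}^2 + \alpha)\big)$, which is proportional to the distance of $x_i$ from the mean. A minimal self-contained sketch (the function name is illustrative, not part of the original example):

```julia
using Statistics: mean

# Analytic sensitivity of the closed-form ridge slope to each y_i:
# ∂ŵ/∂y_i = (x_i - x̄) / (N * (mean(x .^ 2) - x̄^2 + α))
function slope_sensitivity_y(X, alpha)
    N, x̄ = length(X), mean(X)
    D = mean(X .^ 2) - x̄^2 + alpha
    return (X .- x̄) ./ (N * D)
end

s = slope_sensitivity_y([-2.0, -0.5, 0.0, 0.5, 2.0], 0.4)
```

The sensitivities are antisymmetric around $\bar{x}$ and largest in magnitude at the extremes, matching the plot above (up to the uniform scaling of the perturbation expressions).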
This page was generated using Literate.jl.