Sensitivity Analysis of Ridge Regression
This example illustrates the sensitivity analysis of data points in a Ridge Regression problem. The general form of the problem is given below:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & \frac{1}{N} \sum_{i=1}^{N} (y_{i} - w x_{i} - b)^2 + \alpha w^2 \\ \end{array} \end{split}\]
where
- `w`, `b` are the slope and intercept of the regression line
- `x`, `y` are the `N` data points
- `α` is the regularization constant
which is equivalent to:
\[\begin{split} \begin{array} {ll} \mbox{minimize} & \frac{1}{N} e^{\top}e + \alpha w^2 \\ \mbox{s.t.} & e_{i} = y_{i} - w x_{i} - b \quad \quad i = 1, \ldots, N \\ \end{array} \end{split}\]
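Because the objective is a smooth quadratic, setting its gradient to zero yields a closed-form solution, which gives an independent cross-check on the solver later on. A standalone sketch (the helper name and data below are illustrative, and the `1/N` scaling matches the objective used in the implementation further down):

```julia
# Closed-form minimizer of  (1/N) ∑ (yᵢ - w xᵢ - b)² + α w²
# (illustrative helper, not part of the tutorial code)
function ridge_closed_form(x, y, α)
    N = length(x)
    x̄, ȳ = sum(x) / N, sum(y) / N
    w = sum((x .- x̄) .* (y .- ȳ)) / (sum((x .- x̄) .^ 2) + N * α)
    b = ȳ - w * x̄   # intercept from the first-order condition in b
    return w, b
end

x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 1.9, 4.2, 5.8]
α = 0.5
w, b = ridge_closed_form(x, y, α)

# finite-difference check: the objective's gradient vanishes at (w, b)
f(w, b) = sum((y .- w .* x .- b) .^ 2) / length(x) + α * w^2
h = 1e-6
@assert abs((f(w + h, b) - f(w - h, b)) / (2h)) < 1e-4
@assert abs((f(w, b + h) - f(w, b - h)) / (2h)) < 1e-4
```

In the tutorial itself the problem is instead solved numerically with JuMP, which is what makes the sensitivity analysis below generic.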
This tutorial uses the following packages
using JuMP
import DiffOpt
import Random
import Ipopt
import Plots
using LinearAlgebra: dot

Define and solve the problem
Construct a set of noisy (Gaussian) data points around a line.
Random.seed!(42)
N = 150
w_orig = 2 * abs(randn())
b = rand()
X = randn(N)
Y = w_orig * X .+ b + 0.8 * randn(N);

The helper function build_fit_ridge defines and solves the corresponding model. The ridge regression is modeled as a quadratic program (quadratic objective and linear constraints) and solved with the generic methods of Ipopt. This is not the standard way of solving a ridge regression problem; it is done here for didactic purposes.
function build_fit_ridge(X_data, Y_data, alpha = 0.1)
N = length(Y_data)
# Initialize a JuMP Model with Ipopt solver
# model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer)) # TODO: this is not auto detecting scalar nonlinear function (alpha * w * w)
model = DiffOpt.nonlinear_diff_model(Ipopt.Optimizer)
set_silent(model)
@variable(model, w) # angular coefficient
@variable(model, b) # linear coefficient
@variable(model, α in Parameter(alpha)) # regularization parameter
@variable(model, X[1:N] in Parameter.(X_data))
@variable(model, Y[1:N] in Parameter.(Y_data))
# expression defining approximation error
@expression(model, e[i=1:N], Y[i] - w * X[i] - b)
# objective minimizing squared error and ridge penalty
@objective(model, Min, 1 / N * dot(e, e) + α * (w^2))
optimize!(model)
return model
end
function optimize_fit_ridge!(model, alpha)
set_parameter_value(model[:α], alpha)
optimize!(model)
return model[:w], model[:b]
end

Plot the data points and the fitted line for different alpha values
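The closed form of the slope, w = ∑(xᵢ - x̄)(yᵢ - ȳ) / (∑(xᵢ - x̄)² + Nα) under the implementation's `1/N` scaling, also shows what varying alpha does: a larger penalty shrinks the slope toward zero. A quick standalone sketch (the data and helper name are illustrative):

```julia
# Shrinkage of the ridge slope as the penalty α grows (illustrative sketch)
x = collect(0.0:0.5:5.0)
y = 2.0 .* x .+ 1.0   # noiseless line with slope 2 and intercept 1

function ridge_slope(x, y, α)
    N = length(x)
    x̄, ȳ = sum(x) / N, sum(y) / N
    return sum((x .- x̄) .* (y .- ȳ)) / (sum((x .- x̄) .^ 2) + N * α)
end

ws = [ridge_slope(x, y, α) for α in (0.0, 0.5, 1.0, 1.5)]
@assert isapprox(ws[1], 2.0; atol = 1e-10)  # α = 0 recovers the true slope
@assert issorted(ws; rev = true)            # larger α ⇒ smaller slope
```

The fitted lines plotted below show the same effect on the noisy data.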
p = Plots.scatter(X, Y; label = nothing, legend = :topleft)
mi, ma = minimum(X), maximum(X)
Plots.title!("Fitted lines and points")
model = build_fit_ridge(X, Y)
for alpha in 0.5:0.5:1.5
local w, b = optimize_fit_ridge!(model, alpha)
ŵ = value(w)
b̂ = value(b)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
label = "alpha=$alpha",
width = 2,
)
end
p

Differentiate
Now that we've solved the problem, we can compute the sensitivity of the optimal slope w with respect to perturbations in the data points (x, y).
alpha = 0.4
w, b = optimize_fit_ridge!(model, alpha)
ŵ = value(w)
b̂ = value(b)

Sensitivity with respect to x and y
∇y = zero(X)
∇x = zero(X)
for i in 1:N
# X[i] sensitivity
DiffOpt.empty_input_sensitivities!(model)
DiffOpt.set_forward_parameter(model, model[:X][i], 1.0)
DiffOpt.forward_differentiate!(model)
∇x[i] = DiffOpt.get_forward_variable(model, w)
# Y[i] sensitivity
DiffOpt.empty_input_sensitivities!(model)
DiffOpt.set_forward_parameter(model, model[:Y][i], 1.0)
DiffOpt.forward_differentiate!(model)
∇y[i] = DiffOpt.get_forward_variable(model, w)
end

Visualize the point sensitivities with respect to the regression points.
p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇x],
markersize = [5 * abs(dw) + 1.2 for dw in ∇x],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to x")

p = Plots.scatter(
X,
Y;
color = [dw < 0 ? :blue : :red for dw in ∇y],
markersize = [5 * abs(dw) + 1.2 for dw in ∇y],
label = "",
)
mi, ma = minimum(X), maximum(X)
Plots.plot!(
p,
[mi, ma],
[mi * ŵ + b̂, ma * ŵ + b̂];
color = :blue,
label = "",
)
Plots.title!("Regression slope sensitivity with respect to y")

Note that points with less central x values induce a greater y sensitivity of the slope.
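This observation can be checked analytically. With the optimal slope w = ∑(xⱼ - x̄)(yⱼ - ȳ) / (∑(xⱼ - x̄)² + Nα), differentiating with respect to yᵢ gives ∂w/∂yᵢ = (xᵢ - x̄) / (∑(xⱼ - x̄)² + Nα), so the magnitude of the y-sensitivity grows linearly with the distance of xᵢ from the mean. A standalone sketch verifying this against finite differences (the data and helper name are illustrative):

```julia
# Analytic ∂w/∂yᵢ for the slope of  min (1/N) ∑ (yᵢ - w xᵢ - b)² + α w²
function ridge_slope(x, y, α)
    N = length(x)
    x̄, ȳ = sum(x) / N, sum(y) / N
    return sum((x .- x̄) .* (y .- ȳ)) / (sum((x .- x̄) .^ 2) + N * α)
end

x = [0.0, 1.0, 2.0, 4.0]
y = [0.5, 1.2, 2.1, 4.3]
α = 0.4
N = length(x)
x̄ = sum(x) / N
den = sum((x .- x̄) .^ 2) + N * α

h = 1e-6
for i in 1:N
    analytic = (x[i] - x̄) / den   # closed-form sensitivity
    yp = copy(y); yp[i] += h
    ym = copy(y); ym[i] -= h
    fd = (ridge_slope(x, yp, α) - ridge_slope(x, ym, α)) / (2h)
    @assert isapprox(analytic, fd; atol = 1e-6)
end
```

The most off-center point has the largest influence on the fitted slope, matching the marker sizes in the plot above.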
This page was generated using Literate.jl.