Polynomial regression#
An obvious way to enhance the simple regression model (9) is to add higher powers of the predictor \(\boldsymbol x\). For example, consider quadratic regression
\[
    y = w_0 + w_1 x + w_2 x^2.
\]
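Now the model has three parameters \(\boldsymbol w = (w_0, w_1, w_2)\), which can again be fitted by minimizing the MSE:
\[
    \mathcal L(\boldsymbol w) = \frac 1n \sum\limits_{i=1}^n \big(y_i - w_0 - w_1 x_i - w_2 x_i^2\big)^2 \to \min\limits_{\boldsymbol w}.
\]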
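As a minimal numerical sketch of this fit, the snippet below minimizes the squared error for a quadratic model with `np.linalg.lstsq`. The data are synthetic and the true coefficients \((1, -2, 0.5)\) are an assumption chosen purely for illustration; they are not taken from any dataset in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 1 - 2x + 0.5x^2 + noise
x = rng.uniform(-3, 3, size=200)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Columns 1, x, x^2, so that predictions are A @ w with w = (w_0, w_1, w_2)
A = np.column_stack([np.ones_like(x), x, x**2])

# lstsq minimizes ||A w - y||^2, which has the same minimizer as the MSE
w, *_ = np.linalg.lstsq(A, y, rcond=None)

mse = np.mean((A @ w - y) ** 2)
print("w:", w)
print("MSE:", mse)
```

The estimated `w` should land close to the coefficients used to generate the data, and the MSE close to the noise variance.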
Revisit Boston dataset#
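The data look quite suitable for a quadratic regression. Let’s do some simple feature engineering and add a new feature: the squares of the predictor. Now the design matrix has two columns:
\[
    \boldsymbol X = \begin{pmatrix}
        x_1 & x_1^2 \\
        x_2 & x_2^2 \\
        \vdots & \vdots \\
        x_n & x_n^2
    \end{pmatrix}.
\]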
To fit the linear regression on the new dataset, once again use the sklearn library:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
# Load the Boston data: predictor lstat, target medv
boston = pd.read_csv("../ISLP_datsets/Boston.csv")
x = boston['lstat']
y = boston['medv']
LR = LinearRegression()
# Design matrix with two columns: lstat and lstat squared
x_reshaped = x.values.reshape(-1, 1)
x_train = np.hstack([x_reshaped, x_reshaped**2])
LR.fit(x_train, y)
print("intercept:", LR.intercept_)
print("coefficients:", LR.coef_)
# LinearRegression.score returns the coefficient of determination R^2
print("r-score:", LR.score(x_train, y))
print("MSE:", np.mean((LR.predict(x_train) - y) ** 2))
intercept: 42.86200732816936
coefficients: [-2.3328211 0.04354689]
r-score: 0.6407168971636612
MSE: 30.330520075853713
Both metrics have improved over the simple linear fit. Now plot the fitted parabola over the data:
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'svg'
plt.scatter(x, y, s=10, c='b', alpha=0.7)
xs = np.linspace(x.min(), x.max(), num=100)
plt.plot(xs, LR.intercept_ + LR.coef_[0]*xs + LR.coef_[1]*xs**2, c='r', lw=2)
plt.xlabel("lstat")
plt.ylabel("medv")
plt.grid(ls=":");
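The manual stacking of \(x\) and \(x^2\) can also be delegated to scikit-learn. The sketch below compares the two approaches using `PolynomialFeatures` inside a pipeline. It runs on synthetic data that only loosely imitates the `lstat`/`medv` relationship (an assumption, so that the snippet does not depend on the CSV file), and the numbers it prints are therefore not the Boston results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)

# Synthetic stand-in for (lstat, medv); NOT the Boston data
x = rng.uniform(2, 38, size=300)
y = 43.0 - 2.3 * x + 0.04 * x**2 + rng.normal(scale=5.0, size=x.size)
X = x.reshape(-1, 1)

# Manual feature engineering: columns x and x^2
X_manual = np.hstack([X, X**2])
lr_manual = LinearRegression().fit(X_manual, y)

# Same features generated by PolynomialFeatures; include_bias=False because
# LinearRegression already fits the intercept on its own
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
poly_model.fit(X, y)
lr_pipe = poly_model.named_steps["linearregression"]

print("manual:  ", lr_manual.intercept_, lr_manual.coef_)
print("pipeline:", lr_pipe.intercept_, lr_pipe.coef_)
```

Both fits should give the same intercept and coefficients. The pipeline version has the practical advantage that changing the degree is a one-argument edit.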
General case#
Of course, the degree of the polynomial can be any number \(m\in\mathbb N\). The polynomial regression model is
\[
    y = w_0 + w_1 x + w_2 x^2 + \ldots + w_m x^m = \sum\limits_{k=0}^m w_k x^k.
\]
Q. How many parameters does this model have?
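With the MSE loss, the model is fitted by minimizing the function
\[
    \mathcal L(\boldsymbol w) = \frac 1n \sum\limits_{i=1}^n \Big(y_i - \sum\limits_{k=0}^m w_k x_i^k\Big)^2.
\]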
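To get a feeling for the role of the degree, the following sketch fits polynomials of several degrees with `numpy.polynomial.Polynomial.fit` and prints the training MSE of each. The target \(\sin 3x\) and the noise level are assumptions made up for this illustration. On the training set the MSE cannot increase as the degree grows, because every polynomial of degree \(m\) is also a polynomial of degree \(m+1\); this says nothing about the error on new data.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)

# Synthetic data (an assumption for illustration): noisy samples of sin(3x)
x = rng.uniform(-1, 1, size=100)
y = np.sin(3 * x) + rng.normal(scale=0.1, size=x.size)

train_mse = {}
for m in range(1, 7):
    # Least-squares fit of a polynomial of degree m
    p = Polynomial.fit(x, y, deg=m)
    train_mse[m] = np.mean((p(x) - y) ** 2)
    print(f"degree {m}: training MSE = {train_mse[m]:.4f}")
```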
Exercises#
Find the analytical solution of the optimization problem (15).
Find the feature matrix of the polynomial regression model (16).
TODO
Provide examples of datasets where increasing the degree is beneficial
Make a connection with the Runge example from the first chapter
Add some quizzes
Add more datasets (maybe even simulated)
Think about cases where the performance of a polynomial model of any degree is poor