Ridge and Lasso Regression
When a linear model such as ordinary least squares (OLS) Linear Regression starts to show signs of overfitting, we have to think about generalization.
One way to improve generalization is Regularization.
The chart above depicts the linear relation between Work Experience and Salary.
Here, Lambda (λ) is the Penalty/Regularization Parameter.
As we increase the penalty, the slope of the line decreases. The fitted relation becomes more regularized, which helps keep the model from overfitting on a particular independent variable, in this case Work Experience.
The penalty grows with the size of a feature's slope (coefficient): the larger the coefficient, the larger the penalty. Minimizing the total cost therefore pushes the slope down, and with it the impact of that feature on the predicted target variable.
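To make the shrinking slope concrete, here is a minimal sketch for a single feature. The numbers are hypothetical toy data (not taken from the chart), and the intercept is assumed to be left unpenalized, in which case the penalized slope has the closed form S_xy / (S_xx + λ).

```python
# Hypothetical toy data: years of work experience vs. salary (in thousands).
experience = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
salary = [32.0, 41.0, 44.0, 58.0, 61.0, 73.0]


def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b - w*x)^2) + lam * w^2, intercept b unpenalized."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    # Centered cross-product and centered sum of squares.
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    # The penalty lam is added to the denominator, which shrinks the slope.
    return s_xy / (s_xx + lam)


lambdas = (0.0, 1.0, 10.0, 100.0)
slopes = [ridge_slope(experience, salary, lam) for lam in lambdas]
for lam, w in zip(lambdas, slopes):
    print(f"lambda={lam:>6}: slope={w:.3f}")
```

With λ = 0 the function returns the ordinary least squares slope; each larger λ yields a smaller slope, which is the flattening of the line described above.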
- Regularization Types:
- L2 Regularization (Ridge)
- L1 Regularization (Lasso)
Ridge regression, also known as regression with an L2 penalty, adds the squared L2 norm of the weight vector* to the cost function as a penalty term.
Total Cost Function = $\mathrm{RSS}(W) + \lambda \lVert W \rVert_2^2$
RSS (residual sum of squares) = SSR (sum of squared residuals) = SSE (sum of squared errors)
Ridge reduces to ordinary Linear Regression when λ = 0.
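The Ridge cost above can be sketched in a few lines of plain Python. The helper names `rss` and `ridge_cost` and the small data set are assumptions made for illustration; the intercept is passed separately and is not penalized.

```python
def rss(x_rows, y, weights, intercept=0.0):
    """Residual sum of squares for a linear model."""
    total = 0.0
    for row, target in zip(x_rows, y):
        prediction = intercept + sum(w * v for w, v in zip(weights, row))
        total += (target - prediction) ** 2
    return total


def ridge_cost(x_rows, y, weights, lam, intercept=0.0):
    """RSS(W) + lam * ||W||_2^2 (squared L2 norm of the weights)."""
    l2_squared = sum(w ** 2 for w in weights)
    return rss(x_rows, y, weights, intercept) + lam * l2_squared


# Hypothetical data: three rows, two features each.
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
y = [5.0, 4.0, 7.0]
W = [2.0, 1.0]

plain = rss(X, y, W)
penalized = ridge_cost(X, y, W, lam=0.5)
unpenalized = ridge_cost(X, y, W, lam=0.0)
print(plain, penalized, unpenalized)
```

Setting `lam=0.0` makes the penalty term vanish, so the Ridge cost equals the plain RSS that Linear Regression minimizes.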
Lasso regression, also known as regression with an L1 penalty, adds the L1 norm of the weight vector* to the cost function as a penalty term.
Total Cost Function = $\mathrm{RSS}(W) + \lambda \lVert W \rVert_1$
RSS (residual sum of squares) = SSR (sum of squared residuals) = SSE (sum of squared errors)
Lasso reduces to ordinary Linear Regression when λ = 0.
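For a single feature, the Lasso cost above can be minimized by hand, which gives a compact way to see the effect of λ. The sketch below assumes an unpenalized intercept and uses hypothetical toy data; the resulting rule is usually called soft-thresholding.

```python
# Hypothetical toy data: years of work experience vs. salary (in thousands).
experience = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
salary = [32.0, 41.0, 44.0, 58.0, 61.0, 73.0]


def lasso_slope(x, y, lam):
    """Slope minimizing sum((y - b - w*x)^2) + lam * |w| for one feature."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    # Soft-thresholding: shrink |s_xy| by lam / 2 and clip at zero.
    shrunk = max(abs(s_xy) - lam / 2.0, 0.0)
    sign = 1.0 if s_xy >= 0 else -1.0
    return sign * shrunk / s_xx


ols_like = lasso_slope(experience, salary, lam=0.0)
moderate = lasso_slope(experience, salary, lam=100.0)
heavy = lasso_slope(experience, salary, lam=300.0)
print(ols_like, moderate, heavy)
```

With `lam=0.0` the result is the ordinary least squares slope. Unlike the squared L2 penalty, a large enough L1 penalty sets the slope to exactly zero rather than merely making it small.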
ElasticNet regression combines the Ridge (L2) and Lasso (L1) penalties in a single cost function.
Total Cost Function = $\mathrm{RSS}(W) + \text{l1\_ratio} \cdot \lambda \lVert W \rVert_1 + 0.5 \cdot (1 - \text{l1\_ratio}) \cdot \lambda \lVert W \rVert_2^2$
RSS (residual sum of squares) = SSR (sum of squared residuals) = SSE (sum of squared errors)
- ElasticNet keeps only the Ridge (L2) penalty when l1_ratio = 0
- ElasticNet keeps only the Lasso (L1) penalty when l1_ratio = 1
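The two limiting cases can be checked numerically. This is a small sketch of the ElasticNet penalty term exactly as written in the formula above; the function name `elastic_net_penalty` and the example weights are hypothetical.

```python
def elastic_net_penalty(weights, lam, l1_ratio):
    """l1_ratio * lam * ||W||_1 + 0.5 * (1 - l1_ratio) * lam * ||W||_2^2."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w ** 2 for w in weights)
    return l1_ratio * lam * l1 + 0.5 * (1.0 - l1_ratio) * lam * l2_squared


# Hypothetical weight vector and penalty strength.
W = [2.0, -1.0, 0.5]
lam = 0.4

only_l1 = elastic_net_penalty(W, lam, l1_ratio=1.0)  # Lasso-style penalty
only_l2 = elastic_net_penalty(W, lam, l1_ratio=0.0)  # Ridge-style penalty
mixed = elastic_net_penalty(W, lam, l1_ratio=0.5)    # blend of both
print(only_l1, only_l2, mixed)
```

Note that with `l1_ratio=0.0` the remaining L2 term still carries the factor 0.5, so under this formula it corresponds to a Ridge penalty with strength λ/2.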
*The weight vector W collects the weights (slopes) of all the independent variables.
