Linear regression loss

Loss measures how far a model's prediction is from the observed value. In its simplest form, it is the difference between the observed and predicted values: $L(y, \hat{y}) = y - \hat{y}$.

For linear models we could add up the loss for each prediction to find the overall loss value. However, this is not good practice: with $L(y, \hat{y}) = y - \hat{y}$, positive errors (where predictions undershoot the observed value) and negative errors (where predictions overshoot the observed value) can offset each other and give a misleadingly low total loss.
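To make the cancellation concrete, here is a minimal sketch in plain Python. The `observed` and `predicted` values are made-up toy numbers, not data from any real model: two of the three predictions are off by 2, yet the summed signed loss is zero.

```python
# Hypothetical toy data, chosen only to illustrate the cancellation.
observed = [3.0, 5.0, 7.0]
predicted = [5.0, 3.0, 7.0]

# Signed loss per prediction: L(y, y_hat) = y - y_hat
signed_errors = [y - y_hat for y, y_hat in zip(observed, predicted)]

# Errors of opposite sign cancel when summed.
total_signed_loss = sum(signed_errors)

print(signed_errors)      # [-2.0, 2.0, 0.0]
print(total_signed_loss)  # 0.0
```

A total of zero would suggest a perfect fit, even though only one of the three predictions is actually correct.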

To prevent individual losses from offsetting each other, we can sum their absolute values ($L_1$ loss) or their squares ($L_2$ loss).

(1) $L_1(y, \hat{y}) = \sum |y - \hat{y}|$

(2) $L_2(y, \hat{y}) = \sum (y - \hat{y})^2$
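Equations (1) and (2) translate directly into code. The sketch below is one straightforward way to write them in plain Python. The function names `l1_loss` and `l2_loss` and the toy data are illustrative choices, not part of any library.

```python
def l1_loss(observed, predicted):
    """Sum of absolute differences, as in equation (1)."""
    return sum(abs(y - y_hat) for y, y_hat in zip(observed, predicted))


def l2_loss(observed, predicted):
    """Sum of squared differences, as in equation (2)."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))


# Hypothetical toy data: two predictions are off by 2, one is exact.
observed = [3.0, 5.0, 7.0]
predicted = [5.0, 3.0, 7.0]

print(l1_loss(observed, predicted))  # 4.0
print(l2_loss(observed, predicted))  # 8.0
```

Neither total can be pulled back toward zero by errors of opposite sign, and squaring weights the larger errors more heavily than the absolute value does.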

However, these sums are hard to interpret on their own: they grow with the number of observations, so they do not tell us how good or bad our model is. Dividing the total loss by the number of observations $N$ gives the average loss per prediction.

(3) $\text{Mean Absolute Error (MAE)} = \frac{1}{N} \sum |y - \hat{y}|$

(4) $\text{Mean Squared Error (MSE)} = \frac{1}{N} \sum (y - \hat{y})^2$
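As a rough sketch of equations (3) and (4), the averages can be computed by dividing each sum by the number of observations. The function names and the four-point toy dataset below are assumptions made for illustration only.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference, as in equation (3)."""
    n = len(observed)
    return sum(abs(y - y_hat) for y, y_hat in zip(observed, predicted)) / n


def mean_squared_error(observed, predicted):
    """Average squared difference, as in equation (4)."""
    n = len(observed)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / n


# Hypothetical toy data with N = 4 observations.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [5.0, 3.0, 7.0, 9.0]

print(mean_absolute_error(observed, predicted))  # 1.0
print(mean_squared_error(observed, predicted))   # 2.0
```

Because both values are per-observation averages, they can be compared across datasets of different sizes. MAE is in the same units as $y$, while MSE is in squared units.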
