Linear regression loss

Loss quantifies the penalty a model incurs when it makes a prediction. In simple terms, it is the difference between the observed and the predicted value, L(y, \hat{y}) = y - \hat{y}.

For linear models we could add up the loss of each prediction to get an overall loss value. However, this is not good practice, since positive errors (where the prediction undershoots the observed value) and negative errors (where the prediction overshoots the observed value) can offset each other and give us an artificially low loss, as the short sketch below shows.
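
A minimal NumPy sketch of this cancellation; the arrays are made-up numbers chosen purely for illustration:

```python
import numpy as np

# Toy numbers: one prediction overshoots by 2, another undershoots by 2,
# so the raw errors cancel out when summed.
y = np.array([3.0, 5.0, 7.0])      # observed values
y_hat = np.array([5.0, 5.0, 5.0])  # predictions from a hypothetical model

errors = y - y_hat                 # [-2.  0.  2.]
print(errors.sum())                # 0.0, even though two predictions are off by 2
```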

To prevent errors from cancelling each other out, we can take the absolute value of each error (L_1 loss) or square it (L_2 loss).

(1)   \begin{equation*} L_1(y, \hat{y})=\sum |y - \hat{y}|\end{equation*}

(2)   \begin{equation*} L_2(y, \hat{y})=\sum (y - \hat{y})^2\end{equation*}
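
One way to implement equations (1) and (2) in NumPy; the function names and the toy arrays (reused from the sketch above) are mine, not part of any library:

```python
import numpy as np

def l1_loss(y, y_hat):
    """L1 loss, equation (1): sum of absolute errors."""
    return np.sum(np.abs(y - y_hat))

def l2_loss(y, y_hat):
    """L2 loss, equation (2): sum of squared errors."""
    return np.sum((y - y_hat) ** 2)

y = np.array([3.0, 5.0, 7.0])
y_hat = np.array([5.0, 5.0, 5.0])
print(l1_loss(y, y_hat))  # 4.0, the errors no longer cancel
print(l2_loss(y, y_hat))  # 8.0, squaring penalizes large errors more heavily
```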

However, these sums are not very descriptive on their own: they grow with the size of the dataset, so they do not tell us how good or bad the model really is. To fix this, we average the loss by dividing it by the number of observations N.

(3)   \begin{equation*} \text{Mean Absolute Error (MAE)} = \frac{1}{N} \sum |y - \hat{y}|\end{equation*}

(4)   \begin{equation*} \text{Mean Squared Error (MSE)} =\frac{1}{N} \sum (y - \hat{y})^2\end{equation*}
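
A sketch of equations (3) and (4), again with illustrative arrays; the `mae` and `mse` helpers are just names I chose:

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error, equation (3): average absolute error."""
    return np.mean(np.abs(y - y_hat))

def mse(y, y_hat):
    """Mean Squared Error, equation (4): average squared error."""
    return np.mean((y - y_hat) ** 2)

y = np.array([3.0, 5.0, 7.0])
y_hat = np.array([5.0, 5.0, 5.0])
print(mae(y, y_hat))  # 1.333..., in the same units as y
print(mse(y, y_hat))  # 2.666..., in squared units of y
```

For real projects, scikit-learn ships equivalent functions, mean_absolute_error and mean_squared_error, in sklearn.metrics.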