Residual Formulas
Mathematical reference for the main residual types used in regression diagnostics, with formulas, interpretation guidelines, and practical applications.
Basic Residuals
Simple Residual
$$e_i = y_i - \hat{y}_i$$
Where:
- $e_i$ = residual for observation $i$
- $y_i$ = observed value for observation $i$
- $\hat{y}_i$ = predicted value for observation $i$
Applications:
- Basic model evaluation
- Error measurement
- Initial diagnostic analysis
- Outlier identification
Residual Sum of Squares (RSS)
$$RSS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Alternative notations:
$$SSE = \sum_{i=1}^{n} e_i^2$$
(Sum of Squared Errors)
Purpose:
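Measures the total variation in the response that the model leaves unexplained. For models with the same number of parameters fitted to the same data, a lower RSS indicates a better fit. In least-squares fits RSS never increases when predictors are added, so on its own it cannot compare models of different sizes.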
Mean Squared Error (MSE)
$$MSE = \frac{RSS}{n-p} = \frac{\sum_{i=1}^{n} e_i^2}{n-p}$$
Where:
- $n$ = number of observations
- $p$ = number of parameters (including intercept)
- $n-p$ = degrees of freedom
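As a minimal sketch of the three quantities above, the following Python snippet fits a straight line by least squares and computes the raw residuals, RSS, and MSE. The NumPy dependency, the function name `basic_residual_stats`, and the five data points are assumptions made for illustration; they are not part of the original reference.

```python
import numpy as np

def basic_residual_stats(X, y):
    """Return raw residuals, RSS, and MSE for an ordinary least-squares fit.

    X is the design matrix and is assumed to already contain the
    intercept column, so p = X.shape[1] counts the intercept.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
    e = y - X @ beta                              # e_i = y_i - y_hat_i
    rss = float(e @ e)                            # RSS = sum of e_i^2
    n, p = X.shape
    mse = rss / (n - p)                           # MSE = RSS / (n - p)
    return e, rss, mse

# Illustrative data (made up for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])

e, rss, mse = basic_residual_stats(X, y)
print(e, rss, mse)
```

For these made-up data the fitted line is $\hat{y} = 0.05 + 1.99x$, which gives $RSS = 0.107$ and $MSE = 0.107 / 3 \approx 0.0357$.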
Standardized Residuals
Standardized Residual
$$r_i = \frac{e_i}{s}$$
Where:
- $r_i$ = standardized residual for observation $i$
- $e_i$ = raw residual for observation $i$
- $s$ = residual standard error = $\sqrt{MSE}$
Interpretation Guidelines:
Standardized Residual | Interpretation |
---|---|
$|r_i| < 2$ | Normal observation |
$2 \leq |r_i| < 3$ | Potential outlier |
$|r_i| \geq 3$ | Likely outlier |
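The guideline bands above can be applied mechanically. The sketch below is one possible way to do it in Python, assuming NumPy and a design matrix that already includes the intercept column. The function names `standardized_residuals` and `classify`, and the data with one deliberately shifted point, are hypothetical.

```python
import numpy as np

def standardized_residuals(X, y):
    """r_i = e_i / s, with s = sqrt(RSS / (n - p))."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    n, p = X.shape
    s = np.sqrt(e @ e / (n - p))  # residual standard error
    return e / s

def classify(r):
    """Label residuals using the bands |r| < 2, 2 <= |r| < 3, |r| >= 3."""
    a = np.abs(np.asarray(r, dtype=float))
    return np.where(a >= 3, "likely outlier",
                    np.where(a >= 2, "potential outlier", "normal"))

# Illustrative data (made up): a line with small noise and one shifted point
x = np.arange(1.0, 11.0)
noise = np.array([0.1, -0.2, 0.15, -0.1, 0.05, 3.0, -0.15, 0.1, -0.05, 0.2])
y = 2.0 * x + noise
X = np.column_stack([np.ones_like(x), x])

r = standardized_residuals(X, y)
labels = classify(r)
for xi, ri, lab in zip(x, r, labels):
    print(f"x={xi:4.1f}  r={ri:6.2f}  {lab}")
```

The thresholds are rules of thumb. With many observations, a few values beyond 2 are expected even when the model is correct.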
Internally Studentized Residual
$$r_i = \frac{e_i}{s\sqrt{1-h_i}}$$
Where:
- $h_i$ = leverage of observation $i$
- $s$ = residual standard error
- $1-h_i$ = adjustment for leverage, since $\operatorname{Var}(e_i) = \sigma^2(1-h_i)$
Leverage Formula:
$$h_i = \mathbf{x}_i^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{x}_i$$
For simple linear regression: $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum(x_j - \bar{x})^2}$
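To make the leverage and internally studentized formulas concrete, here is a small Python sketch. It assumes NumPy and a full-rank design matrix with the intercept column included. The helper names `leverages` and `internally_studentized` and the data are hypothetical.

```python
import numpy as np

def leverages(X):
    """h_i = x_i^T (X^T X)^{-1} x_i, the diagonal of the hat matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    # Row-wise quadratic form, avoiding the full n-by-n hat matrix
    return np.einsum("ij,jk,ik->i", X, XtX_inv, X)

def internally_studentized(X, y):
    """r_i = e_i / (s * sqrt(1 - h_i))."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    n, p = X.shape
    s = np.sqrt(e @ e / (n - p))
    return e / (s * np.sqrt(1.0 - leverages(X)))

# Illustrative data (made up for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])

h = leverages(X)
# Closed form for simple linear regression, used as a cross-check
h_simple = 1.0 / len(x) + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)
r_int = internally_studentized(X, y)
print(h, r_int)
```

For $x = 1, \dots, 5$ both routes give $h = (0.6, 0.3, 0.2, 0.3, 0.6)$, and the leverages sum to $p = 2$.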
Studentized Residuals
Externally Studentized Residual
$$t_i = \frac{e_i}{s_{(i)}\sqrt{1-h_i}}$$
Where:
- $s_{(i)}$ = residual standard error calculated without observation $i$
- $h_i$ = leverage of observation $i$
Calculation Steps:
- Remove observation $i$ from dataset
- Fit model to remaining $n-1$ observations
- Calculate $s_{(i)}$ from this reduced model
- Apply formula above
Distribution:
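Under the usual model assumptions (independent, normally distributed errors with constant variance), $t_i$ follows a $t$-distribution with $n-p-1$ degrees of freedom.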
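The calculation steps above can be sketched directly as a leave-one-out loop. The snippet also includes a shortcut that avoids refitting by using the standard deletion identity $s_{(i)}^2 = \dfrac{(n-p)s^2 - e_i^2/(1-h_i)}{n-p-1}$, which is not stated in the text above and is included here as an assumption-labelled cross-check. NumPy, the function names, and the data are illustrative.

```python
import numpy as np

def _residuals(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def _leverages(X):
    return np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

def externally_studentized_refit(X, y):
    """Follow the listed steps: drop i, refit, compute s_(i), apply the formula."""
    n, p = X.shape
    e, h = _residuals(X, y), _leverages(X)
    t = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                      # remove observation i
        e_red = _residuals(X[keep], y[keep])          # fit to remaining n-1 rows
        s_i = np.sqrt(e_red @ e_red / (n - 1 - p))    # s_(i) from reduced model
        t[i] = e[i] / (s_i * np.sqrt(1.0 - h[i]))
    return t

def externally_studentized_shortcut(X, y):
    """Same quantity from a single fit, via the deletion identity for s_(i)^2."""
    n, p = X.shape
    e, h = _residuals(X, y), _leverages(X)
    s2_del = (e @ e - e ** 2 / (1.0 - h)) / (n - p - 1)
    return e / np.sqrt(s2_del * (1.0 - h))

# Illustrative data (made up), with one point pulled away from the trend
x = np.arange(1.0, 9.0)
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 9.0, 7.1, 7.9])
X = np.column_stack([np.ones_like(x), x])

t_refit = externally_studentized_refit(X, y)
t_fast = externally_studentized_shortcut(X, y)
print(np.allclose(t_refit, t_fast))
```

The loop costs $n$ separate fits, so the single-fit version is usually preferable once the two have been checked against each other.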
Relationship Between Residual Types
$$t_i = r_i\sqrt{\frac{n-p-1}{n-p-r_i^2}}$$
This shows the relationship between internally studentized ($r_i$) and externally studentized ($t_i$) residuals.
Practical Note:
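When $|r_i|$ is small, $t_i \approx r_i$; the two are equal at $|r_i| = 1$, and $|t_i| > |r_i|$ whenever $|r_i| > 1$. The difference becomes significant for large residuals, especially when $n-p$ is small.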
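A quick numeric check of the conversion formula may help. The function name `external_from_internal` and the choice $n = 20$, $p = 2$ are assumptions for illustration only.

```python
import math

def external_from_internal(r, n, p):
    """t_i = r_i * sqrt((n - p - 1) / (n - p - r_i^2)).

    Requires r^2 < n - p; as r^2 approaches n - p the result diverges.
    """
    dof = n - p
    if r * r >= dof:
        raise ValueError("r^2 must be smaller than n - p")
    return r * math.sqrt((dof - 1) / (dof - r * r))

# Hypothetical setting: n = 20 observations, p = 2 parameters
for r in (0.5, 1.0, 2.0, 3.0):
    print(r, round(external_from_internal(r, n=20, p=2), 4))
```

With these values, $r_i = 1$ maps to $t_i = 1$ exactly, $r_i = 2$ maps to about $2.20$, and $r_i = 3$ maps to $\sqrt{17} \approx 4.12$.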
Advanced Formulas
PRESS Residual
$$e_{(i)} = y_i - \hat{y}_{(i)}$$
Where:
- $e_{(i)}$ = PRESS residual for observation $i$
- $\hat{y}_{(i)}$ = prediction for $y_i$ using model fitted without observation $i$
Relationship to Ordinary Residual:
$$e_{(i)} = \frac{e_i}{1-h_i}$$
PRESS Statistic:
$$PRESS = \sum_{i=1}^{n} e_{(i)}^2$$
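Used for model validation and comparison: among candidate models fitted to the same data, a smaller PRESS indicates better leave-one-out predictive performance.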
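The leverage shortcut means PRESS needs only one fit. The sketch below compares it with explicit leave-one-out refitting. It assumes NumPy and a full-rank design matrix with the intercept column included. The function names and data are hypothetical.

```python
import numpy as np

def press_leave_one_out(X, y):
    """PRESS by definition: refit without observation i, then predict y_i."""
    n = X.shape[0]
    e_del = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        e_del[i] = y[i] - X[i] @ beta_i        # e_(i) = y_i - y_hat_(i)
    return float(e_del @ e_del)

def press_shortcut(X, y):
    """PRESS from a single fit, using e_(i) = e_i / (1 - h_i)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    e_del = e / (1.0 - h)
    return float(e_del @ e_del)

# Illustrative data (made up for this sketch)
x = np.arange(1.0, 9.0)
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 9.0, 7.1, 7.9])
X = np.column_stack([np.ones_like(x), x])

press_loo = press_leave_one_out(X, y)
press_fast = press_shortcut(X, y)
print(press_loo, press_fast)
```

Because $0 < 1 - h_i \leq 1$, each PRESS residual is at least as large in magnitude as the ordinary residual, so PRESS is never smaller than RSS.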
Recursive Residual
$$w_t = \frac{y_t - \mathbf{x}_t^T\hat{\boldsymbol{\beta}}_{t-1}}{\sqrt{1 + \mathbf{x}_t^T(\mathbf{X}_{t-1}^T\mathbf{X}_{t-1})^{-1}\mathbf{x}_t}}$$
Where:
- $w_t$ = recursive residual at time $t$, defined for $t = p+1, \dots, n$ so that $\mathbf{X}_{t-1}^T\mathbf{X}_{t-1}$ is invertible
- $\hat{\boldsymbol{\beta}}_{t-1}$ = parameter estimates using first $t-1$ observations
- $\mathbf{X}_{t-1}$ = design matrix for first $t-1$ observations
Applications:
- Structural break testing
- Model stability analysis
- Sequential data analysis
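A direct, unoptimized sketch of the recursive residual formula follows. It refits at every step for clarity instead of using updating formulas. It assumes NumPy, observations in time order, and that the first $p$ rows of the design matrix are linearly independent. The function name and data are hypothetical.

```python
import numpy as np

def recursive_residuals(X, y):
    """w_t for t = p+1..n (1-based); the first p entries are left as NaN."""
    n, p = X.shape
    w = np.full(n, np.nan)
    for t in range(p, n):                  # 0-based row t uses rows 0..t-1
        X_prev, y_prev = X[:t], y[:t]
        XtX_inv = np.linalg.inv(X_prev.T @ X_prev)
        beta_prev = XtX_inv @ X_prev.T @ y_prev     # estimate from earlier rows
        x_t = X[t]
        forecast_error = y[t] - x_t @ beta_prev
        w[t] = forecast_error / np.sqrt(1.0 + x_t @ XtX_inv @ x_t)
    return w

# Illustrative time-ordered data (made up for this sketch)
x = np.arange(1.0, 9.0)
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 9.0, 7.1, 7.9])
X = np.column_stack([np.ones_like(x), x])

w = recursive_residuals(X, y)

# Cross-check: the squared recursive residuals sum to the full-sample RSS
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
rss_full = float(np.sum((y - X @ beta_full) ** 2))
print(np.nansum(w ** 2), rss_full)
```

The final comparison relies on a known property of recursive residuals in least-squares regression and serves as a sanity check on the implementation.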
Partial Residual
$$e_i^{(j)} = e_i + \hat{\beta}_j x_{ij}$$
Where:
- $e_i^{(j)}$ = partial residual for variable $j$ and observation $i$
- $e_i$ = ordinary residual
- $\hat{\beta}_j$ = estimated coefficient for variable $j$
- $x_{ij}$ = value of variable $j$ for observation $i$
Purpose:
Used in partial residual plots to check linearity of individual predictors in multiple regression.
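As an illustration, partial residuals for one predictor can be computed as below. The sketch assumes NumPy and a design matrix whose column 0 is the intercept. The function name `partial_residuals` and the two-predictor data are made up.

```python
import numpy as np

def partial_residuals(X, y, j):
    """e_i^(j) = e_i + beta_hat_j * x_ij for column j of the design matrix."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return e + beta[j] * X[:, j], beta

# Illustrative data (made up): intercept plus two predictors
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.8, 7.2, 7.9, 11.1, 11.8])
X = np.column_stack([np.ones_like(x1), x1, x2])

pr, beta = partial_residuals(X, y, j=1)

# Regressing the partial residuals on x1 alone recovers beta_hat_1,
# which is why the plot of pr against x1 shows the fitted partial slope.
Z = np.column_stack([np.ones_like(x1), x1])
coef, *_ = np.linalg.lstsq(Z, pr, rcond=None)
print(coef, beta[1])
```

In a partial residual plot, `pr` would go on the vertical axis against `x1`. Systematic curvature around the line with slope $\hat{\beta}_1$ suggests the predictor enters nonlinearly.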
Quick Reference Table
Residual Type | Formula | Main Use | Outlier Threshold |
---|---|---|---|
Raw | $e_i = y_i - \hat{y}_i$ | Basic analysis | Context dependent |
Standardized | $r_i = e_i / s$ | Scale-free comparison | $|r_i| > 2$ |
Studentized (Internal) | $r_i = e_i / (s\sqrt{1-h_i})$ | Leverage adjustment | $|r_i| > 2$ |
Studentized (External) | $t_i = e_i / (s_{(i)}\sqrt{1-h_i})$ | Robust outlier detection | $|t_i| > 2.5$ |
PRESS | $e_{(i)} = e_i / (1-h_i)$ | Cross-validation | Large PRESS values |