Residual Calculation Examples
Learn how to calculate and interpret residuals with real-world examples and step-by-step tutorials for practical applications.
Basic Residual Examples
Example 1: House Price Prediction
Scenario:
You're predicting house prices based on square footage. Your regression model predicts a house should cost $250,000, but it actually sold for $265,000.
Solution:
Step 1: Identify values
- Observed value (y) = $265,000
- Predicted value (ŷ) = $250,000
Step 2: Apply formula
$$e = y - \hat{y}$$
$$e = 265,000 - 250,000 = 15,000$$
Step 3: Interpret result
Residual = +$15,000
The positive residual means the model underestimated the house price by $15,000. The actual price was higher than predicted.
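The same calculation can be sketched in a few lines of Python. The helper name `residual` is purely illustrative and not part of any library; the numbers are the ones from this example.

```python
def residual(observed, predicted):
    """Return the raw residual e = y - y_hat."""
    return observed - predicted


# Values from Example 1
e = residual(265_000, 250_000)

# Positive residual: model underestimated; negative: model overestimated
if e > 0:
    direction = "underestimated"
elif e < 0:
    direction = "overestimated"
else:
    direction = "matched"

print(f"Residual: {e:+,} -> model {direction} the price")
```

Keeping the sign convention (observed minus predicted) inside one function makes it harder to flip the subtraction by accident later on.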
Example 2: Student Test Scores
Scenario:
A teacher uses study hours to predict test scores for 5 students. Calculate residuals for each student to see how well the model performs.
Data:
Student | Study Hours | Actual Score | Predicted Score | Residual |
---|---|---|---|---|
Alice | 5 | 85 | 82 | +3 |
Bob | 8 | 92 | 95 | -3 |
Carol | 3 | 78 | 75 | +3 |
Dave | 10 | 98 | 101 | -3 |
Eve | 6 | 88 | 88 | 0 |
Analysis:
- Alice & Carol: Model underestimated their scores (positive residuals)
- Bob & Dave: Model overestimated their scores (negative residuals)
- Eve: Perfect prediction (zero residual)
- Overall: All residuals fall within ±3 points, which suggests a good fit for these five students
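As a rough sketch, the residual column of the table can be reproduced with plain Python. The variable names (`students`, `residuals`) are assumptions chosen for illustration; the data is copied from the table above.

```python
# (name, study hours, actual score, predicted score) from the table
students = [
    ("Alice", 5, 85, 82),
    ("Bob", 8, 92, 95),
    ("Carol", 3, 78, 75),
    ("Dave", 10, 98, 101),
    ("Eve", 6, 88, 88),
]

# Residual = actual - predicted, keyed by student name
residuals = {
    name: actual - predicted
    for name, _hours, actual, predicted in students
}

for name, e in residuals.items():
    print(f"{name:<6}{e:+d}")

# Two simple summaries of fit for this small sample
mean_residual = sum(residuals.values()) / len(residuals)
max_abs_residual = max(abs(e) for e in residuals.values())
print("Mean residual:", mean_residual)
print("Largest absolute residual:", max_abs_residual)
```

For this data the positive and negative residuals cancel, so the mean is zero, and no prediction misses by more than 3 points.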
Regression Analysis Examples
Example 3: Standardized Residuals in Sales Forecasting
Scenario:
A company uses advertising spend to predict monthly sales. You need to identify which months had unusually high or low sales relative to the advertising spend.
Given Information:
- January sales: $50,000 (predicted: $48,000)
- Residual standard error: $3,000
- Raw residual: $50,000 - $48,000 = $2,000
Calculate Standardized Residual:
Step 1: Apply standardized residual formula
$$r = \frac{e}{s} = \frac{2,000}{3,000} = 0.67$$
Step 2: Interpret result
Standardized Residual = 0.67
Since |0.67| < 2, this is considered a normal observation. The sales were slightly higher than expected but within normal variation.
Interpretation Guidelines:
- |r| < 2: Normal observation
- 2 ≤ |r| < 3: Potential outlier, investigate
- |r| ≥ 3: Likely outlier, requires attention
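The calculation and the interpretation guidelines above could be combined in a short sketch like the following. The function names `standardized_residual` and `classify` are hypothetical, and the cutoffs of 2 and 3 are the rule-of-thumb values from the guidelines, not fixed statistical laws.

```python
def standardized_residual(observed, predicted, residual_std_error):
    """Return r = e / s, the residual divided by the residual standard error."""
    return (observed - predicted) / residual_std_error


def classify(r):
    """Label a standardized residual using the rule-of-thumb cutoffs 2 and 3."""
    size = abs(r)
    if size < 2:
        return "normal observation"
    if size < 3:
        return "potential outlier, investigate"
    return "likely outlier, requires attention"


# January figures from Example 3
r_january = standardized_residual(50_000, 48_000, 3_000)
print(f"r = {r_january:.2f}: {classify(r_january)}")
```

Because `classify` works on the absolute value, unusually low months are flagged the same way as unusually high ones.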
Example 4: Multiple Regression - Car Price Prediction
Scenario:
Predicting used car prices using age, mileage, and brand. A 3-year-old Honda Civic with 30,000 miles sold for $18,500, but your model predicted $17,800.
Model Information:
- Residual: $18,500 - $17,800 = $700
- Leverage (h): 0.08
- MSE: 1,500,000 (residual standard error: s = √MSE ≈ $1,225)
Calculate Studentized Residual:
Step 1: Calculate studentized residual
$$t = \frac{e}{s\sqrt{1-h}} = \frac{700}{1,225 \times \sqrt{1-0.08}}$$
$$t = \frac{700}{1,225 \times 0.959} = \frac{700}{1,175} = 0.596$$
Step 2: Interpret result
Studentized Residual = 0.596
This is well below the threshold of 2, indicating the car's price is consistent with the model's expectations. The residual is positive, so the model slightly underpredicted the price, but the difference is within normal variation.
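A minimal sketch of the studentized residual formula used above, assuming the leverage and MSE are already available from the fitted model. The function name `studentized_residual` is illustrative; it computes s from the MSE instead of using the rounded $1,225.

```python
import math


def studentized_residual(residual, mse, leverage):
    """Return t = e / (s * sqrt(1 - h)), where s = sqrt(MSE)."""
    if not 0 <= leverage < 1:
        raise ValueError("leverage must satisfy 0 <= h < 1")
    s = math.sqrt(mse)
    return residual / (s * math.sqrt(1.0 - leverage))


# Values from Example 4
t = studentized_residual(residual=700, mse=1_500_000, leverage=0.08)
print(f"Studentized residual: {t:.3f}")
```

With unrounded intermediate values the result still rounds to 0.596, so the hand calculation above is not sensitive to the rounding of s.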
Model Diagnostic Examples
Example 5: Detecting Outliers in Medical Data
Scenario:
A hospital uses patient age to predict length of stay. One patient's data seems unusual - they stayed much longer than expected.
Patient Data:
- Age: 45 years
- Actual stay: 12 days
- Predicted stay: 4 days
- Model standard error: 2 days
Outlier Analysis:
Step 1: Calculate raw residual
$$e = 12 - 4 = 8 \text{ days}$$
Step 2: Calculate standardized residual
$$r = \frac{8}{2} = 4$$
Step 3: Evaluate outlier status
Standardized Residual = 4
Since |4| ≥ 3, this is a likely outlier by the guidelines in Example 3. The patient stayed 8 days (4 standard errors) longer than the model predicted, suggesting there may have been complications or other factors not captured by age alone.
Next Steps:
- Investigate the patient's case for unusual circumstances
- Consider additional predictor variables (e.g., diagnosis, severity)
- Decide whether to include or exclude this observation
- Re-evaluate model assumptions and fit
Example 6: Checking Model Assumptions
Scenario:
After fitting a linear regression model, you need to check whether the model assumptions are satisfied using residual analysis.
Key Assumptions to Check:
1. Linearity
Check: Plot residuals vs. fitted values
Good: Random scatter around zero
Bad: Curved pattern suggests non-linearity
2. Homoscedasticity (Constant Variance)
Check: Plot residuals vs. fitted values
Good: Constant spread across all fitted values
Bad: Funnel shape suggests changing variance
3. Normality of Residuals
Check: Q-Q plot of residuals
Good: Points fall close to diagonal line
Bad: Systematic deviations from line
4. Independence
Check: Plot residuals vs. time (if applicable)
Good: No systematic patterns
Bad: Trends or cycles in residuals
Sample Residual Analysis:
For a dataset of 20 observations, standardized residuals should:
- Have approximately 95% of values between -2 and +2 (about 19 of the 20)
- Show no clear patterns when plotted
- Follow approximately normal distribution
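The first of these checks can be sketched as a simple count. The 20 standardized residuals below are hypothetical values invented for illustration, not results from any real model, and `share_within` is an assumed helper name.

```python
def share_within(std_residuals, limit=2.0):
    """Return the fraction of standardized residuals with |r| < limit."""
    inside = sum(1 for r in std_residuals if abs(r) < limit)
    return inside / len(std_residuals)


# Hypothetical standardized residuals for 20 observations (illustrative only)
std_residuals = [
    0.3, -1.1, 0.8, -0.4, 1.6, -0.9, 0.1, 2.4, -1.7, 0.5,
    -0.2, 1.0, -1.3, 0.7, -0.6, 1.2, -0.8, 0.0, 0.9, -1.5,
]

share = share_within(std_residuals)

# Keep the positions of the flagged observations so they can be investigated
flagged = [(i, r) for i, r in enumerate(std_residuals) if abs(r) >= 2]

print(f"Share within +/-2: {share:.0%}")
print("Flagged (index, value):", flagged)
```

A count like this only covers the 95% guideline; pattern and normality checks still need the plots described above.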
Common Mistakes to Avoid
Wrong Sign Interpretation
Mistake: Confusing positive and negative residuals
Remember: Positive = underestimated, Negative = overestimated
Ignoring Scale
Mistake: Comparing raw residuals across different scales
Solution: Use standardized residuals for comparison
Wrong Standard Error
Mistake: Using incorrect standard error in calculations
Solution: Use residual standard error from your model
Outlier Overreaction
Mistake: Automatically removing all large residuals
Solution: Investigate outliers before deciding what to do
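To illustrate the "Ignoring Scale" mistake, the sketch below reuses the residuals and standard errors from Examples 3 and 5. The dictionary layout and names are assumptions made for this illustration.

```python
# (raw residual, residual standard error) taken from Examples 3 and 5
cases = {
    "January sales (USD)": (2_000, 3_000),
    "Hospital stay (days)": (8, 2),
}

# Dividing by each model's own standard error puts both on a unitless scale
standardized = {label: e / s for label, (e, s) in cases.items()}

for label, (e, _s) in cases.items():
    print(f"{label:<22} raw = {e:>6,}  standardized = {standardized[label]:.2f}")
```

The raw sales residual is numerically far larger (2,000 versus 8), yet after standardizing it is the hospital stay that stands out, at 4 standard errors against 0.67.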
Practice These Examples
Use our calculator to work through these examples and try your own data.