Residual Calculation Examples

Learn how to calculate and interpret residuals with real-world examples and step-by-step tutorials for practical applications.

Basic Residual Examples

Example 1: House Price Prediction

Scenario:

You're predicting house prices based on square footage. Your regression model predicts a house should cost $250,000, but it actually sold for $265,000.

Solution:

Step 1: Identify values
  • Observed value (y) = $265,000
  • Predicted value (ŷ) = $250,000
Step 2: Apply formula

$$e = y - \hat{y}$$

$$e = 265,000 - 250,000 = 15,000$$

Step 3: Interpret result

Residual = +$15,000

The positive residual means the model underestimated the house price by $15,000. The actual price was higher than predicted.
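The three steps above can be sketched in a few lines of Python. The function name `residual` and the variable names are illustrative choices, not part of any library.

```python
def residual(observed, predicted):
    """Return the raw residual e = y - y_hat."""
    return observed - predicted


# Values from Example 1: sold for $265,000, predicted $250,000.
e = residual(265_000, 250_000)

# A positive residual means the prediction was too low.
if e > 0:
    direction = "underestimated"
elif e < 0:
    direction = "overestimated"
else:
    direction = "matched"

print(f"Residual: {e:+,} (the model {direction} the price)")
```

The sign convention is the whole point: subtracting the prediction from the observation, not the other way around, is what makes a positive result mean "underestimated".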

Example 2: Student Test Scores

Scenario:

A teacher uses study hours to predict test scores for 5 students. Calculate residuals for each student to see how well the model performs.

Data:

Student   Study Hours   Actual Score   Predicted Score   Residual
Alice     5             85             82                +3
Bob       8             92             95                -3
Carol     3             78             75                +3
Dave      10            98             101               -3
Eve       6             88             88                0

Analysis:

  • Alice & Carol: Model underestimated their scores (positive residuals)
  • Bob & Dave: Model overestimated their scores (negative residuals)
  • Eve: Perfect prediction (zero residual)
  • Overall: Every prediction is within 3 points of the actual score, which suggests a good fit for these five students
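As a rough sketch, the residual column can be reproduced from the actual and predicted scores in the table. The mean residual and mean absolute residual computed at the end are extra summaries added here for illustration; the example itself does not use them.

```python
# (student, actual score, predicted score) taken from the table in Example 2.
students = [
    ("Alice", 85, 82),
    ("Bob", 92, 95),
    ("Carol", 78, 75),
    ("Dave", 98, 101),
    ("Eve", 88, 88),
]

# Residual = actual - predicted, for each student.
residuals = {name: actual - predicted for name, actual, predicted in students}

# Illustrative summaries (an addition, not part of the original example).
mean_residual = sum(residuals.values()) / len(residuals)
mean_abs_residual = sum(abs(r) for r in residuals.values()) / len(residuals)

for name, r in residuals.items():
    print(f"{name}: {r:+d}")
print(f"Mean residual: {mean_residual:.1f}")
print(f"Mean absolute residual: {mean_abs_residual:.1f}")
```

Here the positive and negative residuals cancel, so the mean residual is zero even though most predictions miss by 3 points. The mean absolute residual is the more informative number when judging typical error size.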

Regression Analysis Examples

Example 3: Standardized Residuals in Sales Forecasting

Scenario:

A company uses advertising spend to predict monthly sales. You need to identify which months had unusually high or low sales relative to the advertising spend.

Given Information:

  • January sales: $50,000 (predicted: $48,000)
  • Residual standard error: $3,000
  • Raw residual: $50,000 - $48,000 = $2,000

Calculate Standardized Residual:

Step 1: Apply standardized residual formula

$$r = \frac{e}{s} = \frac{2,000}{3,000} = 0.67$$

Step 2: Interpret result

Standardized Residual = 0.67

Since |0.67| < 2, this is considered a normal observation. The sales were slightly higher than expected but within normal variation.

Interpretation Guidelines:

  • |r| < 2: Normal observation
  • 2 ≤ |r| < 3: Potential outlier, investigate
  • |r| ≥ 3: Likely outlier, requires attention
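The calculation and the guideline thresholds above can be combined into a small helper. This is a minimal sketch: the function names are hypothetical, and the cutoffs of 2 and 3 are the rules of thumb listed above, not fixed statistical limits.

```python
def standardized_residual(observed, predicted, residual_std_error):
    """Return r = e / s, where e = observed - predicted."""
    return (observed - predicted) / residual_std_error


def classify(r):
    """Label a standardized residual using the guideline cutoffs of 2 and 3."""
    size = abs(r)
    if size < 2:
        return "normal observation"
    if size < 3:
        return "potential outlier, investigate"
    return "likely outlier, requires attention"


# January from Example 3: sales $50,000, predicted $48,000, s = $3,000.
r_january = standardized_residual(50_000, 48_000, 3_000)
print(f"r = {r_january:.2f}: {classify(r_january)}")
```

Taking the absolute value inside `classify` means unusually low months are flagged the same way as unusually high ones.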

Example 4: Multiple Regression - Car Price Prediction

Scenario:

Predicting used car prices using age, mileage, and brand. A 3-year-old Honda Civic with 30,000 miles sold for $18,500, but your model predicted $17,800.

Model Information:

  • Residual: $18,500 - $17,800 = $700
  • Leverage (h): 0.08
  • MSE: 1,500,000 (residual standard error: s = √MSE ≈ $1,225)

Calculate Studentized Residual:

Step 1: Calculate studentized residual

$$t = \frac{e}{s\sqrt{1-h}} = \frac{700}{1,225 \times \sqrt{1-0.08}}$$

$$t = \frac{700}{1,225 \times 0.959} = \frac{700}{1,175} = 0.596$$

Step 2: Interpret result

Studentized Residual = 0.596

This is well below the threshold of 2, indicating the car's price is consistent with the model's expectations. The residual is positive, so the model slightly underpredicted the price, but the difference is within normal variation.
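The same arithmetic can be checked in Python. This sketch follows the formula used above, with s taken as the square root of the full model's MSE (the form often called the internally studentized residual). The function name is illustrative.

```python
import math


def studentized_residual(residual, mse, leverage):
    """Return t = e / (s * sqrt(1 - h)), with s = sqrt(MSE)."""
    s = math.sqrt(mse)
    return residual / (s * math.sqrt(1.0 - leverage))


# Values from Example 4: e = $700, MSE = 1,500,000, h = 0.08.
t = studentized_residual(700, 1_500_000, 0.08)
print(f"Studentized residual: {t:.3f}")
```

Because 1 - h shrinks as leverage grows, a high-leverage observation gets a larger studentized residual than a low-leverage one with the same raw residual.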

Model Diagnostic Examples

Example 5: Detecting Outliers in Medical Data

Scenario:

A hospital uses patient age to predict length of stay. One patient's data seems unusual: they stayed much longer than expected.

Patient Data:

  • Age: 45 years
  • Actual stay: 12 days
  • Predicted stay: 4 days
  • Model standard error: 2 days

Outlier Analysis:

Step 1: Calculate raw residual

$$e = 12 - 4 = 8 \text{ days}$$

Step 2: Calculate standardized residual

$$r = \frac{8}{2} = 4$$

Step 3: Evaluate outlier status

Standardized Residual = 4

Since |4| ≥ 3, this is likely an outlier under the guidelines in Example 3. The patient stayed 8 days longer than the model predicted, which is four standard errors above the expected stay. This suggests complications or other factors not captured by age alone.

Next Steps:

  • Investigate the patient's case for unusual circumstances
  • Consider additional predictor variables (e.g., diagnosis, severity)
  • Decide whether to include or exclude this observation
  • Re-evaluate model assumptions and fit

Example 6: Checking Model Assumptions

Scenario:

After fitting a linear regression model, you need to check whether the model assumptions are satisfied using residual analysis.

Key Assumptions to Check:

1. Linearity

Check: Plot residuals vs. fitted values

Good: Random scatter around zero

Bad: Curved pattern suggests non-linearity

2. Homoscedasticity (Constant Variance)

Check: Plot residuals vs. fitted values

Good: Constant spread across all fitted values

Bad: Funnel shape suggests changing variance

3. Normality of Residuals

Check: Q-Q plot of residuals

Good: Points fall close to diagonal line

Bad: Systematic deviations from line

4. Independence

Check: Plot residuals vs. time (if applicable)

Good: No systematic patterns

Bad: Trends or cycles in residuals

Sample Residual Analysis:

For a dataset of 20 observations, standardized residuals should:

  • Have approximately 95% of values between -2 and +2 (about 19 of the 20 observations)
  • Show no clear patterns when plotted
  • Follow approximately normal distribution
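The first check in this list is easy to automate. The sketch below uses made-up standardized residuals for 20 observations. They are hypothetical values chosen only to illustrate the count and do not come from any of the examples above.

```python
# Hypothetical standardized residuals for 20 observations (illustrative only).
std_residuals = [
    0.3, -1.1, 0.8, -0.4, 1.6, -0.9, 0.1, 2.3, -1.7, 0.5,
    -0.2, 1.0, -1.4, 0.7, -0.6, 1.2, -0.1, 0.4, -1.9, 0.9,
]

# Count how many fall strictly between -2 and +2.
within_two = sum(1 for r in std_residuals if abs(r) < 2)
share = within_two / len(std_residuals)

print(f"{within_two} of {len(std_residuals)} within ±2 ({share:.0%})")
```

With only 20 observations, one or two values outside ±2 is unremarkable. The pattern and normality checks still need a plot, which this count cannot replace.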

Common Mistakes to Avoid

❌ Wrong Sign Interpretation

Mistake: Confusing positive and negative residuals

Remember: Positive = underestimated, Negative = overestimated

❌ Ignoring Scale

Mistake: Comparing raw residuals across different scales

Solution: Use standardized residuals for comparison
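To see why scale matters, compare the raw residuals from Example 3 ($2,000 with a standard error of $3,000) and Example 5 (8 days with a standard error of 2 days). The short sketch below ranks the two cases both ways. The dictionary layout and variable names are illustrative.

```python
# (raw residual, residual standard error) from Examples 3 and 5.
cases = {
    "January sales (Example 3)": (2_000, 3_000),
    "Hospital stay (Example 5)": (8, 2),
}

# Standardize each residual by its own model's standard error.
standardized = {label: e / s for label, (e, s) in cases.items()}

largest_raw = max(cases, key=lambda label: abs(cases[label][0]))
largest_std = max(standardized, key=lambda label: abs(standardized[label]))

print(f"Largest raw residual: {largest_raw}")
print(f"Largest standardized residual: {largest_std}")
```

The sales residual is the larger raw number, but only because it is measured in dollars. After standardizing, the hospital stay is the one that stands out.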

❌ Wrong Standard Error

Mistake: Using incorrect standard error in calculations

Solution: Use residual standard error from your model

❌ Outlier Overreaction

Mistake: Automatically removing all large residuals

Solution: Investigate outliers before deciding what to do
