Regression

1. What Is Linear Regression

Simple linear regression: Technique for modeling a linear relationship between two quantitative variables.
Goal: Predict values of a response (dependent) variable from an explanatory (independent, or predictor) variable.
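
As a minimal sketch of that goal, the snippet below fits a least-squares line to a small data set and uses it to predict a response value. The function name fit_line, the symbols b0 (intercept) and b1 (slope), and the data are all illustrative assumptions, not taken from these notes.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical data: x is the explanatory variable, y is the response.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
# Predict at a point inside the observed x range (1 to 5).
y_hat = b0 + b1 * 3.5
print(f"intercept={b0:.3f}, slope={b1:.3f}, prediction at x=3.5: {y_hat:.3f}")
```

The line always passes through the point of means, which is why the intercept is computed from the slope and the two means.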

Equations:


2. Key Calculations


3. Residuals & Model Fit


4. Assumptions & Pitfalls

  1. Linearity: Relationship must be approximately linear (verify with scatterplot).
  2. Extrapolation: Avoid predictions outside the observed data range.
    • Example: Predicting a son's height for a father who is 0 inches tall is nonsensical.
  3. Correlation ≠ Causation: A high R² does not imply that x causes y.
  4. Outliers: Can disproportionately affect slope and intercept.
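
Two of the pitfalls above, extrapolation and outliers, are easy to see numerically. The sketch below uses made-up data and hypothetical helper names (fit_line, predict). It refits a line after adding one outlier and refuses to predict outside the observed x range.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def predict(x_new, x_observed, b0, b1):
    """Predict y, but only inside the observed range of x (no extrapolation)."""
    lo, hi = min(x_observed), max(x_observed)
    if not lo <= x_new <= hi:
        raise ValueError(f"x={x_new} is outside the observed range [{lo}, {hi}]")
    return b0 + b1 * x_new


# Hypothetical, perfectly linear data: y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
clean = fit_line(x, y)

# Same data plus a single outlier at (6, 0).
with_outlier = fit_line(x + [6], y + [0.0])

print("slope without outlier:", round(clean[1], 3))
print("slope with outlier:   ", round(with_outlier[1], 3))

try:
    predict(0, x, *clean)
except ValueError as err:
    print("refused:", err)
```

One added point is enough to pull the slope far from its original value, which is why a scatterplot check is worth doing before trusting the fit.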

5. Comparison with Correlation

Aspect     Correlation (r)            Regression
Symmetry   Symmetric (r_xy = r_yx)    Asymmetric (switching x and y changes the line)
Units      Unitless                   Slope has units (y-units per x-unit)
Purpose    Measures association       Predicts y from x
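
The symmetry row can be checked directly. In the sketch below (illustrative data and helper names), swapping x and y leaves r unchanged but gives a different slope. For simple linear regression, the product of the two slopes equals r squared.

```python
import math


def sums(a, b):
    """Return (S_aa, S_bb, S_ab): sums of squared and cross deviations."""
    n = len(a)
    a_mean, b_mean = sum(a) / n, sum(b) / n
    s_aa = sum((ai - a_mean) ** 2 for ai in a)
    s_bb = sum((bi - b_mean) ** 2 for bi in b)
    s_ab = sum((ai - a_mean) * (bi - b_mean) for ai, bi in zip(a, b))
    return s_aa, s_bb, s_ab


def correlation(a, b):
    s_aa, s_bb, s_ab = sums(a, b)
    return s_ab / math.sqrt(s_aa * s_bb)


def slope(predictor, response):
    """Slope of the least-squares line predicting response from predictor."""
    s_pp, _, s_pr = sums(predictor, response)
    return s_pr / s_pp


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy, r_yx = correlation(x, y), correlation(y, x)
slope_y_on_x = slope(x, y)  # predicts y from x
slope_x_on_y = slope(y, x)  # predicts x from y

print("r_xy == r_yx:", math.isclose(r_xy, r_yx))
print("slope of y on x:", slope_y_on_x)
print("slope of x on y:", slope_x_on_y)
print("product of slopes vs r^2:", slope_y_on_x * slope_x_on_y, r_xy ** 2)
```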

7. Complementary Tools

Multiple Linear Regression (MLR)

MLR extends simple linear regression by incorporating multiple independent variables to predict a single dependent variable.
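
To make the extension concrete, here is a minimal sketch that fits an MLR model with two predictors by solving the normal equations (X'X)b = X'y. The helper names solve and fit_mlr, the data, and the use of Gaussian elimination are illustrative assumptions. In practice a numerical library would normally handle this step.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination
    with partial pivoting. a is a square list of lists, b is a list."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit_mlr(rows, y):
    """Return [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2,
# so the fit should recover those coefficients (up to rounding).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coef = fit_mlr(rows, y)
print([round(c, 6) for c in coef])
```

Each coefficient is read as the expected change in the response per unit change in that predictor, holding the other predictors fixed.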

Key Components:

Key Measures:

Advanced Concepts:

Regression Error Analysis

Regression error refers to the discrepancy between observed values and those predicted by the regression model.
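
As a small sketch of that definition, the code below computes residuals (observed minus predicted) and a few commonly used summaries of them. The data, the predicted values, and the choice of metrics are illustrative assumptions rather than the list these notes intended. Note that some texts divide the squared-error sum by the degrees of freedom instead of n.

```python
import math

# Hypothetical observed responses and model predictions for the same cases.
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

# Residual = observed value minus predicted value.
residuals = [o - p for o, p in zip(observed, predicted)]

n = len(observed)
sse = sum(e ** 2 for e in residuals)         # sum of squared errors
mse = sse / n                                # mean squared error (divides by n)
rmse = math.sqrt(mse)                        # root mean squared error
mae = sum(abs(e) for e in residuals) / n     # mean absolute error

# R^2 compares the model's squared error with that of predicting the mean.
mean_obs = sum(observed) / n
sst = sum((o - mean_obs) ** 2 for o in observed)
r_squared = 1 - sse / sst

print("residuals:", [round(e, 2) for e in residuals])
print(f"SSE={sse:.3f} MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R^2={r_squared:.3f}")
```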

Key Error Metrics:

Diagnostic Tools:

Common Error Issues:

Model Validation:

Connections Between ANOVA and Regression