Hypothesis Testing Basics

Purpose and Rationale

Why Do We Need Hypothesis Testing?

Hypothesis testing serves several crucial purposes in statistical analysis:

  1. Making Data-Driven Decisions

    • Provides a structured framework for making decisions based on sample data
    • Helps avoid making decisions based on intuition or anecdotal evidence
    • Allows us to quantify the strength of evidence for or against a claim
  2. Scientific Method Application

    • Enables systematic testing of theories and claims
    • Provides a way to falsify hypotheses (following Popper's philosophy)
    • Allows for replication and verification of results
  3. Risk Management

    • Helps quantify the probability of making incorrect decisions
    • Provides a way to control Type I and Type II errors
    • Allows for setting acceptable levels of risk in decision-making

The Rationale Behind Hypothesis Testing

  1. Statistical Inference

    • We can't observe entire populations, so we use samples
    • Sample results vary due to random sampling
    • Need a method to distinguish between:
      • Real effects/patterns
      • Random variation in the data
  2. Burden of Proof

    • Starts with a skeptical position (null hypothesis)
    • Requires strong evidence to reject the null
    • Protects against false discoveries
    • Similar to "innocent until proven guilty" in legal systems
  3. Quantifying Uncertainty

    • Provides a way to measure the strength of evidence
    • Allows for comparison of different studies
    • Helps in making informed decisions under uncertainty

Introduction to Hypothesis Testing

| Aspect | Description |
| --- | --- |
| Purpose | Formal statistical technique for answering binary (yes/no) questions about a population using sample data |
| Contrast with | Confidence intervals, which estimate parameters or provide ranges of plausible values |
| Key Feature | Compares two competing possibilities about a population parameter |

Examples of Hypothesis Testing Questions

| Type | Example |
| --- | --- |
| Fairness Test | Is a coin fair? (yes/no) |
| Treatment Effect | Is a new drug more effective than an existing one? (yes/no) |
| Quality Control | Does a manufacturing process meet specifications? (yes/no) |
| Population Parameter | Is the mean height of a population different from a known value? |
| Relationship | Is there a relationship between two categorical variables? |
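
The coin-fairness question above can be sketched as an exact binomial test. This is a minimal illustration, not a prescribed method: the function name `binomial_two_sided_p` and the data (60 heads in 100 flips) are hypothetical, and the two-sided p-value is taken to be the total probability, under H0, of every outcome no more likely than the observed one.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value for H0: p = p0 given k successes in n trials.

    Sums the H0 probabilities of all outcomes that are no more likely
    than the observed count k.
    """
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= threshold))


# Hypothetical data: 60 heads in 100 flips of the coin under test.
heads, flips = 60, 100
p_value = binomial_two_sided_p(heads, flips)
print(f"two-sided p-value = {p_value:.4f}")
```

With these made-up numbers the p-value lands slightly above 0.05, so at α = 0.05 the test would fail to reject H0; that is not the same as showing the coin is fair.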

Court Case Analogy

| Concept | Court Case | Hypothesis Testing |
| --- | --- | --- |
| Initial Assumption | Innocence | Null Hypothesis (H0) |
| Alternative | Guilt | Alternative Hypothesis (Ha) |
| Evidence Required | Beyond reasonable doubt | Small p-value |
| Decision | Not guilty ≠ innocent | Fail to reject H0 ≠ H0 is true |
| Burden of Proof | On prosecution | On alternative hypothesis |

Core Components: The Hypotheses

| Component | Description | Format |
| --- | --- | --- |
| Null Hypothesis (H0) | Represents skepticism, status quo, or no effect | H0: parameter = hypothesized value |
| Alternative Hypothesis (Ha) | Represents what we aim to find evidence for | Ha: parameter > value (right-tailed); Ha: parameter < value (left-tailed); Ha: parameter ≠ value (two-tailed) |
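
The choice among the right-tailed, left-tailed, and two-tailed alternatives determines which tail area counts as the p-value. The sketch below assumes the test statistic is a z-score that is standard normal under H0; the function name `p_value_from_z` and the value z = 1.5 are illustrative only.

```python
from statistics import NormalDist


def p_value_from_z(z: float, alternative: str) -> float:
    """Tail area of the standard normal matching the form of Ha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":  # Ha: parameter > value (right-tailed)
        return 1 - std_normal.cdf(z)
    if alternative == "less":  # Ha: parameter < value (left-tailed)
        return std_normal.cdf(z)
    if alternative == "two-sided":  # Ha: parameter != value (two-tailed)
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


z = 1.5  # hypothetical test statistic
for alt in ("greater", "less", "two-sided"):
    print(f"{alt:>9}: p = {p_value_from_z(z, alt):.4f}")
```

For the same z, the two-tailed p-value is twice the smaller one-tailed value, which is why the direction should be fixed before looking at the data.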

Detailed Hypothesis Testing Procedure

| Step | Description | Key Components |
| --- | --- | --- |
| 1 | Define Hypotheses | State H0 and Ha clearly; choose appropriate test type; determine if one- or two-tailed |
| 2 | Collect Data and Check Conditions | Gather random sample(s); verify sample size conditions; check distribution assumptions; ensure independence |
| 3 | Calculate Test Statistic | Compute sample statistic; standardize using the standard error; account for degrees of freedom if needed |
| 4 | Determine P-value | Identify sampling distribution; calculate probability of results at least as extreme; consider test direction (one- or two-tailed) |
| 5 | Make Decision | Compare p-value to significance level; state conclusion in context; consider practical significance |
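
The five steps can be traced with a one-proportion z-test. This is a sketch under stated assumptions: the quality-control scenario, the spec value of 10% defects, and the counts (52 defects in 400 items) are all hypothetical, and the "at least 10 expected successes and failures" check is one common rule of thumb rather than the only acceptable condition.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: define hypotheses. H0: p = 0.10, Ha: p > 0.10 (right-tailed).
p0 = 0.10
alpha = 0.05  # significance level chosen before seeing the data

# Step 2: collect data (hypothetical counts) and check conditions.
n, defects = 400, 52
conditions_ok = n * p0 >= 10 and n * (1 - p0) >= 10  # rule-of-thumb check

# Step 3: compute the sample statistic and standardize it.
p_hat = defects / n
standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0 because H0 is assumed true
z = (p_hat - p0) / standard_error

# Step 4: p-value = probability of a result at least this extreme under H0.
p_value = 1 - NormalDist().cdf(z)

# Step 5: compare with alpha and state the decision.
reject_h0 = p_value < alpha
print(f"conditions ok: {conditions_ok}")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Here z works out to 2 standard errors above the hypothesized rate, so the test rejects H0 at α = 0.05. Whether a defect rate of 13% instead of 10% matters in practice is a separate question.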

Key Considerations

| Aspect | Description |
| --- | --- |
| Sample Size | Affects power and validity of assumptions |
| Significance Level | Pre-determined threshold (often α = 0.05) |
| Test Direction | One-tailed vs two-tailed affects p-value calculation |
| Conditions | Must be verified for valid inference |
| Practical Significance | Statistical significance ≠ practical importance |
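
To see how sample size interacts with significance, the sketch below runs the same two-sided z-test on one fixed, small observed difference at two sample sizes. Everything here is assumed for illustration: a known population standard deviation of 10, a hypothesized mean of 100, an observed mean of 100.5, and the helper name `two_sided_z_test`.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return (z, p-value) for H0: mu = mu0 against Ha: mu != mu0, sigma known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


mu0, sigma = 100.0, 10.0  # hypothetical null value and known SD
observed_mean = 100.5  # the same half-unit difference in both cases

z_small, p_small = two_sided_z_test(observed_mean, mu0, sigma, n=100)
z_large, p_large = two_sided_z_test(observed_mean, mu0, sigma, n=10_000)

print(f"n = 100:    z = {z_small:.2f}, p = {p_small:.3g}")
print(f"n = 10000:  z = {z_large:.2f}, p = {p_large:.3g}")
```

The half-unit difference is nowhere near significant at n = 100 but overwhelmingly significant at n = 10,000, even though its practical size has not changed.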

Common Misconceptions

| Misconception | Reality |
| --- | --- |
| "Fail to reject" means H0 is true | It only means insufficient evidence to reject |
| Small p-value proves Ha | It only provides evidence against H0 |
| Large p-value proves H0 | It only means insufficient evidence against H0 |
| Statistical significance = practical importance | They are different concepts |
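
The first and third misconceptions can be made concrete by computing how often a test fails to reject H0 when H0 is actually false. The sketch assumes a coin whose true heads probability is 0.55 (a hypothetical value) and a two-sided one-proportion z-test of H0: p = 0.5; the helper names are illustrative.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability, computed on the log scale to avoid overflow."""
    log_pmf = (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log(1 - p)
    )
    return exp(log_pmf)


def type_ii_error(n: int, true_p: float, p0: float = 0.5, alpha: float = 0.05) -> float:
    """P(fail to reject H0: p = p0) when the true proportion is true_p."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p0 * (1 - p0) / n)
    fail_to_reject = 0.0
    for k in range(n + 1):
        z = (k / n - p0) / se
        if abs(z) < z_crit:  # this outcome would not lead to rejection
            fail_to_reject += binom_pmf(k, n, true_p)
    return fail_to_reject


beta_small = type_ii_error(n=50, true_p=0.55)
beta_large = type_ii_error(n=2000, true_p=0.55)
print(f"n = 50:   P(fail to reject | p = 0.55) = {beta_small:.3f}")
print(f"n = 2000: P(fail to reject | p = 0.55) = {beta_large:.3f}")
```

With only 50 flips the test usually fails to reject even though the coin is biased, so a large p-value from a small sample says little about whether H0 is true.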

Comprehensive Guide to Hypothesis Tests

| Section / Term | What it refers to | How to interpret it | Why it matters |
| --- | --- | --- | --- |
| Call | Formula you passed to `lm()` (e.g. `Net_Tuition ~ Enrollment + Type`). | Confirms the model you actually fit: response on the left, predictors on the right. | Quick specification check |
| Coefficients block | Estimates and tests for each parameter in $\hat{Y}=b_0+b_1X_1+\dots+b_kX_k$. | | |
| Estimate | $\hat{\beta}_i$: best-fit coefficient (intercept or slope). | Slope: expected change in Y for a 1-unit rise in that predictor, holding others constant. Intercept: predicted Y when all X = 0 (if X = 0 is meaningful). | Direction & magnitude of each relationship |
| Std. Error | $SE(\hat{\beta}_i)$: estimated SD of the sampling distribution of $\hat{\beta}_i$. | Smaller SE ⇒ more precise estimate. | Precision of estimate |
| t value | $t=\dfrac{\text{Estimate}}{\text{Std. Error}}$. | Large $\lvert t \rvert$ ⇒ estimate is many SEs from 0 ⇒ evidence that the true $\beta_i \neq 0$. | Test statistic for $H_0:\beta_i=0$ |
| `Pr(>\|t\|)` | Two-sided p-value for that t-test. | If p < α (e.g. 0.05), reject $H_0$: the predictor is statistically significant. | |
| Residual standard error | $\hat{\sigma}$: SD of residuals (units of Y). | Typical prediction error; lower ⇒ tighter fit. | Absolute measure of model accuracy |
| df (Residuals) | $n-k-1$: observations minus parameters. | Used in t- and F-tests. | Calibration of p-values |
| Multiple R-squared | $R^2$: proportion of variance in Y explained by the predictors. | Ranges 0–1; higher ⇒ stronger explanatory power. | Strength of fit |
| Adjusted R-squared | $R^2$ adjusted for number of predictors. | Penalises unnecessary predictors; use to compare models of different sizes. | Strength of fit (penalised) |
| F-statistic | Tests $H_0:\beta_1=\dots=\beta_k=0$ (no slopes). | Large F ⇒ at least one slope ≠ 0. | Overall model significance |
| p-value (for F) | Probability of that F (or larger) under $H_0$. | p < α ⇒ model is statistically useful overall. | Global test of usefulness |
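
To show where these summary quantities come from, the sketch below computes them by hand for a one-predictor model (k = 1). It is written in Python rather than R and does not call `lm()`; the five data points are made up, and the p-values are left out because they need t and F distribution functions that the Python standard library does not provide.

```python
from math import sqrt

# Hypothetical data: one predictor x and one response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(x), 1  # n observations, k predictors

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Estimate column: least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error on n - k - 1 degrees of freedom.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(r**2 for r in residuals)  # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)  # total variation
df_resid = n - k - 1
resid_se = sqrt(sse / df_resid)

# Std. Error and t value for the slope (tests H0: beta_1 = 0).
se_b1 = resid_se / sqrt(s_xx)
t_b1 = b1 / se_b1

# Fit measures and the overall F-statistic.
r2 = 1 - sse / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
f_stat = ((sst - sse) / k) / (sse / df_resid)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"residual SE = {resid_se:.4f} on {df_resid} df")
print(f"SE(b1) = {se_b1:.4f}, t = {t_b1:.2f}")
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}, F = {f_stat:.1f}")
```

With a single predictor the F-statistic equals the square of the slope's t value, so the overall test and the slope test agree; with several predictors they answer different questions.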