Hypothesis Testing Key Concepts
Purpose and Rationale
Why These Concepts Matter
The key concepts in hypothesis testing serve several essential purposes:
- Understanding Test Mechanics
  - Test statistics quantify the evidence against the null hypothesis
  - Null distributions provide the theoretical framework for decision-making
  - P-values measure the strength of evidence in a standardized way
- Making Valid Inferences
  - Proper test selection ensures appropriate analysis
  - Understanding conditions ensures valid results
  - Error control helps manage decision risks
- Interpreting Results Correctly
  - Clear framework for decision-making
  - Standardized way to communicate findings
  - Basis for comparing different studies
The Rationale Behind Key Concepts
- Test Statistics
  - Why we need them:
    - Standardize different types of evidence
    - Account for sample size and variability
    - Provide a common scale for comparison
    - Quantify deviation from null hypothesis
    - Simplify evaluation without full simulation
  - How they work:
    - Measure distance from null hypothesis
    - Account for sampling variability
    - Follow known probability distributions
    - Convert raw differences to standardized units
    - Enable probability calculations
- Null Distributions
  - Why they're important:
    - Provide the theoretical basis for p-values
    - Help understand what to expect by chance
    - Enable calculation of probabilities
    - Define what constitutes "extreme" results
    - Allow for standardized decision making
  - How they're used:
    - Define what "extreme" means
    - Determine critical values
    - Calculate p-values
    - Guide test statistic interpretation
    - Support decision making
- Error Control
  - Why it matters:
    - Helps manage decision risks
    - Provides framework for sample size planning
    - Enables comparison of different studies
    - Balances Type I and Type II errors
    - Guides practical decision making
  - How it works:
    - Balances Type I and Type II errors
    - Considers practical consequences
    - Guides decision thresholds
    - Helps determine sample sizes
    - Supports risk management
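The point that a test statistic lets us evaluate evidence "without full simulation" can be illustrated with a short sketch. It assumes a hypothetical study with 60 successes in 100 trials and H₀: p = 0.5 (these numbers are not from the notes), and compares a simulated null distribution with the normal approximation. The two p-values should land in the same neighbourhood, although they need not agree exactly because the normal curve only approximates the discrete null distribution.

```python
import math
import random
from statistics import NormalDist

def z_statistic(successes, n, p0):
    """Standardized distance of the sample proportion from p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

def simulated_p_value(successes, n, p0, reps=10_000, seed=1):
    """Share of samples simulated under H0 with at least `successes` successes."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        count = sum(rng.random() < p0 for _ in range(n))
        if count >= successes:
            extreme += 1
    return extreme / reps

# Hypothetical data: 60 successes in 100 trials, Ha: p > 0.5
z = z_statistic(60, 100, 0.5)
approx_p = 1 - NormalDist().cdf(z)        # upper tail of the standard normal
sim_p = simulated_p_value(60, 100, 0.5)   # upper tail of the simulated null
print(f"z = {z:.2f}, normal p = {approx_p:.4f}, simulated p = {sim_p:.4f}")
```

The simulation needs a million random draws to answer one question. The test statistic gets a comparable answer from a single formula and a known distribution.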
Test Selection Guide
| Question Type | Test to Use | Key Conditions |
|---|---|---|
| Single Mean | t-test or z-test | Normal data or large sample |
| Difference in Means | Two-sample t-test | Independent samples |
| Single Proportion | z-test | $np_0$ and $n(1-p_0)$ both sufficiently large |
| Difference in Proportions | Two-proportion z-test | Independent samples |
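As one worked instance of the table, here is a minimal sketch of a two-proportion z-test. It assumes independent samples, uses the pooled proportion for the standard error (a common textbook choice, not something these notes specify), and runs on hypothetical counts.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 45 of 100 successes in group 1, 30 of 100 in group 2
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```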
Test Statistics and Evidence
Purpose of Test Statistics
| Aspect | Description | Importance |
|---|---|---|
| Standardization | Converts raw differences to common scale | Enables comparison across studies |
| Evidence Quantification | Measures strength of evidence against H₀ | Provides objective basis for decisions |
| Variability Accounting | Incorporates sample size and spread | Ensures valid inference |
| Distribution Basis | Links to known probability distributions | Enables p-value calculation |
Understanding P-values
| Concept | Description | Key Points |
|---|---|---|
| Definition | Probability, assuming H₀ is true, of results at least as extreme as those observed | Not the probability that H₀ is true |
| Interpretation | Strength of evidence against H₀ | Smaller p-value = stronger evidence |
| Calculation | Based on test statistic and null distribution | Depends on Hₐ direction |
| Usage | Two approaches | 1. Strength of evidence; 2. Decision making |
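To make "depends on Hₐ direction" concrete, the sketch below computes a p-value from a z-statistic for each direction of the alternative. The function name and the `alternative` labels are illustrative choices, not notation from these notes.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """P-value for a z-statistic under a standard normal null distribution."""
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: parameter > null value
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: parameter < null value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

for direction in ("greater", "less", "two-sided"):
    print(direction, round(p_value_from_z(2.0, direction), 4))
```

The same statistic gives very different p-values depending on the direction, which is why Hₐ should be fixed before looking at the data.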
P-value Interpretation Framework
| Approach | Method | When to Use |
|---|---|---|
| Strength of Evidence | Direct p-value interpretation | Research reporting |
| Decision Making | Compare to α level | Practical applications |
P-value Guidelines
| P-value Range | Traditional Interpretation | Better Practice |
|---|---|---|
|  | Strong evidence | Report exact p-value |
|  | Moderate evidence | Consider practical significance |
|  | Weak evidence | Discuss uncertainty |
|  | No evidence | Note limitations |
Common Misconceptions About P-values
| Misconception | Reality | Explanation |
|---|---|---|
| P-value = probability H₀ is true | False | P-value is calculated assuming H₀ is true |
| P-value = probability the result is due to random chance | False | P-value is conditional on H₀ |
| Small p-value proves Hₐ | False | Only provides evidence against H₀ |
| Large p-value proves H₀ | False | Only indicates insufficient evidence |
Anatomy of Test Statistics
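| Type | Formula | When to Use | Null Distribution |
|---|---|---|---|
| Z-statistic | $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ | Large samples, known σ | Standard normal, $N(0, 1)$ |
| T-statistic | $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ | Small samples or unknown σ | t-distribution with $n - 1$ df |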
Components of Test Statistics
| Component | Description | Importance |
|---|---|---|
| Numerator | Distance between observed and null | Measures effect size |
| Denominator | Standard error | Measures precision |
| Absolute Value | Distance from zero | Strength of evidence |
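The three components can be seen separately in a small sketch of a one-sample t-statistic. The measurements and the null value of 5.0 are hypothetical.

```python
import math
from statistics import mean, stdev

def t_components(sample, mu0):
    """Return the numerator, denominator, and ratio of a one-sample t-statistic."""
    n = len(sample)
    numerator = mean(sample) - mu0                # distance between observed and null
    denominator = stdev(sample) / math.sqrt(n)    # standard error of the mean
    return numerator, denominator, numerator / denominator

# Hypothetical measurements, tested against H0: mu = 5.0
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.7]
numerator, standard_error, t = t_components(sample, mu0=5.0)
print(f"numerator = {numerator:.3f}, standard error = {standard_error:.3f}, "
      f"t = {t:.2f}, df = {len(sample) - 1}")
```

The same numerator with a larger sample or less spread gives a smaller standard error, so |t| grows and the evidence against H₀ strengthens.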
Null Distribution Properties
| Distribution | Used For | Key Features |
|---|---|---|
| Normal Distribution | Large samples, proportions | Symmetric, uses z-scores |
| t-distribution | Small samples, means | Heavier tails, uses df |
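"Heavier tails" can be checked by simulation. The sketch below draws samples from a standard normal population (so H₀: μ = 0 is true by construction), computes a t-statistic for each, and counts how often it falls beyond the normal 5% two-sided cutoff. The sample sizes, repetition count, and seed are arbitrary illustrative choices.

```python
import math
import random
from statistics import NormalDist

def t_tail_fraction(n, cutoff, reps=20_000, seed=7):
    """Fraction of t-statistics, simulated under H0, with |t| above `cutoff`."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        sample_mean = sum(sample) / n
        variance = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)
        t = sample_mean / math.sqrt(variance / n)
        if abs(t) > cutoff:
            exceed += 1
    return exceed / reps

cutoff = NormalDist().inv_cdf(0.975)    # about 1.96
small_n = t_tail_fraction(5, cutoff)    # df = 4
large_n = t_tail_fraction(50, cutoff)   # df = 49
print(f"n = 5: {small_n:.3f}   n = 50: {large_n:.3f}   normal tails: 0.050")
```

With n = 5 the statistic lands beyond the normal cutoff far more often than 5% of the time, which is why small samples need t critical values rather than z critical values.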
Decision Making Framework
Types of Errors
| Decision vs Reality | H₀ True | H₀ False |
|---|---|---|
| Reject H₀ | Type I Error (α) | Correct Decision |
| Fail to reject H₀ | Correct Decision | Type II Error (β) |
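Both error rates can be estimated by simulation, as a rough sketch. It assumes a two-sided z-test with known σ = 1, n = 20, α = 0.05, and a hypothetical true mean of 0.5 for the "H₀ false" case. None of these values come from the notes.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=20, sigma=1.0, mu0=0.0, alpha=0.05,
                   reps=5_000, seed=3):
    """Fraction of simulated two-sided z-tests that reject H0: mu = mu0."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / math.sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / reps

type_i_rate = rejection_rate(true_mean=0.0)        # H0 true: any rejection is a Type I error
type_ii_rate = 1 - rejection_rate(true_mean=0.5)   # H0 false: failing to reject is a Type II error
print(f"Type I rate = {type_i_rate:.3f}, Type II rate = {type_ii_rate:.3f}")
```

The Type I rate should sit near the chosen α. The Type II rate depends on how far the true mean is from the null value and on the sample size.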
Error Control Parameters
| Parameter | Symbol | Typical Values | Meaning |
|---|---|---|---|
| Significance Level | α | 0.05, 0.01 | Type I error rate |
| Power | 1 − β | 0.80, 0.90 | Correct rejection rate |
| Sample Size | n | Varies | Affects both errors |
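The link between α, power, and sample size can be sketched for the simplest case: a one-sided z-test with known σ. The formulas used here are the standard normal-theory ones, and the effect size `delta` and the other inputs are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_one_sided_z(delta, sigma, n, alpha=0.05):
    """Power of a one-sided z-test when the true mean exceeds the null by delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_alpha - delta * math.sqrt(n) / sigma)

def sample_size_one_sided_z(delta, sigma, alpha=0.05, power=0.80):
    """Smallest n that reaches the target power for a one-sided z-test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning inputs: detect a shift of 0.5 when sigma = 1
n_needed = sample_size_one_sided_z(delta=0.5, sigma=1.0)
achieved = power_one_sided_z(delta=0.5, sigma=1.0, n=n_needed)
print(f"n = {n_needed}, achieved power = {achieved:.3f}")
```

Lowering α or raising the target power both push the required n upward, which is the trade-off the table summarizes.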
Best Practices
Reporting Checklist
| Component | What to Include | Why Important |
|---|---|---|
| Hypotheses | Clear H₀ and Hₐ | Defines research question |
| Conditions | All assumptions checked | Validates test choice |
| Test Statistic | Formula and calculation | Shows process |
| P-value | Exact value | Indicates evidence strength |
| Effect Size | Practical difference | Shows practical significance |
| Confidence Interval | Range estimate | Shows precision |
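One way to keep every checklist item in a single place is to compute them together. The sketch below does this for a single-proportion z-test on hypothetical data. The success-failure threshold of 10 and the Wald form of the confidence interval are assumptions made for illustration, not requirements stated in these notes.

```python
import math
from statistics import NormalDist

def proportion_report(successes, n, p0, confidence=0.95):
    """Collect the checklist items for a two-sided single-proportion z-test."""
    std_normal = NormalDist()
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    z_star = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)   # Wald interval (assumed form)
    return {
        "hypotheses": f"H0: p = {p0} vs Ha: p != {p0}",
        # Threshold of 10 expected successes and failures is an assumed rule of thumb
        "conditions_met": min(n * p0, n * (1 - p0)) >= 10,
        "z": z,
        "p_value": 2 * (1 - std_normal.cdf(abs(z))),
        "effect_size": p_hat - p0,
        "confidence_interval": (p_hat - margin, p_hat + margin),
    }

# Hypothetical data: 60 successes in 100 trials, null value 0.5
for item, value in proportion_report(60, 100, 0.5).items():
    print(f"{item}: {value}")
```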
Common Pitfalls to Avoid
| Pitfall | Consequence | Prevention |
|---|---|---|
| Multiple Testing | Increased Type I error | Adjust α level |
| P-hacking | Invalid conclusions | Pre-specify analyses |
| Binary Decisions | Loss of information | Report effect sizes |
| Ignoring Assumptions | Invalid results | Check conditions |
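The multiple-testing row can be illustrated with a short calculation. The sketch assumes the tests are independent and uses the Bonferroni correction as one common way to adjust α. The notes do not name a specific method, so treat that choice as an assumption.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one Type I error across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m

m = 20  # hypothetical number of tests
unadjusted = familywise_error_rate(0.05, m)
adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
print(f"unadjusted = {unadjusted:.3f}, Bonferroni-adjusted = {adjusted:.3f}")
```

Running 20 tests at α = 0.05 each makes at least one false positive more likely than not, while the adjusted per-test level keeps the overall rate just under 0.05.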