Chi-Square Tests

Purpose and Rationale

Chi-square tests analyze categorical variables by comparing observed frequency counts with the counts expected under a null hypothesis. The comparison shows whether the discrepancies between observed and expected counts are larger than random sampling variation alone would plausibly produce.

Core Applications

Goodness-of-Fit Test
- Tests single categorical variable
- Compares observed distribution to theoretical model
- Evaluates if sample data matches expected proportions
Test of Independence
- Tests association between two categorical variables
- Evaluates if variables are independent or dependent
- Analyzes patterns in contingency tables

Test Statistics and Calculations

Fundamental Formula

The core calculation for both tests:
$χ^{2} = \sum_{i = 1}^{k} \frac{(O_{i} - E_{i})^{2}}{E_{i}}$

Where:

$k$ = number of categories/cells
$O_{i}$ = observed count
$E_{i}$ = expected count under $H_{0}$
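
As a minimal sketch of the formula above, the statistic can be computed directly from two equal-length lists of counts. The function name `chi_square_statistic` and the sample counts are illustrative assumptions, not part of these notes.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for five categories, 100 observations in total.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
statistic = chi_square_statistic(observed, expected)
print(f"chi-square statistic: {statistic:.3f}")
```

For a contingency table, the same function applies once the cells of the observed and expected tables are flattened into lists in the same order.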

Expected Counts Calculation

Goodness-of-Fit Test
- $E_{i} = N \times p_{i}^{(0)}$
- Where $N$ = total sample size
- $p_{i}^{(0)}$ = proportion specified in $H_{0}$
Test of Independence
- $E_{ij} = \frac{\text{Row Total}_{i} \times \text{Column Total}_{j}}{\text{Grand Total}}$
- For each cell in contingency table
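
The two expected-count rules above might be sketched as follows. The helper names (`expected_goodness_of_fit`, `expected_independence`) and the example data are hypothetical and chosen only for illustration.

```python
def expected_goodness_of_fit(n, null_proportions):
    """E_i = N * p_i^(0) for each category."""
    if abs(sum(null_proportions) - 1.0) > 1e-9:
        raise ValueError("null proportions must sum to 1")
    return [n * p for p in null_proportions]


def expected_independence(table):
    """E_ij = (row total i * column total j) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical data: 100 observations, five equally likely categories.
gof_expected = expected_goodness_of_fit(100, [0.2] * 5)

# Hypothetical 2x2 contingency table (rows: groups, columns: outcomes).
table = [[30, 10],
         [20, 40]]
ind_expected = expected_independence(table)
print(gof_expected)
print(ind_expected)
```

In the 2x2 example the row totals are 40 and 60 and both column totals are 50, so the expected counts are 20 in each cell of the first row and 30 in each cell of the second.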

Theoretical Foundation

Chi-Square Distribution

Arises as the sum of squared independent standard normal variables
If $Z \sim N(0, 1)$, then $Z^{2} \sim χ_{1}^{2}$; the sum of $k$ independent such squares follows $χ_{k}^{2}$
Shape determined by degrees of freedom:
- Goodness-of-Fit: $df = k - 1$
- Test of Independence: $df = (r - 1)(c - 1)$
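
The degrees of freedom determine which chi-square distribution the statistic is compared against, and the p-value is the upper-tail probability beyond the observed statistic. The sketch below uses only the standard library and relies on a known recurrence for the regularized upper incomplete gamma function, which is valid for positive integer degrees of freedom. All function names are assumptions made for illustration; in practice a statistics library (for example `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def df_goodness_of_fit(k):
    """Degrees of freedom for k categories: k - 1."""
    return k - 1


def df_independence(r, c):
    """Degrees of freedom for an r x c table: (r - 1)(c - 1)."""
    return (r - 1) * (c - 1)


def chi2_upper_tail(x, df):
    """P(X > x) for X ~ chi-square with a positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-z)              # Q(1, z) = e^(-z)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(z))   # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z^a * e^(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical statistic from a 3 x 3 contingency table.
df = df_independence(3, 3)
p_value = chi2_upper_tail(9.488, df)
print(f"df = {df}, p-value = {p_value:.4f}")
```

The value 9.488 is the familiar 5% critical value for 4 degrees of freedom, so the printed p-value should be very close to 0.05.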

Hypothesis Testing

Goodness-of-Fit
- $H_{0}$ : Population proportions equal specified values
- $H_{a}$ : At least one proportion differs
- Example: $H_{0} : p_{1} = p_{2} = p_{3} = p_{4} = p_{5} = 0.2$
Test of Independence
- $H_{0}$ : Variables are independent
- $H_{a}$ : Variables are dependent
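
To make the goodness-of-fit hypotheses concrete, here is a sketch of a complete test of $H_{0} : p_{1} = \dots = p_{5} = 0.2$ at the 5% level. The observed counts are invented for illustration; the critical value 9.488 is the standard upper 5% point of the chi-square distribution with 4 degrees of freedom.

```python
ALPHA = 0.05
CRITICAL_VALUE_DF4 = 9.488  # upper 5% point of chi-square with 4 df

# Hypothetical observed counts for five categories (150 observations).
observed = [30, 14, 34, 45, 27]
null_proportions = [0.2] * 5          # H0: all five proportions equal 0.2

n = sum(observed)
expected = [n * p for p in null_proportions]   # 30 per category under H0
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Reject H0 when the statistic exceeds the critical value for this df.
reject_null = statistic > CRITICAL_VALUE_DF4
print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_null}")
```

With these made-up counts the statistic is 506/30, roughly 16.87, which exceeds 9.488, so the sketch rejects $H_{0}$ and concludes that at least one proportion differs from 0.2.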

Applications and Best Practices

Common Uses

Survey analysis and experimental design
Quality control and process monitoring
Pattern analysis in categorical data
Distribution testing and model validation

Key Considerations

Before Analysis
- Verify categorical nature of variables
- Check expected count requirements (a common rule of thumb is an expected count of at least 5 in every cell)
- Plan adequate sample size
During Analysis
- Calculate expected counts
- Verify all conditions met
- Compute test statistic
After Analysis
- Interpret p-value in context
- Consider practical significance
- Report findings with effect size
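
The checklist above could be sketched for a test of independence as follows. These notes do not say which effect size to report; Cramér's V, $V = \sqrt{χ^{2} / (N \cdot \min(r - 1, c - 1))}$, is used here as one common choice. The threshold of 5, the function names, and the table are assumptions for illustration.

```python
import math


def expected_counts(table):
    """Expected counts under independence for a table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_and_cramers_v(table):
    """Return (chi-square statistic, Cramer's V) for a contingency table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    n = sum(sum(row) for row in table)
    min_dim = min(len(table), len(table[0])) - 1
    return statistic, math.sqrt(statistic / (n * min_dim))


# Hypothetical 2x2 table of counts.
table = [[30, 10],
         [20, 40]]

# Condition check (assumed rule of thumb: every expected count >= 5).
conditions_ok = min(e for row in expected_counts(table) for e in row) >= 5

statistic, v = chi_square_and_cramers_v(table)
print(f"conditions ok: {conditions_ok}, chi-square = {statistic:.3f}, V = {v:.3f}")
```

Reporting V alongside the p-value separates the strength of the association from the evidence that it exists, since a large sample can make a weak association statistically significant.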

Common Pitfalls

Issue	Problem	Solution
Small expected counts	Chi-square approximation unreliable	Combine categories
Ignoring assumptions	Unreliable results	Check all conditions
Overinterpreting significance	False conclusions	Consider context and effect size
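
As a rough sketch of the "combine categories" remedy, the helper below greedily merges neighbouring categories until every merged group has an expected count of at least 5. This is only one possible strategy: it assumes the categories have a meaningful order, the threshold of 5 is a rule of thumb, and the function name and data are hypothetical. In real analyses the grouping should also make sense on subject-matter grounds.

```python
def merge_sparse_categories(observed, expected, min_expected=5.0):
    """Merge adjacent categories until each expected count >= min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    # Any leftover sparse tail is folded into the last merged group.
    if acc_exp > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


# Hypothetical ordered categories; the last two have expected counts below 5.
observed = [12, 9, 5, 3, 1]
expected = [10.0, 10.0, 6.0, 3.0, 1.0]
new_obs, new_exp = merge_sparse_categories(observed, expected)
new_df = len(new_obs) - 1   # goodness-of-fit df after merging
print(new_obs, new_exp, new_df)
```

Merging reduces the number of categories, so the degrees of freedom must be recomputed from the merged table before the test is run.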

Related Topics

Hypothesis Testing Basics - Foundation for understanding tests
Types of Hypothesis Tests - Choosing appropriate test
Statistical Significance - Interpreting results
P-value - Understanding test outcomes
Contingency Tables - Analyzing categorical data
Degrees of Freedom - Understanding test parameters
Effect Size - Measuring practical importance
Sample Size - Planning adequate samples
Multiple Comparisons - Handling multiple tests
ANOVA - Alternative for comparing multiple groups on a numeric outcome