Last modified: Jan 24, 2025 By Alexander Williams

Python Statsmodels het_white() Test Guide

What Is Heteroscedasticity?

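Heteroscedasticity occurs when the errors of a regression model have unequal variance across observations. It violates the constant-variance assumption of OLS: the coefficient estimates remain unbiased, but the usual standard errors, t-tests, and F-tests become unreliable. Use het_white() to detect it.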

Understanding het_white()

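The het_white() function performs White's test for heteroscedasticity. It regresses the squared residuals on the predictors, their squares, and their pairwise cross-products. The null hypothesis is that the error variance is constant (homoscedasticity), so the test helps validate a key regression assumption.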

How to Use het_white()

First fit a regression model with OLS and fit(). Then pass the residuals and the design matrix (which must include a constant) to het_white(). The function returns two test statistics and their p-values.

 
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

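# `data` is assumed to be a pandas DataFrame with these columns
# Prepare data with add_constant() so the design matrix has an intercept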
X = sm.add_constant(data[['predictor1', 'predictor2']])
y = data['target']

# Fit model
model = sm.OLS(y, X).fit()

# Perform White's test
white_test = het_white(model.resid, model.model.exog)

Interpreting Results

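het_white() returns a tuple of four values: the LM statistic, its p-value, the F-statistic, and its p-value. A p-value below 0.05 rejects the null hypothesis of constant variance and suggests heteroscedasticity. Check the model summary() for other diagnostics.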


White's Test Results:
LM Statistic: 15.23
LM-Test p-value: 0.0043
F-Statistic: 3.85
F-Test p-value: 0.0072

Practical Example

This snippet finishes the workflow by turning the LM-test p-value in white_test into a yes/no decision. It reuses the model fitted earlier, where add_constant() supplied the intercept. Combine it with plot_regress_exog() for visual diagnostics.

 
# Interpret the LM-test p-value (index 1 of the returned tuple)
print(f"Heteroscedasticity detected: {white_test[1] < 0.05}")
# Output: Heteroscedasticity detected: True

Handling Heteroscedasticity

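If the test detects heteroscedasticity, consider heteroscedasticity-robust standard errors or a variance-stabilizing transformation such as taking the log of the response. Fitting with an HC covariance estimator (for example cov_type='HC3') makes inference such as wald_test() rely on robust standard errors.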

Conclusion

het_white() is a quick check of the constant-variance assumption behind OLS inference. Combine it with other diagnostics such as summary() and residual plots, and review the test results before finalizing a model.