Last modified: Jan 24, 2025 By Alexander Williams
Python Statsmodels het_white() Test Guide
What Is Heteroscedasticity?
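Heteroscedasticity occurs when the errors of a regression model have unequal variance across observations. It violates the constant-variance assumption of OLS: the coefficient estimates remain unbiased, but the usual standard errors, and the t- and F-tests built on them, become unreliable. Use het_white() to detect it.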
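To make the definition concrete, here is a minimal sketch that simulates two sets of errors, one with constant variance and one whose spread grows with a predictor. The variable names, the seed, and the split point at 5.5 are illustrative assumptions, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 2000
x = rng.uniform(1.0, 10.0, size=n)

# Homoscedastic errors: the same standard deviation for every observation.
e_const = rng.normal(0.0, 2.0, size=n)
# Heteroscedastic errors: the standard deviation is proportional to x.
e_grow = rng.normal(0.0, 1.0, size=n) * 0.5 * x

# Compare the error variance for large x against the variance for small x.
low, high = x < 5.5, x >= 5.5
ratio_const = e_const[high].var() / e_const[low].var()
ratio_grow = e_grow[high].var() / e_grow[low].var()

print(f"variance ratio, constant-variance errors: {ratio_const:.2f}")
print(f"variance ratio, x-dependent errors:       {ratio_grow:.2f}")
```

A ratio near 1 is what OLS assumes. A ratio well above 1 means the errors fan out as x grows, which is the pattern White's test is designed to pick up.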
Understanding het_white()
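The het_white() function performs White's test for heteroscedasticity. It regresses the squared residuals on the predictors, their squares, and their pairwise interactions, then tests whether that auxiliary regression has any explanatory power. The test helps validate regression assumptions.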
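The auxiliary regression described above can be sketched by hand with NumPy and SciPy. This is an illustration of the LM form of the test under stated assumptions, not the statsmodels source: the helper name white_lm and the synthetic data are hypothetical, and exog is assumed to contain a constant column.

```python
import numpy as np
from scipy import stats


def white_lm(resid, exog):
    """LM form of White's test; exog must include a constant column."""
    resid = np.asarray(resid, dtype=float)
    exog = np.asarray(exog, dtype=float)
    n, k = exog.shape
    # All products of column pairs (i <= j): with a constant in exog this
    # yields the constant, the levels, the squares, and the cross-products.
    i, j = np.triu_indices(k)
    aux = exog[:, i] * exog[:, j]
    u2 = resid ** 2
    # Auxiliary regression of the squared residuals on those terms.
    coef, *_ = np.linalg.lstsq(aux, u2, rcond=None)
    fitted = aux @ coef
    r2 = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
    lm = n * r2  # LM statistic: n times R-squared
    df = np.linalg.matrix_rank(aux) - 1  # auxiliary regressors minus constant
    return lm, stats.chi2.sf(lm, df)


# Synthetic data (an assumption): error spread grows with x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(1, 10, n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1 + 2 * x1 - x2 + rng.normal(size=n) * x1

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # plain OLS fit
resid = y - X @ beta
lm_stat, lm_pvalue = white_lm(resid, X)
print(f"LM statistic: {lm_stat:.2f}, p-value: {lm_pvalue:.4g}")
```

Under the null hypothesis of constant variance, the LM statistic is approximately chi-squared distributed, so a small p-value is evidence against homoscedasticity.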
How to Use het_white()
First fit a regression model with OLS and fit(). Pass the residuals and the design matrix (including the constant) from the fitted results to het_white(). The function returns two test statistics and their p-values.
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white
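# `data` is assumed to be a pandas DataFrame that already holds these columns
# Prepare data with add_constant() so the design matrix has an intercept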
X = sm.add_constant(data[['predictor1', 'predictor2']])
y = data['target']
# Fit model
model = sm.OLS(y, X).fit()
# Perform White's test
white_test = het_white(model.resid, model.model.exog)
Interpreting Results
The test returns four values: the LM statistic, the LM-test p-value, the F-statistic, and the F-test p-value. A p-value below 0.05 suggests heteroscedasticity. Check the model summary() for other diagnostics.
White's Test Results:
LM Statistic: 15.23
LM-Test p-value: 0.0043
F-Statistic: 3.85
F-Test p-value: 0.0072
Practical Example
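This step completes the workflow. With the model fitted on a design matrix from add_constant() and the residuals tested by het_white(), compare the LM-test p-value (index 1 of the result) with the 0.05 threshold. Combine with plot_regress_exog() for visual diagnostics.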
# Interpret the LM-test p-value (index 1 of the het_white() result)
print(f"Heteroscedasticity detected: {white_test[1] < 0.05}")
# Output: Heteroscedasticity detected: True
Handling Heteroscedasticity
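If heteroscedasticity is detected, consider heteroscedasticity-robust standard errors or a variance-stabilizing transformation of the data, such as taking the log of the target. Refit with fit(cov_type='HC3') (or another HC estimator) so that the reported standard errors, and tests such as wald_test(), give reliable inference.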
Conclusion
het_white() is essential for validating regression assumptions. Combine it with other diagnostics like summary() and residual plots. Always check test results before finalizing models.