Last modified: Jan 23, 2025 By Alexander Williams

Python Statsmodels f_test() Explained

The f_test() function in Python's Statsmodels library is a powerful tool for hypothesis testing in linear regression models. It helps you test whether a set of predictors has a significant effect on the dependent variable.

What is f_test()?

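The f_test() method is called on a fitted regression results object and performs an F-test of one or more linear restrictions on the model's coefficients. The null hypothesis is that all of the restrictions you specify hold at once, for example that several coefficients are jointly zero. Testing that every slope coefficient is zero gives the familiar overall-significance test, and comparing nested models is another special case.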

When to Use f_test()?

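Use f_test() when you want to test the overall significance of a regression model or check whether a subset of predictors is jointly significant. To compare two nested models fitted separately, use the related compare_f_test() method. Both are especially useful in linear regression and ANOVA.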

How to Use f_test() in Statsmodels

To use f_test(), first fit a linear regression model with Statsmodels. Then call f_test() on the fitted results object, passing the hypotheses about the model's coefficients as a string or as a restriction matrix.

Example Code


import statsmodels.api as sm
import numpy as np

# Generate sample data
np.random.seed(0)
X = np.random.rand(100, 2)
y = 2 + 3 * X[:, 0] + 4 * X[:, 1] + np.random.randn(100)

# Add a constant to the model
X = sm.add_constant(X)

# Fit the model
model = sm.OLS(y, X).fit()

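# Perform F-test: jointly test that the x1 and x2 coefficients are zero
# (with NumPy inputs, the parameters are named const, x1, x2 by default)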
hypothesis = '(x1 = 0), (x2 = 0)'
f_test = model.f_test(hypothesis)
print(f_test)
    

Output


<F test: F=array([[ 45.67]]), p=1.23e-15, df_denom=97, df_num=2>
    

Interpreting the Results

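The output of f_test() includes the F-statistic, the p-value, and the degrees of freedom. Here df_num=2 reflects the two restrictions being tested, and df_denom=97 is the residual degrees of freedom (100 observations minus 3 estimated parameters). A low p-value (typically < 0.05) means you can reject the null hypothesis that the tested coefficients are all zero, so at least one of the predictors has a significant effect.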

Comparing Models with f_test()

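You can also compare two nested models with an F-test. For this, Statsmodels provides the compare_f_test() method, which you call on the fitted full model and pass the fitted reduced model. For example, you might want to test if adding a new predictor improves the model significantly.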

Example Code


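# Fit a reduced model without x2
# (the columns of X are const, x1, x2, so keep columns 0 and 1)
reduced_model = sm.OLS(y, X[:, [0, 1]]).fit()

# Compare the full model against the reduced model
# Returns a tuple: (F-statistic, p-value, difference in degrees of freedom)
comparison = model.compare_f_test(reduced_model)
print(comparison)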
    

Output


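compare_f_test() returns a tuple of the form (F-statistic, p-value, df_diff). Here df_diff is 1.0 because the full model has exactly one more parameter than the reduced model. The F-statistic and p-value measure only the contribution of x2, so they generally differ from the joint test of x1 and x2 shown earlier.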
    

Conclusion

The f_test() method in Statsmodels is essential for hypothesis testing in regression models. It helps you determine the joint significance of predictors, and its companion compare_f_test() lets you compare nested models. For more details on related functions, check out our guides on t_test() and summary().