Last modified: Jan 26, 2025 By Alexander Williams

Python Statsmodels adfuller() Guide

The Augmented Dickey-Fuller (ADF) test is a popular statistical test used to determine if a time series is stationary. In Python, the adfuller() function from the Statsmodels library makes it easy to perform this test.

Stationarity is a crucial concept in time series analysis. A stationary time series has properties that do not depend on the time at which the series is observed. This makes it easier to model and predict.

What is the ADF Test?

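The ADF test checks the null hypothesis that a unit root is present in a time series sample. The alternative hypothesis is that the series is stationary (or trend-stationary, when a time trend is included in the test regression). If the null hypothesis is rejected, the series is considered stationary.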

The test involves estimating the following regression equation:


Δy_t = α + βt + γy_{t-1} + δ1Δy_{t-1} + ... + δpΔy_{t-p} + ε_t

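Here, Δy_t = y_t - y_{t-1} is the first difference of the series at time t, α is a constant, β is the coefficient on a time trend, γ is the coefficient on the lagged level of the series, δ1 through δp are coefficients on p lagged differences that absorb serial correlation, and ε_t is the error term. The null hypothesis of a unit root corresponds to γ = 0, and the ADF statistic is the t-statistic on γ. By default, adfuller() includes only the constant (regression='c'); pass regression='ct' to include the time trend as well.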

How to Use adfuller() in Python

To use the adfuller() function, you first need to import it from the Statsmodels library. Here’s a simple example:


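import numpy as np
from statsmodels.tsa.stattools import adfuller

# Example time series data: a simulated random walk.
# (A short, perfectly linear list such as 1..10 has constant differences,
# which makes the test regression degenerate.)
rng = np.random.default_rng(0)
data = rng.normal(size=200).cumsum()

# Perform ADF test
result = adfuller(data)

# Print the results: result[0] is the statistic, result[1] the p-value,
# and result[4] a dict of critical values keyed by significance level
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')
for key, value in result[4].items():
    print('\t%s: %.3f' % (key, value))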

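The output includes the ADF statistic, the p-value, and the critical values at the 1%, 5%, and 10% significance levels. The numbers below are placeholders that show the format only; actual values depend on your data and sample size.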


ADF Statistic: 0.123456
p-value: 0.987654
Critical Values:
    1%: -3.456
    5%: -2.876
    10%: -2.567

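If the p-value is less than the significance level (commonly 0.05), you can reject the null hypothesis of a unit root and conclude that the series is stationary. If it is larger, you fail to reject the null hypothesis, which means the test found no evidence against a unit root.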

Interpreting the Results

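The ADF statistic is compared to critical values. If the statistic is less than (more negative than) the critical value at a given significance level, the null hypothesis is rejected at that level.

The p-value is the probability of observing a test statistic at least as extreme as the one computed, assuming the null hypothesis is true. A low p-value is evidence against a unit root and suggests that the series is stationary.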

Critical values are thresholds for the ADF statistic at different significance levels (1%, 5%, and 10%). They depend on the sample size and on whether the regression includes a constant or a trend.

Practical Example

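Let’s apply the ADF test to a real-world dataset. We’ll use the classic AirPassengers dataset of monthly airline passenger totals, which Statsmodels can fetch from the Rdatasets collection with get_rdataset(). This call downloads the data, so it needs an internet connection.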


import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Load dataset (downloaded from the Rdatasets collection)
data = sm.datasets.get_rdataset("AirPassengers", "datasets").data['value']

# Perform ADF test
result = adfuller(data)

# Print the results
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')
for key, value in result[4].items():
    print('\t%s: %.3f' % (key, value))

Running this should print output close to the following, which indicates whether the series is stationary:


ADF Statistic: 0.815381
p-value: 0.991880
Critical Values:
    1%: -3.481
    5%: -2.884
    10%: -2.579

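In this case, the p-value is far greater than 0.05, and the ADF statistic is above all three critical values, so we fail to reject the null hypothesis of a unit root. The series is treated as non-stationary, which is consistent with its visible upward trend and seasonality. You may need to apply differencing or other transformations, such as taking logarithms, to make it stationary.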

Conclusion

The adfuller() function in Python’s Statsmodels library is a powerful tool for testing stationarity in time series data. Understanding how to use and interpret the results of the ADF test is essential for effective time series analysis.

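For related time series tools, consider exploring the seasonal_decompose() function, which separates a series into trend, seasonal, and residual components, or the Durbin-Watson test, which checks regression residuals for autocorrelation.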

By mastering these tools, you can build more accurate and reliable models for your time series data.