Time Series Forecasting with Pandas

Cracking time series forecasting with pandas is like finding a map to hidden treasures in your data. Let’s chart the course.

Time Series Basics in Pandas

Pandas shines with time series data, thanks to its DatetimeIndex. Here’s how you start:

import pandas as pd

# Creating a time series DataFrame
dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'Sales': [200, 250, 300, 275, 225, 305]}, index=dates)
print(df)

This snippet sets you up with a simple sales dataset, indexed by date.
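One payoff of a DatetimeIndex is that you can select rows by date. The sketch below rebuilds the same sample data and pulls out a date range and a single day; the variable names `first_three` and `jan_4_sales` are just illustrative.

```python
import pandas as pd

# Same sample data as above
dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'Sales': [200, 250, 300, 275, 225, 305]}, index=dates)

# Slicing by date labels includes both endpoints
first_three = df.loc['2023-01-01':'2023-01-03']
print(first_three)

# Look up a single day's value with a Timestamp
jan_4_sales = df.loc[pd.Timestamp('2023-01-04'), 'Sales']
print(jan_4_sales)
```

Note that, unlike positional slicing, a date-label slice keeps the end date.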

Rolling Windows for Smoothing

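Smoothing out the noise with a rolling window helps you see the bigger picture. With window=3, each value is the mean of the current row and the two rows before it, so the first two rows come out as NaN: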

# Calculate rolling average
rolling_avg = df.rolling(window=3).mean()
print(rolling_avg)
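If those leading NaN values get in the way, `min_periods` lets the window start averaging before it is full. Here is a small sketch comparing the two behaviours on the same sample data; the column names `strict` and `partial` are made up for this example.

```python
import pandas as pd

dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'Sales': [200, 250, 300, 275, 225, 305]}, index=dates)

# Default: a value appears only once 3 observations are available
strict = df['Sales'].rolling(window=3).mean()

# min_periods=1: average whatever observations the window has so far
partial = df['Sales'].rolling(window=3, min_periods=1).mean()

comparison = pd.DataFrame({'strict': strict, 'partial': partial})
print(comparison)
```

Whether partial windows are acceptable depends on your use case, since the early averages are based on fewer points and are noisier.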

Resampling for Frequency Conversion

Need monthly data instead of daily? Resampling’s got your back. (The sample data only covers six days in January, so the result here is a single row.)

# Resample to month-end totals ('ME' needs pandas 2.2+; older versions use 'M')
monthly_sum = df.resample('ME').sum()
print(monthly_sum)
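Because six days collapse into one monthly row, it can be easier to see what resampling does with a shorter bin. The sketch below groups the same data into two-day bins and computes more than one statistic per bin; the `'2D'` frequency is chosen purely for illustration.

```python
import pandas as pd

dates = pd.date_range('20230101', periods=6)
df = pd.DataFrame({'Sales': [200, 250, 300, 275, 225, 305]}, index=dates)

# Group into 2-day bins and compute the total and the average per bin
two_day = df['Sales'].resample('2D').agg(['sum', 'mean'])
print(two_day)
```

Each row is labelled with the first date of its bin, and the same pattern works with any other frequency string.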

Forecasting with ARIMA

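For the actual forecasting, you’ll often leave pandas land and use statsmodels, particularly ARIMA, which accepts pandas data indexed by date and uses that index to date its forecasts. Keep in mind that six observations are far too few for a trustworthy fit, so expect warnings and treat this as a demonstration of the workflow only: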

from statsmodels.tsa.arima.model import ARIMA

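# Fit the ARIMA model on the Sales column
# order=(p, d, q): 1 autoregressive lag, 1 difference, 1 moving-average lag
model = ARIMA(df['Sales'], order=(1, 1, 1))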
model_fit = model.fit()

# Forecast
forecast = model_fit.forecast(steps=3)
print(forecast)
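The middle number in order=(1, 1, 1) tells ARIMA to difference the series once before modelling it. As a rough pandas-only sketch of what that step means (not a reproduction of what statsmodels does internally), you can difference the sample data yourself and then undo it:

```python
import pandas as pd

dates = pd.date_range('20230101', periods=6)
sales = pd.Series([200, 250, 300, 275, 225, 305], index=dates, name='Sales')

# First difference: the change from the previous day
diffed = sales.diff().dropna()
print(diffed)

# Undo the differencing: cumulative sum plus the first original value
restored = diffed.cumsum() + sales.iloc[0]
print(restored)
```

Differencing removes a steady level or trend so the model works with day-to-day changes rather than raw values.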

Pandas doesn’t do the forecasting itself, but it handles the indexing, smoothing, and resampling that get your data ready for forecasting models like ARIMA.
