Forecasting with Exogenous Regressors¶

This notebook provides examples of the accepted data structures for passing the expected value of exogenous variables when these are included in the mean. For example, consider an AR(1) with 2 exogenous variables. The mean dynamics are

\[Y_t = \phi_0 + \phi_1 Y_{t-1} + \beta_0 X_{0,t} + \beta_1 X_{1,t} + \epsilon_t.\]

The \(h\)-step forecast, \(E_{T}[Y_{t+h}]\), depends on the conditional expectation of \(X_{0,T+h}\) and \(X_{1,T+h}\),

\[E_{T}[Y_{T+h}] = \phi_0 + \phi_1 E_{T}[Y_{T+h-1}] + \beta_0 E_{T}[X_{0,T+h}] +\beta_1 E_{T}[X_{1,T+h}]\]

where \(E_{T}[Y_{T+h-1}]\) has been recursively computed.

In order to construct forecasts up to some horizon \(h\), it is necessary to pass \(2\times h\) values (\(h\) for each series). If using the features of forecast that allow many forecast to be specified, it necessary to supply \(n \times 2 \times h\) values.

There are two general purpose data structures that can be used for any number of exogenous variables and any number steps ahead:

dict - The values can be pass using a dict where the keys are the variable names and the values are 2-dimensional arrays. This is the most natural generalization of a pandas DataFrame to 3-dimensions.
array - The vales can alternatively be passed as a 3-d NumPy array where dimension 0 tracks the regressor index, dimension 1 is the time period and dimension 2 is the horizon.

When a model contains a single exogenous regressor it is possible to use a 2-d array or DataFrame where dim0 tracks the time period where the forecast is generated and dimension 1 tracks the horizon.

In the special case where a model contains a single regressor and the horizon is 1, then a 1-d array or pandas Series can be used.

[1]:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sns.set_style("darkgrid")
plt.rc("figure", figsize=(16, 6))
plt.rc("savefig", dpi=90)
plt.rc("font", family="sans-serif")
plt.rc("font", size=14)

Simulating data¶

Two \(X\) variables are simulated and are assumed to follow independent AR(1) processes. The data is then assumed to follow an ARX(1) with 2 exogenous regressors and GARCH(1,1) errors.

[2]:

from arch.univariate import ARX, GARCH, ZeroMean, arch_model

burn = 250

x_mod = ARX(None, lags=1)
x0 = x_mod.simulate([1, 0.8, 1], nobs=1000 + burn).data
x1 = x_mod.simulate([2.5, 0.5, 1], nobs=1000 + burn).data

resid_mod = ZeroMean(volatility=GARCH())
resids = resid_mod.simulate([0.1, 0.1, 0.8], nobs=1000 + burn).data

phi1 = 0.7
phi0 = 3
y = 10 + resids.copy()
for i in range(1, y.shape[0]):
    y[i] = phi0 + phi1 * y[i - 1] + 2 * x0[i] - 2 * x1[i] + resids[i]

x0 = x0.iloc[-1000:]
x1 = x1.iloc[-1000:]
y = y.iloc[-1000:]
y.index = x0.index = x1.index = np.arange(1000)

Plotting the data¶

[3]:

ax = pd.DataFrame({"ARX": y}).plot(legend=False)
ax.legend(frameon=False)
_ = ax.set_xlim(0, 999)

../_images/univariate_univariate_forecasting_with_exogenous_variables_5_0.png

Forecasting the X values¶

The forecasts of \(Y\) depend on forecasts of \(X_0\) and \(X_1\). Both of these follow simple AR(1), and so we can construct the forecasts for all time horizons. Note that the value in position [i,j] is the time-i forecast for horizon j+1.

[4]:

x0_oos = np.empty((1000, 10))
x1_oos = np.empty((1000, 10))
for i in range(10):
    if i == 0:
        last = x0
    else:
        last = x0_oos[:, i - 1]
    x0_oos[:, i] = 1 + 0.8 * last
    if i == 0:
        last = x1
    else:
        last = x1_oos[:, i - 1]
    x1_oos[:, i] = 2.5 + 0.5 * last

x1_oos[-1]

[4]:

array([5.35651117, 5.17825558, 5.08912779, 5.0445639 , 5.02228195,
       5.01114097, 5.00557049, 5.00278524, 5.00139262, 5.00069631])

Fitting the model¶

Next, the model is fit. The parameters are precisely estimated.

[5]:

exog = pd.DataFrame({"x0": x0, "x1": x1})
mod = arch_model(y, x=exog, mean="ARX", lags=1)
res = mod.fit(disp="off")
print(res.summary())

                          AR-X - GARCH Model Results
==============================================================================
Dep. Variable:                   data   R-squared:                       0.989
Mean Model:                      AR-X   Adj. R-squared:                  0.989
Vol Model:                      GARCH   Log-Likelihood:               -1392.63
Distribution:                  Normal   AIC:                           2799.26
Method:            Maximum Likelihood   BIC:                           2833.60
                                        No. Observations:                  999
Date:                Tue, Oct 21 2025   Df Residuals:                      995
Time:                        08:39:40   Df Model:                            4
                               Mean Model
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
Const          3.1413      0.186     16.922  3.123e-64 [  2.777,  3.505]
data[1]        0.7037  4.010e-03    175.490      0.000 [  0.696,  0.712]
x0             1.9572  2.491e-02     78.572      0.000 [  1.908,  2.006]
x1            -1.9857  2.751e-02    -72.180      0.000 [ -2.040, -1.932]
                              Volatility Model
===========================================================================
                 coef    std err          t      P>|t|     95.0% Conf. Int.
---------------------------------------------------------------------------
omega          0.0518  3.153e-02      1.644      0.100 [-9.958e-03,  0.114]
alpha[1]       0.0588  2.142e-02      2.744  6.065e-03  [1.680e-02,  0.101]
beta[1]        0.8881  4.666e-02     19.032  9.317e-81    [  0.797,  0.980]
===========================================================================

Covariance estimator: robust

Using a `dict`¶

The first approach uses a dict to pass the two variables. The key consideration here is the the keys of the dictionary must exactly match the variable names (x0 and x1 here). The dictionary here contains only the final row of the forecast values since forecast will only make forecasts beginning from the final in-sample observation by default.

Using `DataFrame`¶

While these examples make use of NumPy arrays, these can be DataFrames. This allows the index to be used to track the forecast origination point, which can be a helpful device.

[6]:

exog_fcast = {"x0": x0_oos[-1:], "x1": x1_oos[-1:]}
forecasts = res.forecast(horizon=10, x=exog_fcast)
forecasts.mean.T.plot()

[6]:

<Axes: >

../_images/univariate_univariate_forecasting_with_exogenous_variables_11_1.png

Using an `array`¶

An array can alternatively be used. This frees the restriction on matching the variable names although the order must match instead. The forecast values are 2 (variables) by 1 (forecast) by 10 (horizon).

[7]:

exog_fcast = np.array([x0_oos[-1:], x1_oos[-1:]])
print(f"The shape is {exog_fcast.shape}")
array_forecasts = res.forecast(horizon=10, x=exog_fcast)
print(array_forecasts.mean - forecasts.mean)

The shape is (2, 1, 10)
     h.01  h.02  h.03  h.04  h.05  h.06  h.07  h.08  h.09  h.10
999   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

Producing multiple forecasts¶

forecast can produce multiple forecasts using the same fit model. Here the model is fit to the first 500 observations and then forecasting for the remaining values are produced. It must be the case that the x values passed for forecast have the same number of rows as the table of forecasts produced.

[8]:

res = mod.fit(disp="off", last_obs=500)
exog_fcast = {"x0": x0_oos[-500:], "x1": x1_oos[-500:]}
multi_forecasts = res.forecast(start=500, horizon=10, x=exog_fcast)
multi_forecasts.mean.tail(10)

[8]:

	h.01	h.02	h.03	h.04	h.05	h.06	h.07	h.08	h.09	h.10
990	2.810961	4.627813	6.039889	7.106607	7.899030	8.481916	8.908338	9.219508	9.446465	9.612158
991	4.112977	5.383913	6.472223	7.345412	8.022031	8.536168	8.922596	9.211409	9.426790	9.587416
992	1.186871	1.712670	2.639405	3.670264	4.665110	5.563810	6.346565	7.013311	7.573012	8.038134
993	9.582463	10.976411	11.695082	11.996125	12.047212	11.956355	11.792086	11.596926	11.396397	11.205027
994	16.197513	17.389207	17.581652	17.220315	16.579488	15.825264	15.055257	14.323941	13.658900	13.071286
995	14.266653	14.994621	15.161309	14.976058	14.591253	14.111154	13.603290	13.108823	12.650816	12.240392
996	10.890324	10.824497	10.811006	10.793515	10.757830	10.705632	10.643113	10.576480	10.510511	10.448367
997	7.094744	7.644645	8.246787	8.767816	9.175127	9.475859	9.689676	9.837427	9.937054	10.002638
998	6.392929	6.933642	7.578298	8.165794	8.647757	9.022617	9.305498	9.515166	9.668907	9.780940
999	7.380380	7.907268	8.440039	8.890331	9.240578	9.500413	9.687370	9.818992	9.910092	9.972225

The plot of the final 5 forecast paths shows the the mean reversion of the process.

[9]:

_ = multi_forecasts.mean.tail().T.plot()

../_images/univariate_univariate_forecasting_with_exogenous_variables_17_0.png

The previous example made use of dictionaries where each of the values was a 500 (number of forecasts) by 10 (horizon) array. The alternative format can be used where x is a 3-d array with shape 2 (variables) by 500 (forecasts) by 10 (horizon).

[10]:

exog_fcast = np.array([x0_oos[-500:], x1_oos[-500:]])
print(exog_fcast.shape)
array_multi_forecasts = res.forecast(start=500, horizon=10, x=exog_fcast)
np.max(np.abs(array_multi_forecasts.mean - multi_forecasts.mean))

(2, 500, 10)

[10]:

np.float64(0.0)

`x` input array sizes¶

While the natural shape of the x data is the number of forecasts, it is also possible to pass an x that has the same shape as the y used to construct the model. The may simplify tracking the origin points of the forecast. Values are are not needed are ignored. In this example, the out-of-sample values are 2 by 1000 (original number of observations) by 10. Only the final 500 are used.

WARNING

Other sizes are not allowed. The size of the out-of-sample data must either match the original data size or the number of forecasts.

[11]:

exog_fcast = np.array([x0_oos, x1_oos])
print(exog_fcast.shape)
array_multi_forecasts = res.forecast(start=500, horizon=10, x=exog_fcast)
np.max(np.abs(array_multi_forecasts.mean - multi_forecasts.mean))

(2, 1000, 10)

[11]:

np.float64(0.0)

Special Cases with a single `x` variable¶

When a model consists of a single exogenous regressor, then x can be a 1-d or 2-d array (or Series or DataFrame).

[12]:

mod = arch_model(y, x=exog.iloc[:, :1], mean="ARX", lags=1)
res = mod.fit(disp="off")
print(res.summary())

                          AR-X - GARCH Model Results
==============================================================================
Dep. Variable:                   data   R-squared:                       0.938
Mean Model:                      AR-X   Adj. R-squared:                  0.938
Vol Model:                      GARCH   Log-Likelihood:               -2271.77
Distribution:                  Normal   AIC:                           4555.53
Method:            Maximum Likelihood   BIC:                           4584.97
                                        No. Observations:                  999
Date:                Tue, Oct 21 2025   Df Residuals:                      996
Time:                        08:39:40   Df Model:                            3
                               Mean Model
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
Const         -7.0992      0.282    -25.200 4.050e-140 [ -7.651, -6.547]
data[1]        0.7447  1.026e-02     72.560      0.000 [  0.725,  0.765]
x0             1.9261  5.974e-02     32.241 4.645e-228 [  1.809,  2.043]
                             Volatility Model
==========================================================================
                 coef    std err          t      P>|t|    95.0% Conf. Int.
--------------------------------------------------------------------------
omega          4.0965      0.828      4.945  7.615e-07   [  2.473,  5.720]
alpha[1]       0.1008  4.245e-02      2.374  1.758e-02 [1.759e-02,  0.184]
beta[1]        0.1648      0.147      1.122      0.262   [ -0.123,  0.453]
==========================================================================

Covariance estimator: robust

These two examples show that both formats can be used.

[13]:

forecast_1d = res.forecast(horizon=10, x=x0_oos[-1])
forecast_2d = res.forecast(horizon=10, x=x0_oos[-1:])
print(forecast_1d.mean - forecast_2d.mean)

## Simulation-forecasting

mod = arch_model(y, x=exog, mean="ARX", lags=1, power=1.0)
res = mod.fit(disp="off")

     h.01  h.02  h.03  h.04  h.05  h.06  h.07  h.08  h.09  h.10
999   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

Simulation¶

forecast supports simulating paths. When forecasting a model with exogenous variables, the same value is used to in all mean paths. If you wish to also simulate the paths of the x variables, these need to generated and then passed inside a loop.

Static out-of-sample `x`¶

This first example shows that variance of the paths when the same x values are used in the forecast. There is a sense the out-of-sample x are treated as deterministic.

[14]:

x = {"x0": x0_oos[-1], "x1": x1_oos[-1]}
sim_fixedx = res.forecast(horizon=10, x=x, method="simulation", simulations=100)
sim_fixedx.simulations.values.std(1)

[14]:

array([[1.26259582, 1.5319428 , 1.48544959, 1.53999558, 1.46201176,
        1.60078293, 1.67420261, 1.57189746, 1.50087298, 1.41740919]])

Simulating the out-of-sample `x`¶

This example simulates distinct paths for the two exogenous variables and then simulates a single path. This is then repeated 100 times. We see that variance is much higher when we account for variation in the x data.

[15]:

from numpy.random import RandomState


def sim_ar1(params: np.ndarray, initial: float, horizon: int, rng: RandomState):
    out = np.zeros(horizon)
    shocks = rng.standard_normal(horizon)
    out[0] = params[0] + params[1] * initial + shocks[0]
    for i in range(1, horizon):
        out[i] = params[0] + params[1] * out[i - 1] + shocks[i]
    return out


simulations = []
rng = RandomState(20210301)
for _ in range(100):
    x0_sim = sim_ar1(np.array([1, 0.8]), x0.iloc[-1], 10, rng)
    x1_sim = sim_ar1(np.array([2.5, 0.5]), x1.iloc[-1], 10, rng)
    x = {"x0": x0_sim, "x1": x1_sim}
    fcast = res.forecast(horizon=10, x=x, method="simulation", simulations=1)
    simulations.append(fcast.simulations.values)

Finally the standard deviation is quite a bit larger. This is a most accurate value fo the long-run variance of the forecast residuals which should account for dynamics in the model and any exogenous regressors.

[16]:

joined = np.concatenate(simulations, 1)
joined.std(1)

[16]:

array([[3.21676168, 5.01133195, 6.48934838, 7.73887243, 8.47137387,
        8.71586675, 8.73327487, 8.78720087, 8.96146416, 9.24452755]])

Conditional Mean Alignment vs. Forecast Alignment¶

When fitting a model with exogenous variables, the data are aligned so that the values in x[j] are used to compute the conditional mean of y[j]. For example, in the case of an AR(1)-X, the model is

\[Y_t = \phi_0 + \phi_1 Y_{t-1} + \beta X_{t} + \epsilon_t.\]

We can recover the conditional mean by subtracting the residuals from the original data. When we do this we see that the conditional mean of observation 0 is missing since we need one lag of \(Y\) to fit the model.

[17]:

mod = arch_model(y, x=exog, mean="ARX", lags=1)
res = mod.fit(disp="off")
y - res.resid

[17]:

0            NaN
1      18.090045
2      18.590375
3      17.212436
4      16.834087
         ...
995    15.000376
996    11.677614
997     9.120405
998     5.780957
999     6.171040
Length: 1000, dtype: float64

Conditional Mean uses target alignment¶

When modeling the conditional mean in an AR-X, HAR-X, or LS model, the \(X\) data is target-aligned. This requires that when modeling the mean of y[t], the correct values of \(X\) must appear in x[t]. Mathematically, the \(X\) matrix used when estimating a model should have the structure (using the Python indexing convention of a T-element data set having indices 0, 1, …, T-1):

\[\begin{split}\left[\begin{array}{c} X_{0}\\ X_{1}\\ \vdots\\ X_{T-1} \end{array}\right]\end{split}\]

`forecast` uses origin alignment¶

Forecasting with \(X\) values aligns them differently. When producing a 1-step-ahead forecast for \(Y_{t+1}\) using information available at time \(t\), the \(X\) values used for this forecast must appear in for t. This is needed since when once wants to produce true out-of-sample forecasts (see below), it must be the case that the final row of x passed forecast must all occur after the final time stamp of the most recent \(Y\) value. Mathematically, the \(X\) matrix used in forecasting should have the following structure (using Python indexing convention so that a \(T\) observation dataset will have indices 0, 1, …, T-1).

\[\begin{split}\left[\begin{array}{cccc} E\left[X_{1}|\mathcal{F}_0\right] & E\left[X_{2}|\mathcal{F}_0\right] & \ldots & E\left[X_{h|\mathcal{F}_0}\right]\\ E\left[X_{2}|\mathcal{F}_1\right] & E\left[X_{3}|\mathcal{F}_0\right] & \ldots & E\left[X_{h+1}|\mathcal{F}_1\right]\\ \vdots & \vdots & \vdots & \vdots\\ E\left[X_{T}|\mathcal{F}_{T-1}\right] & E\left[X_{T+1}|\mathcal{F}_{T-1}\right] & \ldots & E\left[X_{T+h-1}|\mathcal{F}_{T-1}\right] \end{array}\right]\end{split}\]

where \(|\mathcal{F}_{s}\) is the time-\(s\) information set.

If you use the same x value in the model when forecasting, you will see different values due to this alignment difference. Naively using the same x values ie equivalent to setting

\[E\left[X_{s}|\mathcal{F}_{s-1} \right] = X_{s-1}\]

In general this would not be correct when forecasting, and will always produce forecasts that differ from the conditional mean. In order to recover the conditional mean using the forecast function, it is necessary to shift the \(X\) values by -1, so that once shifted, the x values will have the relationship

\[E\left[X_{s}|\mathcal{F}_{s-1} \right] = X_{s} .\]

Here we shift the \(X\) data by -1 so x[s] is treated as being in the information set for y[s-1]. Also, note that the final forecast is NaN. Conceptually this must be the case because the value of \(X\) at 999 should be ahead of 999 (i.e., at observation 1,000), and we do not have this value.

[18]:

exog_dict = {col: exog[[col]].shift(-1) for col in exog}
fcast = res.forecast(horizon=1, x=exog_dict, start=0)
fcast.mean

[18]:

	h.1
0	18.090045
1	18.590375
2	17.212436
3	16.834087
4	14.765855
...	...
995	11.677614
996	9.120405
997	5.780957
998	6.171040
999	NaN

1000 rows × 1 columns

(Nearly) out-of-sample forecasts¶

These “in-sample” forecasts are not really forecasts at all but are just fitted values with a different alignment. If you want real (nearly) out-of-sample forecasts\(\dagger\), it is necessary to replace the actual values of \(X\) with their conditional expectation. This can be done by taking the fitted values from AR(1) models of the \(X\) variables.

\(\dagger\) These are not true out-of-sample since the parameters were estimated using data from that same range of indices where these forecasts target. True out-of-sample requires both using forecast \(X\) values and parameters estimated without the period being forecasted.

[19]:

res0 = ARX(exog["x0"], lags=1).fit()
res1 = ARX(exog["x1"], lags=1).fit()
forecast_x = pd.concat(
    [res0.forecast(start=0).mean, res1.forecast(start=0).mean], axis=1
)
forecast_x.columns = ["x0f", "x1f"]
in_samp_forcast_exog = {"x0": forecast_x[["x0f"]], "x1": forecast_x[["x1f"]].shift(-1)}
fcast = res.forecast(horizon=1, x=in_samp_forcast_exog, start=0)
fcast.mean

[19]:

	h.1
0	21.875487
1	17.621295
2	16.854935
3	17.428938
4	17.296659
...	...
995	14.103425
996	10.379336
997	7.074845
998	6.694790
999	NaN

1000 rows × 1 columns

True out-of-sample forecasts¶

In order to make a true out-of-sample prediction, we need the expected values of X from the end of the data we have. These can be constructed by forecasting the two \(X\) variables and then passing these values as x to forecast.

[20]:

mod = arch_model(y, x=exog, mean="ARX", lags=1)
res = mod.fit(disp="off")

actual_x_oos = {
    "x0": res0.forecast(horizon=10).mean,
    "x1": res1.forecast(horizon=10).mean,
}
fcasts = res.forecast(horizon=10, x=actual_x_oos)
fcasts.mean

[20]:

	h.01	h.02	h.03	h.04	h.05	h.06	h.07	h.08	h.09	h.10
999	7.377696	7.885298	8.392617	8.819117	9.150921	9.398454	9.578554	9.707518	9.798886	9.863133

Forecasting with Exogenous Regressors¶

Simulating data¶

Plotting the data¶

Forecasting the X values¶

Fitting the model¶

Using a dict¶

Using DataFrame¶

Using an array¶

Producing multiple forecasts¶

x input array sizes¶

Special Cases with a single x variable¶

Simulation¶

Static out-of-sample x¶

Simulating the out-of-sample x¶

Conditional Mean Alignment vs. Forecast Alignment¶

Conditional Mean uses target alignment¶

forecast uses origin alignment¶

(Nearly) out-of-sample forecasts¶

True out-of-sample forecasts¶

Using a `dict`¶

Using `DataFrame`¶

Using an `array`¶

`x` input array sizes¶

Special Cases with a single `x` variable¶

Static out-of-sample `x`¶

Simulating the out-of-sample `x`¶

`forecast` uses origin alignment¶