Using Formulas¶

The asset pricing model estimators all support formulas. Since the models have multiple dependent variables (test portfolios) as well as multiple independent variables (factors), the standard formula syntax needs to be modified. There are two ways to use formulas. The first specifies both the test portfolios and the factors. The second specifies only the factors, and the test portfolios are passed using an optional keyword argument. The second syntax exists because in many models the number of test portfolios is large and interest is usually in modifying the factors.

Available Syntax¶

Test Portfolios and Factors¶

The first syntax can be expressed as

"port1 + port2 + port3 + port4 + ... + portN ~ factor1 + ... + factorK"

so that both the test portfolios and the factors are separated using +. The two sets are separated using the usual separator between left-hand side and right-hand side variables, ~.
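
When the portfolio and factor names are already held in lists, the formula string can be assembled rather than typed by hand. The following is a minimal sketch; `build_formula` is a hypothetical helper introduced here for illustration and is not part of linearmodels.

```python
def build_formula(portfolios, factors):
    """Join portfolio and factor names into 'p1 + ... + pN ~ f1 + ... + fK'."""
    lhs = " + ".join(portfolios)  # test portfolios go on the left-hand side
    rhs = " + ".join(factors)  # factors go on the right-hand side
    return f"{lhs} ~ {rhs}"


formula = build_formula(["NoDur", "Chems", "S1V1"], ["MktRF", "HML", "Mom"])
print(formula)  # NoDur + Chems + S1V1 ~ MktRF + HML + Mom
```

The resulting string has the same form as the first syntax and could be passed to `from_formula` together with a DataFrame containing those columns.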

Factors Only¶

The second syntax specifies only factors and uses the keyword argument portfolios to pass the matrix of portfolio returns.

formula = "factor1 + ... + factorK"
LinearFactorModel.from_formula(formula, data, portfolios=portfolios)

Import data and transform to be excess returns¶

The data come from Ken French's website and include four factor returns: the excess market, size, value, and momentum factors. The available test portfolios include the 12 industry portfolios, a subset of the size-value two-way sort, and a subset of the size-momentum two-way sort.

[1]:

from linearmodels.datasets import french

data = french.load()
print(french.DESCR)
# Convert the test portfolio returns to excess returns by subtracting the risk-free rate
data.iloc[:, 6:] = data.iloc[:, 6:].values - data[["RF"]].values


Data from Ken French's data library
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

dates    Year and Month of Return
MktRF    Market Factor
SMB      Size Factor
HML      Value Factor
Mom      Momentum Factor
RF       Risk-free rate
NoDur    Industry: Non-durables
Durbl    Industry: Durables
Manuf    Industry: Manufacturing
Enrgy    Industry: Energy
Chems    Industry: Chemicals
BusEq    Industry: Business Equipment
Telcm    Industry: Telecoms
Utils    Industry: Utilities
Shops    Industry: Retail
Hlth     Industry: Health care
Money    Industry: Finance
Other    Industry: Other
S1V1     Small firms, low value
S1V3     Small firms, medium value
S1V5     Small firms, high value
S3V1     Size 3, value 1
S3V3     Size 3, value 3
S3V5     Size 3, value 5
S5V1     Large firms, Low value
S5V3     Large firms, medium value
S5V5     Large Firms, High value
S1M1     Small firms, losers
S1M3     Small firms, neutral
S1M5     Small firms, winners
S3M1     Size 3, momentum 1
S3M3     Size 3, momentum 3
S3M5     Size 3, momentum 5
S5M1     Large firms, losers
S5M3     Large firms, neutral
S5M5     Large firms, winners
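
The excess-return transformation can also be written by selecting columns by name instead of by position. The sketch below uses a small made-up DataFrame (the numbers are illustrative, not taken from the French data) and assumes the portfolio columns are known by name.

```python
import pandas as pd

# Toy returns; values are invented for illustration only
toy = pd.DataFrame(
    {
        "MktRF": [0.010, -0.020],
        "RF": [0.001, 0.002],
        "NoDur": [0.015, -0.010],
        "Durbl": [0.030, -0.040],
    }
)

portfolio_cols = ["NoDur", "Durbl"]
excess = toy.copy()
# Subtract the risk-free rate row by row from each portfolio column
excess[portfolio_cols] = toy[portfolio_cols].sub(toy["RF"], axis=0)
print(excess)
```

Selecting by name avoids relying on the column order, at the cost of listing the portfolio columns explicitly. The factor columns are left unchanged because `MktRF` is already an excess return.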

First Syntax¶

This example shows the first syntax. The test portfolios are a combination of industry, size-value, and size-momentum sorted portfolios. The factors are the market, value, and momentum factors. The J-statistic p-value of 0.019 rejects the model at the 5% level, so this model does not adequately price the assets.

[2]:

from linearmodels.asset_pricing import LinearFactorModel

formula = "NoDur + Chems + S1V1 + S5V5 + S1M1 + S5M5 ~ MktRF + HML + Mom"
mod = LinearFactorModel.from_formula(formula, data)
res = mod.fit(cov_type="kernel", kernel="parzen", bandwidth=20)
print(res)

                      LinearFactorModel Estimation Summary
================================================================================
No. Test Portfolios:                  6   R-squared:                      0.7229
No. Factors:                          3   J-statistic:                    9.9450
No. Observations:                   819   P-value                         0.0190
Date:                  Mon, Nov 24 2025   Distribution:                  chi2(3)
Time:                          14:41:44
Cov. Estimator:                  kernel

                            Risk Premia Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
MktRF          0.0056     0.0016     3.4051     0.0007      0.0024      0.0088
HML            0.0044     0.0016     2.6929     0.0071      0.0012      0.0075
Mom            0.0081     0.0021     3.7457     0.0002      0.0038      0.0123
==============================================================================

Covariance estimator:
KernelCovariance, Kernel: parzen, Bandwidth: 20.0
See full_summary for complete results

Second Syntax¶

The second syntax contains only the factors and omits the test portfolios. The test portfolios are passed as an array or DataFrame using a keyword input. This syntax simplifies experimenting with alternative factors when there are many test portfolios. This model also appears to be inadequate, even when the risk-free rate is allowed to be a free parameter: the J-statistic of 86.386 has a p-value of essentially zero.

[3]:

ports = [f"S{i}V{j}" for i in (1, 3, 5) for j in (1, 3, 5)]
ports += [f"S{i}M{j}" for i in (1, 3, 5) for j in (1, 3, 5)]
portfolios = data[ports]
formula = "MktRF + HML + Mom"
mod = LinearFactorModel.from_formula(
    formula, data, portfolios=portfolios, risk_free=True
)
res = mod.fit()
print(res)

                      LinearFactorModel Estimation Summary
================================================================================
No. Test Portfolios:                 18   R-squared:                      0.7723
No. Factors:                          3   J-statistic:                    86.386
No. Observations:                   819   P-value                         0.0000
Date:                  Mon, Nov 24 2025   Distribution:                 chi2(14)
Time:                          14:41:44
Cov. Estimator:                  robust

                            Risk Premia Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
risk_free      0.0024     0.0040     0.5940     0.5525     -0.0055      0.0103
MktRF          0.0050     0.0044     1.1337     0.2569     -0.0036      0.0136
HML            0.0042     0.0010     4.0918     0.0000      0.0022      0.0063
Mom            0.0081     0.0014     5.7058     0.0000      0.0053      0.0109
==============================================================================

Covariance estimator:
HeteroskedasticCovariance
See full_summary for complete results
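
The degrees of freedom reported in the two summaries above, chi2(3) and chi2(14), are consistent with the number of test portfolios minus the number of estimated risk premia (the factors, plus one when `risk_free=True`). The sketch below reproduces the reported p-values from the printed J-statistics under that assumption. The helpers `j_test_df` and `chi2_sf` are hypothetical names written for this illustration, not linearmodels functions, and the inputs are the rounded values from the output.

```python
import math


def j_test_df(n_portfolios, n_factors, risk_free=False):
    """Degrees of freedom implied by the summaries: N - K - (1 if risk_free)."""
    return n_portfolios - n_factors - int(risk_free)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square(df) variable at x > 0, integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)  # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


df_first = j_test_df(6, 3)  # first model: 6 portfolios, 3 factors
df_second = j_test_df(18, 3, risk_free=True)  # second model: 18 portfolios, 3 factors
p_first = chi2_sf(9.9450, df_first)
p_second = chi2_sf(86.386, df_second)
print(df_first, round(p_first, 4))
print(df_second, p_second)
```

The first p-value should be close to the 0.0190 shown in the summary, and the second is far below the four decimals displayed there.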

Comparing results¶

To verify the results, the model is estimated using the standard interface. The J-statistic, \(R^2\), and risk premia estimates are identical to those from the formula interface.

[4]:

portfolios = data[ports]
factors = data[["MktRF", "HML", "Mom"]]
mod = LinearFactorModel(portfolios, factors, risk_free=True)
print(mod.fit())

                      LinearFactorModel Estimation Summary
================================================================================
No. Test Portfolios:                 18   R-squared:                      0.7723
No. Factors:                          3   J-statistic:                    86.386
No. Observations:                   819   P-value                         0.0000
Date:                  Mon, Nov 24 2025   Distribution:                 chi2(14)
Time:                          14:41:44
Cov. Estimator:                  robust

                            Risk Premia Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
risk_free      0.0024     0.0040     0.5940     0.5525     -0.0055      0.0103
MktRF          0.0050     0.0044     1.1337     0.2569     -0.0036      0.0136
HML            0.0042     0.0010     4.0918     0.0000      0.0022      0.0063
Mom            0.0081     0.0014     5.7058     0.0000      0.0053      0.0109
==============================================================================

Covariance estimator:
HeteroskedasticCovariance
See full_summary for complete results