System Regression Estimators
Seemingly Unrelated Regression (SUR/SURE)
class SUR(equations, *, sigma=None)
Seemingly unrelated regression estimation (SUR/SURE)
 Parameters
equations (dict) – Dictionary-like structure containing dependent and exogenous variable values. Each key is an equation label and must be a string. Each value must be either a tuple of the form (dependent, exog, [weights]) or a dictionary with keys ‘dependent’ and ‘exog’ and the optional key ‘weights’.
sigma (array-like) – Pre-specified residual covariance to use in GLS estimation. If not provided, FGLS is implemented based on an estimate of sigma.
Notes
Estimates a set of regressions which are seemingly unrelated in the sense that separate estimation would lead to consistent parameter estimates. Each equation is of the form
\[y_{i,k} = x_{i,k}\beta_k + \epsilon_{i,k}\]where k denotes the equation and i denotes the observation index. By vertically stacking the arrays of dependent variables and placing the exogenous variables into a block diagonal array, the entire system can be compactly expressed as
\[Y = X\beta + \epsilon\]where
\[\begin{split}Y = \left[\begin{array}{c}Y_1 \\ Y_2 \\ \vdots \\ Y_K\end{array}\right]\end{split}\]and
\[\begin{split}X = \left[\begin{array}{cccc} X_1 & 0 & \ldots & 0 \\ 0 & X_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & X_K \end{array}\right]\end{split}\]The system OLS estimator is
\[\hat{\beta}_{OLS} = (X'X)^{-1}X'Y\]When certain conditions are satisfied, a GLS estimator of the form
\[\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y\]can improve the accuracy of the coefficient estimates, where
\[\Omega = \Sigma \otimes I_N\]and \(\Sigma\) is the covariance matrix of the residuals.
SUR is a special case of 3SLS where there are no endogenous regressors and no instruments.
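The stacking and the two estimators above can be sketched directly in NumPy. This is only an illustration of the formulas on simulated data, not the library's implementation; the variable names, sample size, and the choice of dividing the residual cross-product by N are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2  # observations per equation, number of equations

# Simulated errors that are correlated across the two equations
sigma_true = np.array([[1.0, 0.6], [0.6, 1.0]])
eps = rng.multivariate_normal(np.zeros(k), sigma_true, size=n)

# Each equation has its own regressors (a constant and one variable)
x1 = np.column_stack([np.ones(n), rng.standard_normal(n)])
x2 = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y1 = x1 @ beta_true[:2] + eps[:, 0]
y2 = x2 @ beta_true[2:] + eps[:, 1]

# Stack Y vertically and place X_1, X_2 on a block diagonal
y = np.concatenate([y1, y2])
x = np.zeros((k * n, 4))
x[:n, :2] = x1
x[n:, 2:] = x2

# System OLS: (X'X)^{-1} X'Y
beta_ols = np.linalg.solve(x.T @ x, x.T @ y)

# Estimate Sigma from the OLS residuals, one column per equation
resid = (y - x @ beta_ols).reshape(k, n).T
sigma_hat = resid.T @ resid / n

# Feasible GLS using Omega^{-1} = Sigma^{-1} kron I_N
omega_inv = np.kron(np.linalg.inv(sigma_hat), np.eye(n))
beta_gls = np.linalg.solve(x.T @ omega_inv @ x, x.T @ omega_inv @ y)
```

Because X is block diagonal, system OLS returns the same coefficients as running OLS on each equation separately; the GLS step is where the cross-equation correlation in the residuals is used.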

add_constraints(r, q=None)
 Parameters
r (DataFrame) – Constraint matrix, nconstraints by nparameters
q (Series, optional) – Constraint values (nconstraints). If not set, defaults to 0
Notes
Constraints are of the form
\[r \beta = q\]The property param_names can be used to determine the order of parameters.
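As a rough illustration of what a constraint of this form does, the sketch below imposes a single linear restriction on an ordinary least squares fit using the textbook restricted least squares adjustment. It is written with plain NumPy arrays on simulated data and is an assumption-laden example, not a description of how the library imposes constraints; with the model classes, r would be a DataFrame whose columns follow param_names.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
y = x @ np.array([0.5, 1.0, 1.0]) + rng.standard_normal(n)

# One constraint: the two slope coefficients are equal, b1 - b2 = 0
r = np.array([[0.0, 1.0, -1.0]])  # nconstraints by nparameters
q = np.array([0.0])               # nconstraints

xtx_inv = np.linalg.inv(x.T @ x)
beta = xtx_inv @ x.T @ y          # unrestricted OLS

# Restricted least squares: shift beta so that r @ beta_c equals q
adj = xtx_inv @ r.T @ np.linalg.solve(r @ xtx_inv @ r.T, r @ beta - q)
beta_c = beta - adj
```

Each row of r holds the weights of one restriction and the matching entry of q holds its target value, so several restrictions are added by stacking more rows.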
property constraints
Model constraints
 Returns
cons – Constraint object
 Return type
LinearConstraint

fit(*, method=None, full_cov=True, iterate=False, iter_limit=100, tol=1e-06, cov_type='robust', **cov_config)
Estimate model parameters
 Parameters
method ({None, 'gls', 'ols'}) – Estimation method. The default (None) selects the method automatically based on the regressors, using OLS only if all regressors are identical. The other two values force the use of GLS or OLS.
full_cov (bool) – Flag indicating whether to utilize information in correlations when estimating the model with GLS
iterate (bool) – Flag indicating whether to iterate GLS until convergence or until iter_limit iterations have been completed
iter_limit (int) – Maximum number of iterations for iterative GLS
tol (float) – Tolerance to use when checking for convergence in iterative GLS
cov_type (str) –
Name of covariance estimator. Valid options are
‘unadjusted’, ‘homoskedastic’ - Classic covariance estimator
‘robust’, ‘heteroskedastic’ - Heteroskedasticity robust covariance estimator
‘kernel’ - Allows for heteroskedasticity and autocorrelation
**cov_config – Additional parameters to pass to covariance estimator. All estimators support debiased, which employs a small-sample adjustment
 Returns
results – Estimation results
 Return type
property formula
Set or get the formula used to construct the model
classmethod from_formula(formula, data, *, sigma=None, weights=None)
Specify a SUR using the formula interface
 Parameters
formula ({str, dict-like}) – Either a string or a dictionary of strings where each value in the dictionary represents a single equation. See Notes for a description of the accepted syntax
data (DataFrame) – Frame containing named variables
sigma (array-like) – Pre-specified residual covariance to use in GLS estimation. If not provided, FGLS is implemented based on an estimate of sigma.
weights (dict-like) – Dictionary-like object (e.g. a DataFrame) containing variable weights. Each entry must have the same number of observations as data. If an equation label is not a key in weights, the weights will be set to unity
 Returns
model – Model instance
 Return type
Notes
Models can be specified in one of two ways. The first uses curly braces to encapsulate equations. The second uses a dictionary where each key is an equation name.
Examples
The simplest format uses standard Patsy formulas for each equation in a dictionary. Best practice is to use an Ordered Dictionary
>>> import pandas as pd
>>> import numpy as np
>>> data = pd.DataFrame(np.random.randn(500, 4), columns=['y1', 'x1_1', 'y2', 'x2_1'])
>>> from linearmodels.system import SUR
>>> formula = {'eq1': 'y1 ~ 1 + x1_1', 'eq2': 'y2 ~ 1 + x2_1'}
>>> mod = SUR.from_formula(formula, data)
The second format uses curly braces {} to surround distinct equations
>>> formula = '{y1 ~ 1 + x1_1} {y2 ~ 1 + x2_1}'
>>> mod = SUR.from_formula(formula, data)
It is also possible to include equation labels when using curly braces
>>> formula = '{eq1: y1 ~ 1 + x1_1} {eq2: y2 ~ 1 + x2_1}'
>>> mod = SUR.from_formula(formula, data)
property has_constant
Vector indicating which equations contain constants
classmethod multivariate_ls(dependent, exog)
Interface for specification of multivariate regression models
 Parameters
dependent (array-like) – nobs by ndep array of dependent variables
exog (array-like) – nobs by nvar array of exogenous regressors common to all models
 Returns
model – Model instance
 Return type
Notes
Utility function to simplify the construction of multivariate regression models which all use the same regressors. Constructs the dictionary of equations from the variables using the common exogenous variable.
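The construction described here can be pictured with a small NumPy sketch: one dictionary entry per dependent variable, all sharing the same regressor array. The equation labels and array sizes below are placeholders chosen for the example, not the names the library generates.

```python
import numpy as np

rng = np.random.default_rng(5)
nobs, ndep = 50, 3
dependent = rng.standard_normal((nobs, ndep))
exog = np.column_stack([np.ones(nobs), rng.standard_normal(nobs)])

# One entry per dependent variable, all sharing the same regressors.
# The labels "eq0", "eq1", ... are hypothetical.
equations = {
    f"eq{i}": {"dependent": dependent[:, [i]], "exog": exog}
    for i in range(ndep)
}

# With identical regressors, equation-by-equation OLS is a single solve
beta = np.linalg.solve(exog.T @ exog, exog.T @ dependent)  # nvar by ndep
```

Each column of beta holds the coefficients for one dependent variable.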
Examples
A simple CAPM can be estimated as a multivariate regression
>>> from linearmodels.datasets import french
>>> from linearmodels.system import SUR
>>> data = french.load()
>>> portfolios = data[['S1V1','S1V5','S5V1','S5V5']]
>>> factors = data[['MktRF']].copy()
>>> factors['alpha'] = 1
>>> mod = SUR.multivariate_ls(portfolios, factors)
property param_names
Model parameter names

predict(params, *, equations=None, data=None, eval_env=8)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
equations (dict) – Dictionary-like structure containing exogenous and endogenous variables. Each key is an equation label and must match the labels used to fit the model. Each value must be either a tuple of the form (exog, endog) or a dictionary with keys ‘exog’ and ‘endog’. If predictions are not required for one or more of the model equations, these keys can be omitted.
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then equations must be None. Predictions from models constructed using formulas can be computed using either equations, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed with the model's formula to construct the values corresponding to the original model specification.
When using exog and endog, the regressor array for a particular equation is assembled as [equations[eqn][‘exog’], equations[eqn][‘endog’]] where eqn is an equation label. These must correspond to the columns in the estimated model.
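The column ordering matters because the parameters are applied positionally. A minimal NumPy sketch of the assembly for one equation follows; the label, the data, and the parameter values are all hypothetical and only serve to show the exog-then-endog ordering.

```python
import numpy as np

rng = np.random.default_rng(2)
nobs = 5

# Hypothetical new data for a single equation labelled "eq1"
equations = {
    "eq1": {
        "exog": np.column_stack([np.ones(nobs), rng.standard_normal(nobs)]),
        "endog": rng.standard_normal((nobs, 1)),
    }
}

# Hypothetical parameters ordered as [constant, exog variable, endog variable]
params_eq1 = np.array([1.0, 0.5, -0.25])

# Regressors are ordered with exog first, then endog
regressors = np.hstack([equations["eq1"]["exog"], equations["eq1"]["endog"]])
fitted = regressors @ params_eq1
```

If the columns were supplied in a different order than in the estimated model, the product would still compute but each coefficient would multiply the wrong variable.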

reset_constraints()
Remove all model constraints
Three-Stage Least Squares (3SLS)
class IV3SLS(equations, *, sigma=None)
Three-stage Least Squares (3SLS) Estimator
 Parameters
equations (dict) – Dictionary-like structure containing dependent, exogenous, endogenous and instrumental variables. Each key is an equation label and must be a string. Each value must be either a tuple of the form (dependent, exog, endog, instrument[, weights]) or a dictionary with the key ‘dependent’ and at least one of ‘exog’ or ‘endog’ and ‘instruments’. When using a tuple, values must be provided for all 4 variables, although either empty arrays or None can be passed if a category of variable is not included in a model. The dictionary may contain optional keys for ‘exog’, ‘endog’, ‘instruments’, and ‘weights’. ‘exog’ can be omitted if all variables in an equation are endogenous. Alternatively, ‘exog’ can contain either an empty array or None to indicate that an equation contains no exogenous regressors. Similarly, ‘endog’ and ‘instruments’ can either be omitted or may contain an empty array (or None) if all variables in an equation are exogenous.
sigma (array-like) – Pre-specified residual covariance to use in GLS estimation. If not provided, FGLS is implemented based on an estimate of sigma.
Notes
Estimates a set of regressions which are seemingly unrelated in the sense that separate estimation would lead to consistent parameter estimates. Each equation is of the form
\[y_{i,k} = x_{i,k}\beta_k + \epsilon_{i,k}\]where k denotes the equation and i denotes the observation index. By vertically stacking the arrays of dependent variables and placing the exogenous variables into a block diagonal array, the entire system can be compactly expressed as
\[Y = X\beta + \epsilon\]where
\[\begin{split}Y = \left[\begin{array}{c}Y_1 \\ Y_2 \\ \vdots \\ Y_K\end{array}\right]\end{split}\]and
\[\begin{split}X = \left[\begin{array}{cccc} X_1 & 0 & \ldots & 0 \\ 0 & X_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & X_K \end{array}\right]\end{split}\]The system instrumental variable (IV) estimator is
\[\begin{split}\hat{\beta}_{IV} & = (X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'Y \\ & = (\hat{X}'\hat{X})^{-1}\hat{X}'Y\end{split}\]where \(\hat{X} = Z(Z'Z)^{-1}Z'X\) and \(Z\) contains the instruments. When certain conditions are satisfied, a GLS estimator of the form
\[\hat{\beta}_{3SLS} = (\hat{X}'\Omega^{-1}\hat{X})^{-1}\hat{X}'\Omega^{-1}Y\]can improve the accuracy of the coefficient estimates, where
\[\Omega = \Sigma \otimes I_N\]and \(\Sigma\) is the covariance matrix of the residuals.
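The three stages can be traced in a short NumPy sketch on simulated data. It follows the formulas above literally and is meant as an illustration only, not as the library's implementation; the data-generating process, the helper block_diag2, and the use of equation-specific instruments placed on a block diagonal are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 500, 2

# Errors correlated across equations; each endogenous regressor is also
# correlated with the error of its own equation
sigma_true = np.array([[1.0, 0.5], [0.5, 1.0]])
eps = rng.multivariate_normal(np.zeros(k), sigma_true, size=n)
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
w1 = z1 + 0.5 * eps[:, 0] + rng.standard_normal(n)  # endogenous regressor
w2 = z2 + 0.5 * eps[:, 1] + rng.standard_normal(n)
const = np.ones(n)
x1, x2 = np.column_stack([const, w1]), np.column_stack([const, w2])
zz1, zz2 = np.column_stack([const, z1]), np.column_stack([const, z2])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y1 = x1 @ beta_true[:2] + eps[:, 0]
y2 = x2 @ beta_true[2:] + eps[:, 1]


def block_diag2(a, b):
    """Place two 2-d arrays on a block diagonal (hypothetical helper)."""
    out = np.zeros((a.shape[0] + b.shape[0], a.shape[1] + b.shape[1]))
    out[: a.shape[0], : a.shape[1]] = a
    out[a.shape[0]:, a.shape[1]:] = b
    return out


y = np.concatenate([y1, y2])
x = block_diag2(x1, x2)
z = block_diag2(zz1, zz2)

# Stage 1: X_hat = Z (Z'Z)^{-1} Z'X
x_hat = z @ np.linalg.solve(z.T @ z, z.T @ x)

# Stage 2: system IV estimator (X_hat'X_hat)^{-1} X_hat'Y
beta_iv = np.linalg.solve(x_hat.T @ x_hat, x_hat.T @ y)

# Residual covariance is computed with the original X, not X_hat
resid = (y - x @ beta_iv).reshape(k, n).T
sigma_hat = resid.T @ resid / n

# Stage 3: GLS step with Omega^{-1} = Sigma^{-1} kron I_N
omega_inv = np.kron(np.linalg.inv(sigma_hat), np.eye(n))
beta_3sls = np.linalg.solve(x_hat.T @ omega_inv @ x_hat, x_hat.T @ omega_inv @ y)
```

Forming the full Kronecker product is convenient for a small example but scales poorly with N; it is used here only to mirror the notation.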

add_constraints(r, q=None)
 Parameters
r (DataFrame) – Constraint matrix, nconstraints by nparameters
q (Series, optional) – Constraint values (nconstraints). If not set, defaults to 0
Notes
Constraints are of the form
\[r \beta = q\]The property param_names can be used to determine the order of parameters.
property constraints
Model constraints
 Returns
cons – Constraint object
 Return type
LinearConstraint

fit(*, method=None, full_cov=True, iterate=False, iter_limit=100, tol=1e-06, cov_type='robust', **cov_config)
Estimate model parameters
 Parameters
method ({None, 'gls', 'ols'}) – Estimation method. The default (None) selects the method automatically based on the regressors, using OLS only if all regressors are identical. The other two values force the use of GLS or OLS.
full_cov (bool) – Flag indicating whether to utilize information in correlations when estimating the model with GLS
iterate (bool) – Flag indicating whether to iterate GLS until convergence or until iter_limit iterations have been completed
iter_limit (int) – Maximum number of iterations for iterative GLS
tol (float) – Tolerance to use when checking for convergence in iterative GLS
cov_type (str) –
Name of covariance estimator. Valid options are
‘unadjusted’, ‘homoskedastic’ - Classic covariance estimator
‘robust’, ‘heteroskedastic’ - Heteroskedasticity robust covariance estimator
‘kernel’ - Allows for heteroskedasticity and autocorrelation
**cov_config – Additional parameters to pass to covariance estimator. All estimators support debiased, which employs a small-sample adjustment
 Returns
results – Estimation results
 Return type
property formula
Set or get the formula used to construct the model
classmethod from_formula(formula, data, *, sigma=None, weights=None)
Specify a 3SLS using the formula interface
 Parameters
formula ({str, dict-like}) – Either a string or a dictionary of strings where each value in the dictionary represents a single equation. See Notes for a description of the accepted syntax
data (DataFrame) – Frame containing named variables
sigma (array-like) – Pre-specified residual covariance to use in GLS estimation. If not provided, FGLS is implemented based on an estimate of sigma.
weights (dict-like) – Dictionary-like object (e.g. a DataFrame) containing variable weights. Each entry must have the same number of observations as data. If an equation label is not a key in weights, the weights will be set to unity
 Returns
model – Model instance
 Return type
Notes
Models can be specified in one of two ways. The first uses curly braces to encapsulate equations. The second uses a dictionary where each key is an equation name.
Examples
The simplest format uses standard Patsy formulas for each equation in a dictionary. Best practice is to use an Ordered Dictionary
>>> import pandas as pd
>>> import numpy as np
>>> cols = ['y1', 'x1_1', 'x1_2', 'z1', 'y2', 'x2_1', 'x2_2', 'z2']
>>> data = pd.DataFrame(np.random.randn(500, 8), columns=cols)
>>> from linearmodels.system import IV3SLS
>>> formula = {'eq1': 'y1 ~ 1 + x1_1 + [x1_2 ~ z1]',
...            'eq2': 'y2 ~ 1 + x2_1 + [x2_2 ~ z2]'}
>>> mod = IV3SLS.from_formula(formula, data)
The second format uses curly braces {} to surround distinct equations
>>> formula = '{y1 ~ 1 + x1_1 + [x1_2 ~ z1]} {y2 ~ 1 + x2_1 + [x2_2 ~ z2]}'
>>> mod = IV3SLS.from_formula(formula, data)
It is also possible to include equation labels when using curly braces
>>> formula = '{eq1: y1 ~ 1 + x1_1 + [x1_2 ~ z1]} {eq2: y2 ~ 1 + x2_1 + [x2_2 ~ z2]}'
>>> mod = IV3SLS.from_formula(formula, data)
property has_constant
Vector indicating which equations contain constants
classmethod multivariate_ls(dependent, exog=None, endog=None, instruments=None)
Interface for specification of multivariate IV models
 Parameters
dependent (array-like) – nobs by ndep array of dependent variables
exog (array-like, optional) – nobs by nexog array of exogenous regressors common to all models
endog (array-like, optional) – nobs by nendog array of endogenous regressors common to all models
instruments (array-like, optional) – nobs by ninstr array of instruments to use in all equations
 Returns
model – Model instance
 Return type
Notes
At least one of exog or endog must be provided.
Utility function to simplify the construction of multivariate IV models which all use the same regressors and instruments. Constructs the dictionary of equations from the variables using the common exogenous, endogenous and instrumental variables.
property param_names
Model parameter names

predict(params, *, equations=None, data=None, eval_env=8)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
equations (dict) – Dictionary-like structure containing exogenous and endogenous variables. Each key is an equation label and must match the labels used to fit the model. Each value must be either a tuple of the form (exog, endog) or a dictionary with keys ‘exog’ and ‘endog’. If predictions are not required for one or more of the model equations, these keys can be omitted.
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then equations must be None. Predictions from models constructed using formulas can be computed using either equations, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed with the model's formula to construct the values corresponding to the original model specification.
When using exog and endog, the regressor array for a particular equation is assembled as [equations[eqn][‘exog’], equations[eqn][‘endog’]] where eqn is an equation label. These must correspond to the columns in the estimated model.
Generalized Method of Moments (GMM) Estimation of Systems
class IVSystemGMM(equations, *, sigma=None, weight_type='robust', **weight_config)
System Generalized Method of Moments (GMM) estimation of linear IV models
 Parameters
equations (dict) – Dictionary-like structure containing dependent, exogenous, endogenous and instrumental variables. Each key is an equation label and must be a string. Each value must be either a tuple of the form (dependent, exog, endog, instrument[, weights]) or a dictionary with keys ‘dependent’ and ‘exog’. The dictionary may contain optional keys for ‘endog’, ‘instruments’, and ‘weights’. ‘endog’ and/or ‘instruments’ can be empty if all variables in an equation are exogenous.
sigma (array-like) – Pre-specified residual covariance to use in GLS estimation. If not provided, FGLS is implemented based on an estimate of sigma. Only used if weight_type is ‘unadjusted’
weight_type (str) – Name of moment condition weight function to use in the GMM estimation
**weight_config – Additional keyword arguments to pass to the moment condition weight function
Notes
Estimates a linear model using GMM. Each equation is of the form
\[y_{i,k} = x_{i,k}\beta_k + \epsilon_{i,k}\]where k denotes the equation and i denotes the observation index. By vertically stacking the arrays of dependent variables and placing the exogenous variables into a block diagonal array, the entire system can be compactly expressed as
\[Y = X\beta + \epsilon\]where
\[\begin{split}Y = \left[\begin{array}{c}Y_1 \\ Y_2 \\ \vdots \\ Y_K\end{array}\right]\end{split}\]and
\[\begin{split}X = \left[\begin{array}{cccc} X_1 & 0 & \ldots & 0 \\ 0 & X_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & X_K \end{array}\right]\end{split}\]The system GMM estimator uses the moment condition
\[z_{ij}(y_{ij} - x_{ij}\beta_j) = 0\]where j indexes the equation. The estimator for the coefficients is given by
\[\begin{split}\hat{\beta}_{GMM} & = (X'ZW^{-1}Z'X)^{-1}X'ZW^{-1}Z'Y \\\end{split}\]where \(W\) is a positive definite weighting matrix.
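The role of the weighting matrix can be seen in a two-step sketch for a single over-identified equation; the system case applies the same formula to the stacked arrays. This is a NumPy illustration on simulated data under assumed names and an assumed data-generating process, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000

# A constant plus two instruments for one endogenous regressor
z = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
e = rng.standard_normal(n)
endog = z[:, 1] + z[:, 2] + 0.5 * e + rng.standard_normal(n)
x = np.column_stack([np.ones(n), endog])
beta_true = np.array([1.0, 2.0])
y = x @ beta_true + e


def gmm(x, z, y, w_mat):
    """(X'Z W^{-1} Z'X)^{-1} X'Z W^{-1} Z'Y for a given weighting matrix W."""
    xz = x.T @ z
    w_inv = np.linalg.inv(w_mat)
    return np.linalg.solve(xz @ w_inv @ xz.T, xz @ w_inv @ (z.T @ y))


# Step 1: W = Z'Z / n, which reproduces the 2SLS estimate
beta_1 = gmm(x, z, y, z.T @ z / n)

# Step 2: heteroskedasticity-robust W built from the step-1 residuals
resid = y - x @ beta_1
ze = z * resid[:, None]
beta_2 = gmm(x, z, y, ze.T @ ze / n)
```

With exactly as many instruments as regressors the choice of W would not affect the estimate; the example uses one extra instrument so that the second step can differ from the first.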

add_constraints(r, q=None)
 Parameters
r (DataFrame) – Constraint matrix, nconstraints by nparameters
q (Series, optional) – Constraint values (nconstraints). If not set, defaults to 0
Notes
Constraints are of the form
\[r \beta = q\]The property param_names can be used to determine the order of parameters.
property constraints
Model constraints
 Returns
cons – Constraint object
 Return type
LinearConstraint

fit(*, iter_limit=2, tol=1e-06, initial_weight=None, cov_type='robust', **cov_config)
Estimate model parameters
 Parameters
iter_limit (int) – Maximum number of iterations for iterative GMM
tol (float) – Tolerance to use when checking for convergence in iterative GMM
initial_weight (ndarray, optional) – Initial weighting matrix to use in the first step. If not specified, uses the average outer-product of the set containing the exogenous variables and instruments.
cov_type (str) –
Name of covariance estimator. Valid options are
‘unadjusted’, ‘homoskedastic’ - Classic covariance estimator
‘robust’, ‘heteroskedastic’ - Heteroskedasticity robust covariance estimator
**cov_config – Additional parameters to pass to covariance estimator. All estimators support debiased, which employs a small-sample adjustment
 Returns
results – Estimation results
 Return type
property formula
Set or get the formula used to construct the model
classmethod from_formula(formula, data, *, weights=None, weight_type='robust', **weight_config)
Specify a system GMM model using the formula interface
 Parameters
formula ({str, dict-like}) – Either a string or a dictionary of strings where each value in the dictionary represents a single equation. See Notes for a description of the accepted syntax
data (DataFrame) – Frame containing named variables
weights (dict-like) – Dictionary-like object (e.g. a DataFrame) containing variable weights. Each entry must have the same number of observations as data. If an equation label is not a key in weights, the weights will be set to unity
weight_type (str) –
Name of moment condition weight function to use in the GMM estimation. Valid options are:
‘unadjusted’, ‘homoskedastic’ - Assume moments are homoskedastic
‘robust’, ‘heteroskedastic’ - Allow for heteroskedasticity
**weight_config – Additional keyword arguments to pass to the moment condition weight function
 Returns
model – Model instance
 Return type
Notes
Models can be specified in one of two ways. The first uses curly braces to encapsulate equations. The second uses a dictionary where each key is an equation name.
Examples
The simplest format uses standard Patsy formulas for each equation in a dictionary. Best practice is to use an Ordered Dictionary
>>> import pandas as pd
>>> import numpy as np
>>> cols = ['y1', 'x1_1', 'x1_2', 'z1', 'y2', 'x2_1', 'x2_2', 'z2']
>>> data = pd.DataFrame(np.random.randn(500, 8), columns=cols)
>>> from linearmodels.system import IVSystemGMM
>>> formula = {'eq1': 'y1 ~ 1 + x1_1 + [x1_2 ~ z1]',
...            'eq2': 'y2 ~ 1 + x2_1 + [x2_2 ~ z2]'}
>>> mod = IVSystemGMM.from_formula(formula, data)
The second format uses curly braces {} to surround distinct equations
>>> formula = '{y1 ~ 1 + x1_1 + [x1_2 ~ z1]} {y2 ~ 1 + x2_1 + [x2_2 ~ z2]}'
>>> mod = IVSystemGMM.from_formula(formula, data)
It is also possible to include equation labels when using curly braces
>>> formula = '{eq1: y1 ~ 1 + x1_1 + [x1_2 ~ z1]} {eq2: y2 ~ 1 + x2_1 + [x2_2 ~ z2]}'
>>> mod = IVSystemGMM.from_formula(formula, data)
property has_constant
Vector indicating which equations contain constants
classmethod multivariate_ls(dependent, exog=None, endog=None, instruments=None)
Interface for specification of multivariate IV models
 Parameters
dependent (array-like) – nobs by ndep array of dependent variables
exog (array-like, optional) – nobs by nexog array of exogenous regressors common to all models
endog (array-like, optional) – nobs by nendog array of endogenous regressors common to all models
instruments (array-like, optional) – nobs by ninstr array of instruments to use in all equations
 Returns
model – Model instance
 Return type
Notes
At least one of exog or endog must be provided.
Utility function to simplify the construction of multivariate IV models which all use the same regressors and instruments. Constructs the dictionary of equations from the variables using the common exogenous, endogenous and instrumental variables.
property param_names
Model parameter names

predict(params, *, equations=None, data=None, eval_env=8)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
equations (dict) – Dictionary-like structure containing exogenous and endogenous variables. Each key is an equation label and must match the labels used to fit the model. Each value must be either a tuple of the form (exog, endog) or a dictionary with keys ‘exog’ and ‘endog’. If predictions are not required for one or more of the model equations, these keys can be omitted.
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then equations must be None. Predictions from models constructed using formulas can be computed using either equations, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed with the model's formula to construct the values corresponding to the original model specification.
When using exog and endog, the regressor array for a particular equation is assembled as [equations[eqn][‘exog’], equations[eqn][‘endog’]] where eqn is an equation label. These must correspond to the columns in the estimated model.

reset_constraints()
Remove all model constraints