Models for Panel Data
Fixed Effect Estimation
class PanelOLS(dependent, exog, *, weights=None, entity_effects=False, time_effects=False, other_effects=None, singletons=True, drop_absorbed=False)
One- and two-way fixed effects estimator for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity).
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
entity_effects (bool, optional) – Flag indicating whether to include entity (fixed) effects in the model.
time_effects (bool, optional) – Flag indicating whether to include time effects in the model.
other_effects (array-like, optional) – Category codes to use for any effects that are not entity or time effects. Each variable is treated as an effect.
singletons (bool, optional) – Flag indicating whether to drop singleton observations.
drop_absorbed (bool, optional) – Flag indicating whether to drop absorbed variables.
Notes
Many models can be estimated. The most common includes entity effects and can be described by
\[y_{it} = \alpha_i + \beta^{\prime}x_{it} + \epsilon_{it}\]
where \(\alpha_i\) is included if entity_effects=True.
Time effects are also supported, which leads to a model of the form
\[y_{it} = \gamma_t + \beta^{\prime}x_{it} + \epsilon_{it}\]
where \(\gamma_t\) is included if time_effects=True.
Both effects can be used simultaneously,
\[y_{it} = \alpha_i + \gamma_t + \beta^{\prime}x_{it} + \epsilon_{it}\]
Additionally, arbitrary effects can be specified using categorical variables.
If both entity_effects and time_effects are False, and no other effects are included, the model reduces to PooledOLS.
The model supports at most 2 effects. These can be entity-time, entity-other, time-other or 2 other.
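The entity-effects model can be illustrated with the within transformation, which removes \(\alpha_i\) by subtracting entity means. The following is a minimal sketch for a single regressor using only the standard library. The helper names (demean_by_group, within_slope) and the toy data are assumptions made for illustration, and the code is not a description of how PanelOLS is implemented.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract the group mean from every value (the within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def within_slope(y, x, entity):
    """Slope of y on a single regressor x after removing entity means."""
    y_d = demean_by_group(y, entity)
    x_d = demean_by_group(x, entity)
    return sum(a * b for a, b in zip(x_d, y_d)) / sum(a * a for a in x_d)


# Hypothetical toy panel: two entities, three periods each,
# generated without noise as y_it = alpha_i + 2 * x_it.
entity = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
alpha = {"A": 10.0, "B": -5.0}
y = [alpha[e] + 2.0 * xi for e, xi in zip(entity, x)]

beta_hat = within_slope(y, x, entity)
print(beta_hat)
```

Because the entity intercepts drop out after demeaning, the slope used to generate the data is recovered regardless of the values chosen for alpha.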
property entity_effects
Flag indicating whether entity effects are included

fit(*, use_lsdv=False, use_lsmr=False, low_memory=None, cov_type='unadjusted', debiased=True, auto_df=True, count_effects=True, **cov_config)
Estimate model parameters
 Parameters
use_lsdv (bool, optional) – Flag indicating to use the Least Squares Dummy Variable estimator to eliminate effects. The default value uses only means and does not require constructing dummy variables for each effect.
use_lsmr (bool, optional) – Flag indicating to use LSDV with the Sparse Equations and Least Squares estimator to eliminate the fixed effects.
low_memory ({bool, None}) – Flag indicating whether to use a low-memory algorithm when a model contains two-way fixed effects. If None, the choice is made automatically, and the low-memory algorithm is used if the required dummy variable array is both larger than the array of regressors in the model and requires more than 1 GiB.
cov_type (str, optional) – Name of covariance estimator. See Notes.
debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.
auto_df (bool, optional) – Flag indicating that the treatment of estimated effects in the degree of freedom adjustment is handled automatically. This is useful since clustered standard errors that are clustered using the same variable as an effect do not require a degree of freedom correction, while other estimators such as the unadjusted covariance do.
count_effects (bool, optional) – Flag indicating that the covariance estimator should be adjusted to account for the estimation of effects in the model. Only used if auto_df=False.
**cov_config – Additional covariance-specific options. See Notes.
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import PanelOLS
>>> mod = PanelOLS(y, x, entity_effects=True)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)
Notes
Four covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic
‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator
‘clustered’ - One- or two-way clustering. Configuration options are:
 clusters - Input containing 1 or 2 variables. Clusters should be integer valued, although other types will be coerced to integer values by treating them as categorical variables
 cluster_entity - Boolean flag indicating to use entity clusters
 cluster_time - Boolean flag indicating to use time clusters
‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:
 kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.
 bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
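To make the roles of kernel and bandwidth concrete, the sketch below computes the textbook Bartlett weights and uses them to form a kernel-weighted long-run variance of a scalar, mean-zero series. It is written under stated assumptions: the function names are hypothetical, autocovariances are scaled by the full sample size, and neither the library's default bandwidth nor its exact scaling is reproduced here.

```python
def bartlett_weights(bandwidth):
    """Bartlett kernel weights w_j = 1 - j / (bandwidth + 1) for j = 0..bandwidth."""
    return [1.0 - j / (bandwidth + 1) for j in range(bandwidth + 1)]


def hac_long_run_variance(u, bandwidth):
    """Kernel-weighted long-run variance of a mean-zero scalar series u."""
    n = len(u)
    weights = bartlett_weights(bandwidth)
    # Autocovariance at lag j, scaled by the full sample size n (an assumption)
    gamma = [
        sum(u[t] * u[t - j] for t in range(j, n)) / n
        for j in range(bandwidth + 1)
    ]
    # Lag 0 enters once; each higher lag enters twice with its kernel weight
    return gamma[0] + 2.0 * sum(w * g for w, g in zip(weights[1:], gamma[1:]))


weights = bartlett_weights(3)
lrv = hac_long_run_variance([1.0, -1.0, 1.0, -1.0], bandwidth=1)
print(weights, lrv)
```

The weights decline linearly with the lag, so a larger bandwidth admits more lags while still down-weighting distant ones.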
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None, other_effects=None, singletons=True, drop_absorbed=False)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules with two special variable names, EntityEffects and TimeEffects, which can be used to specify that the model should contain an entity effect or a time effect, respectively. See Examples.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
other_effects (array-like, optional) – Category codes to use for any effects that are not entity or time effects. Each variable is treated as an effect.
singletons (bool, optional) – Flag indicating whether to drop singleton observations.
drop_absorbed (bool, optional) – Flag indicating whether to drop absorbed variables.
 Returns
model – Model specified using the formula
 Return type
Examples
>>> from linearmodels import PanelOLS
>>> mod = PanelOLS.from_formula('y ~ 1 + x1 + EntityEffects', panel_data)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations
property other_effects
Flag indicating whether other (generic) effects are included

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.
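The documented shapes (exog is nobs by nvar, params is nvar by 1) suggest that a prediction is the linear combination of each row of exog with params. The sketch below shows that computation on plain lists. The function predict_linear is hypothetical, and whether predict also adds estimated effects is not stated here, so none are included.

```python
def predict_linear(exog, params):
    """Return the fitted value sum_k exog[i][k] * params[k] for every row i."""
    if any(len(row) != len(params) for row in exog):
        raise ValueError("each row of exog must have one value per parameter")
    return [sum(x * b for x, b in zip(row, params)) for row in exog]


# Hypothetical inputs: a constant column plus one regressor
exog = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0]]
params = [3.0, 2.0]
predictions = predict_linear(exog, params)
print(predictions)
```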

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation
property time_effects
Flag indicating whether time effects are included
Random Effects
class RandomEffects(dependent, exog, *, weights=None)
One-way Random Effects model for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity)
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
Notes
The model is given by
\[y_{it} = \beta^{\prime}x_{it} + u_i + \epsilon_{it}\]
where \(u_i\) is a shock that is independent of \(x_{it}\) but common to all observations of entity i.
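A common textbook way to estimate this model on a balanced panel is to quasi-demean the data: subtract a fraction \(\theta\) of each entity mean, where \(\theta\) depends on the variances of \(u_i\) and \(\epsilon_{it}\). The sketch below assumes those variance components are already known and that every entity has the same number of periods. The function names are hypothetical, and this is not claimed to be the exact computation performed by RandomEffects.

```python
import math


def re_theta(sigma2_e, sigma2_u, n_periods):
    """Quasi-demeaning weight 1 - sqrt(sigma2_e / (T * sigma2_u + sigma2_e))."""
    return 1.0 - math.sqrt(sigma2_e / (n_periods * sigma2_u + sigma2_e))


def quasi_demean(values, theta):
    """Subtract theta times the entity mean from one entity's observations."""
    mean = sum(values) / len(values)
    return [v - theta * mean for v in values]


theta_none = re_theta(1.0, 0.0, 5)  # no entity shock: data are left unchanged
theta_half = re_theta(1.0, 3.0, 1)  # 1 - sqrt(1 / 4)
transformed = quasi_demean([2.0, 4.0, 6.0], theta_half)
print(theta_none, theta_half, transformed)
```

With no entity-level variance the weight is zero and the transformation leaves the data as they are, which corresponds to pooled estimation. As the entity-level variance grows, the weight approaches one and the transformation approaches full demeaning.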

fit(*, small_sample=False, cov_type='unadjusted', debiased=True, **cov_config)
Estimate model parameters
 Parameters
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import RandomEffects
>>> mod = RandomEffects(y, x)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)
Notes
Four covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic
‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator
‘clustered’ - One- or two-way clustering. Configuration options are:
 clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables
 cluster_entity - Boolean flag indicating to use entity clusters
 cluster_time - Boolean flag indicating to use time clusters
‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:
 kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.
 bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
 Returns
model – Model specified using the formula
 Return type
Notes
Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)
Examples
>>> from linearmodels import RandomEffects
>>> mod = RandomEffects.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation
Between OLS
class BetweenOLS(dependent, exog, *, weights=None)
Between estimator for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity)
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
Notes
The model is given by
\[\bar{y}_{i}= \beta^{\prime}\bar{x}_{i}+\bar{\epsilon}_{i}\]
where \(\bar{z}_{i}\) denotes the time-average of \(z_{it}\) for entity i.
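The between model can be sketched in two steps: collapse each entity to its time-average, then run a cross-sectional regression on the averages. The example below does this for one regressor plus a constant using only the standard library. The helper names and the toy data are assumptions for illustration, and unbalanced-panel reweighting is not covered.

```python
from collections import defaultdict


def entity_means(values, entity):
    """Time-average of values for each entity, in order of first appearance."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, e in zip(values, entity):
        totals[e] += v
        counts[e] += 1
    return [totals[e] / counts[e] for e in totals]


def ols_with_constant(y, x):
    """Intercept and slope from a simple regression of y on a constant and x."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy panel: the entity averages satisfy y_bar = 1 + 3 * x_bar,
# while the deviations around each entity average cancel out.
entity = ["A", "A", "B", "B", "C", "C"]
x = [0.0, 2.0, 1.0, 3.0, 3.0, 5.0]
y = [5.0, 3.0, 6.0, 8.0, 15.0, 11.0]

x_bar = entity_means(x, entity)
y_bar = entity_means(y, entity)
intercept, slope = ols_with_constant(y_bar, x_bar)
print(x_bar, y_bar, intercept, slope)
```

Only variation across entities identifies the coefficients here, since all within-entity variation is averaged away before the regression.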

fit(*, reweight=False, cov_type='unadjusted', debiased=True, **cov_config)
Estimate model parameters
 Parameters
reweight (bool) – Flag indicating to reweight observations if the input data is unbalanced using a WLS estimator. If weights are provided, these are accounted for when reweighting. Has no effect on balanced data.
cov_type (str, optional) – Name of covariance estimator. See Notes.
debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.
**cov_config – Additional covariance-specific options. See Notes.
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import BetweenOLS
>>> mod = BetweenOLS(y, x)
>>> res = mod.fit(cov_type='robust')
Notes
Three covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic
‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator
‘clustered’ - One- or two-way clustering. Configuration options are:
 clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables
When using a clustered covariance estimator, all cluster ids must be identical within an entity.
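The requirement that cluster ids be identical within an entity can be checked before estimation. The following sketch assumes the entity labels and cluster ids are available as two aligned sequences. The function name is hypothetical and is not part of the library.

```python
def clusters_constant_within_entity(entity, cluster):
    """Return True if every entity is associated with exactly one cluster id."""
    seen = {}
    for e, c in zip(entity, cluster):
        # setdefault stores the first id seen for e and returns the stored id
        if seen.setdefault(e, c) != c:
            return False
    return True


entity = ["A", "A", "B", "B"]
ok = clusters_constant_within_entity(entity, [0, 0, 1, 1])
bad = clusters_constant_within_entity(entity, [0, 1, 1, 1])
print(ok, bad)
```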
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
 Returns
model – Model specified using the formula
 Return type
Notes
Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)
Examples
>>> from linearmodels import BetweenOLS
>>> mod = BetweenOLS.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation
First Difference Estimation
class FirstDifferenceOLS(dependent, exog, *, weights=None)
First difference model for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity)
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
Notes
The model is given by
\[\Delta y_{it}=\beta^{\prime}\Delta x_{it}+\Delta\epsilon_{it}\]
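The differenced model can be illustrated by taking consecutive differences within each entity and regressing \(\Delta y_{it}\) on \(\Delta x_{it}\) without a constant. The sketch below assumes a single regressor and rows sorted by entity and then by time. The helper names and the toy data are hypothetical.

```python
def first_differences(values, entity):
    """Difference consecutive rows that belong to the same entity."""
    diffs = []
    for i in range(1, len(values)):
        if entity[i] == entity[i - 1]:
            diffs.append(values[i] - values[i - 1])
    return diffs


def slope_no_constant(y, x):
    """OLS slope of y on a single regressor x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical toy panel generated without noise as y_it = alpha_i + 1.5 * x_it
entity = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
alpha = {"A": 10.0, "B": -5.0}
y = [alpha[e] + 1.5 * xi for e, xi in zip(entity, x)]

dx = first_differences(x, entity)
dy = first_differences(y, entity)
beta_hat = slope_no_constant(dy, dx)
print(dx, dy, beta_hat)
```

Differencing removes anything that is constant within an entity, which is why the entity intercepts in alpha do not affect the estimate.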
fit(*, cov_type='unadjusted', debiased=True, **cov_config)
Estimate model parameters
 Parameters
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import FirstDifferenceOLS
>>> mod = FirstDifferenceOLS(y, x)
>>> res = mod.fit(cov_type='robust')
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)
Notes
Four covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic
‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator
‘clustered’ - One- or two-way clustering. Configuration options are:
 clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables
 cluster_entity - Boolean flag indicating to use entity clusters
‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:
 kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.
 bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
When using a clustered covariance estimator, all cluster ids must be identical within a first difference. In most scenarios, this requires ids to be identical within an entity.
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
 Returns
model – Model specified using the formula
 Return type
Notes
Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)
Examples
>>> from linearmodels import FirstDifferenceOLS
>>> mod = FirstDifferenceOLS.from_formula('y ~ x1', panel_data)
>>> res = mod.fit()
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation
Pooled OLS
class PooledOLS(dependent, exog, *, weights=None)
Pooled coefficient estimator for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity)
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
Notes
The model is given by
\[y_{it}=\beta^{\prime}x_{it}+\epsilon_{it}\]
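Pooled estimation stacks all entity-time observations and runs a single regression that ignores the panel structure. The sketch below does this for a constant and one regressor. The toy data are hypothetical and deliberately include entity-specific intercepts that are correlated with x, so the pooled slope differs from the value of 2.0 used to generate the data.

```python
def pooled_ols(y, x):
    """Intercept and slope from regressing y on a constant and x, pooling all rows."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy panel: y_it = alpha_i + 2 * x_it, where the entity with the
# larger x values has the smaller intercept.
entity = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
alpha = {"A": 10.0, "B": -5.0}
y = [alpha[e] + 2.0 * xi for e, xi in zip(entity, x)]

intercept, slope = pooled_ols(y, x)
print(intercept, slope)
```

The pooled model has no term that can absorb the entity intercepts, so in this constructed example they load onto the slope and flip its sign.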
fit(*, cov_type='unadjusted', debiased=True, **cov_config)
Estimate model parameters
 Parameters
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import PooledOLS
>>> mod = PooledOLS(y, x)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)
Notes
Four covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic
‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator
‘clustered’ - One- or two-way clustering. Configuration options are:
 clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables
 cluster_entity - Boolean flag indicating to use entity clusters
 cluster_time - Boolean flag indicating to use time clusters
‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:
 kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.
 bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
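The difference between the robust and one-way clustered estimators can be seen in a stripped-down setting with one regressor and no constant. In the sketch below, scores x times residual are summed within each cluster before being squared, and treating every observation as its own cluster gives the heteroskedasticity-robust version. The function name and inputs are hypothetical, the residuals are supplied directly, and no small-sample correction is applied, so the numbers are not meant to match library output.

```python
from collections import defaultdict


def clustered_variance(x, resid, clusters):
    """Cluster-robust variance of a no-constant, single-regressor OLS slope."""
    score_by_cluster = defaultdict(float)
    for xi, ei, g in zip(x, resid, clusters):
        score_by_cluster[g] += xi * ei  # sum scores within each cluster
    meat = sum(s * s for s in score_by_cluster.values())
    bread = sum(xi * xi for xi in x)
    return meat / (bread * bread)


x = [1.0, 1.0, 1.0, 1.0]
resid = [1.0, 1.0, -1.0, -1.0]  # residuals move together within each pair
v_cluster = clustered_variance(x, resid, clusters=[0, 0, 1, 1])
v_robust = clustered_variance(x, resid, clusters=[0, 1, 2, 3])
print(v_cluster, v_robust)
```

Because the residuals are positively correlated within each cluster in this example, the clustered variance is larger than the one that treats observations as independent.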
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
 Returns
model – Model specified using the formula
 Return type
Notes
Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)
Examples
>>> from linearmodels import PooledOLS
>>> mod = PooledOLS.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation
Fama-MacBeth
class FamaMacBeth(dependent, exog, *, weights=None)
Pooled coefficient estimator for panel data
 Parameters
dependent (array-like) – Dependent (left-hand-side) variable (time by entity)
exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
Notes
The model is given by
\[y_{it}=\beta^{\prime}x_{it}+\epsilon_{it}\]
The Fama-MacBeth estimator is computed by performing T regressions, one for each time period using all available entity observations. Denote the estimate of the model parameters as \(\hat{\beta}_t\). The reported estimator is then
\[\hat{\beta} = T^{-1}\sum_{t=1}^T \hat{\beta}_t\]
While the model does not explicitly include time effects, the implementation based on regressing all observations in a single time period is “as-if” time effects are included.
Parameter inference is made using the set of T parameter estimates with either the standard covariance estimator or a kernel-based covariance, depending on cov_type.
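The two steps described above, T cross-sectional regressions followed by averaging, can be sketched for a single regressor without a constant. The function names and the toy data are hypothetical. The standard error uses the sample variance of the T estimates with a T - 1 divisor, which is an assumption about the debiasing choice and is not confirmed by the text above.

```python
import math


def cross_section_slope(y, x):
    """OLS slope of y on a single regressor x (no constant) in one period."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fama_macbeth(panel):
    """Average per-period slopes; panel maps each period to a (y, x) pair."""
    betas = [cross_section_slope(*panel[t]) for t in sorted(panel)]
    n_periods = len(betas)
    beta_hat = sum(betas) / n_periods
    # Sample variance of the T estimates (T - 1 divisor is an assumption)
    var_beta_t = sum((b - beta_hat) ** 2 for b in betas) / (n_periods - 1)
    std_err = math.sqrt(var_beta_t / n_periods)
    return beta_hat, std_err, betas


# Hypothetical toy panel: two entities observed in three periods
panel = {
    1: ([1.0, 2.0], [1.0, 2.0]),
    2: ([2.0, 4.0], [1.0, 2.0]),
    3: ([3.0, 6.0], [1.0, 2.0]),
}
beta_hat, std_err, betas = fama_macbeth(panel)
print(betas, beta_hat, std_err)
```

Inference here relies only on the variation of the per-period estimates over time, which is why at least two periods are needed for the standard error to be defined.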
fit(cov_type='unadjusted', debiased=True, **cov_config)
Estimate model parameters
 Parameters
 Returns
results – Estimation results
 Return type
Examples
>>> from linearmodels import FamaMacBeth
>>> mod = FamaMacBeth(y, x)
>>> res = mod.fit(cov_type='kernel', kernel='Parzen')
Notes
The following covariance estimators are supported:
‘unadjusted’, ‘homoskedastic’, ‘robust’, ‘heteroskedastic’ - Use the standard covariance estimator of the T parameter estimates.
‘kernel’ - HAC estimator. Configuration options are:
 kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which implements the Newey-West covariance estimator.
 bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
property formula
Formula used to construct the model
classmethod from_formula(formula, data, *, weights=None)
Create a model from a formula
 Parameters
formula (str) – Formula to transform into model. Conforms to patsy formula rules.
data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a MultiIndex DataFrame where the level 0 index contains the entities and the level 1 index contains the time.
weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual times the weight should be homoskedastic.
 Returns
model – Model specified using the formula
 Return type
Notes
Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)
Examples
>>> from linearmodels import FamaMacBeth
>>> mod = FamaMacBeth.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()
property has_constant
Flag indicating whether the model has a constant or implicit constant
property not_null
Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)
Predict values for additional data
 Parameters
params (array-like) – Model parameters (nvar by 1)
exog (array-like) – Exogenous regressors (nobs by nvar)
data (DataFrame) – Values to use when making predictions from a model constructed from a formula
eval_env (int) – Depth to use when evaluating formulas using Patsy.
 Returns
predictions – Fitted values from supplied data and parameters
 Return type
DataFrame
Notes
If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which will be treated as arrays of values corresponding to the formula-processed data, or using data, which will be processed using the formula used to construct the values corresponding to the original model specification.

reformat_clusters(clusters)
Reformat cluster variables
 Parameters
clusters (array-like) – Values to use for variance clustering
 Returns
reformatted – Original data with matching axis and observations dropped where missing in the model data.
 Return type
Notes
This is exposed for testing and is not normally needed for estimation