Models for Panel Data

Fixed Effect Estimation

class PanelOLS(dependent, exog, *, weights=None, entity_effects=False, time_effects=False, other_effects=None, singletons=True, drop_absorbed=False)

One- and two-way fixed effects estimator for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity).

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

  • entity_effects (bool, optional) – Flag whether to include entity (fixed) effects in the model

  • time_effects (bool, optional) – Flag whether to include time effects in the model

  • other_effects (array-like, optional) – Category codes to use for any effects that are not entity or time effects. Each variable is treated as an effect.

  • singletons (bool, optional) – Flag indicating whether to drop singleton observations

  • drop_absorbed (bool, optional) – Flag indicating whether to drop absorbed variables
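
The role of the weights parameter can be illustrated with a minimal weighted least squares sketch for a single regressor without a constant. This is not the library's implementation: the helper name wls_slope and the data are hypothetical, and the sketch relies only on the stated convention that the residual variance is proportional to the inverse of the weight.

```python
def wls_slope(x, y, weights):
    """Weighted least squares slope for y = beta * x + e, Var(e_i) ~ 1 / w_i."""
    # Each observation's contribution is scaled by its weight, so
    # observations with larger residual variance (smaller weight) count less.
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

# Equal weights reproduce ordinary least squares
beta_ols = wls_slope(x, y, [1.0, 1.0, 1.0])
# A zero weight removes the third observation from the fit
beta_wls = wls_slope(x, y, [1.0, 1.0, 0.0])
```

With equal weights the slope is 31/14; giving the third observation zero weight yields a slope of exactly 2.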

Notes

Many models can be estimated. The most common includes entity effects and can be described as

\[y_{it} = \alpha_i + \beta^{\prime}x_{it} + \epsilon_{it}\]

where \(\alpha_i\) is included if entity_effects=True.

Time effects are also supported, which leads to a model of the form

\[y_{it}= \gamma_t + \beta^{\prime}x_{it} + \epsilon_{it}\]

where \(\gamma_t\) is included if time_effects=True.

Both effects can be simultaneously used,

\[y_{it}=\alpha_i + \gamma_t + \beta^{\prime}x_{it} + \epsilon_{it}\]

Additionally, arbitrary effects can be specified using categorical variables.

If both entity_effects and time_effects are False, and no other effects are included, the model reduces to PooledOLS.

Model supports at most 2 effects. These can be entity-time, entity-other, time-other or 2 other.
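
The entity-effects model can be sketched with the within transformation: subtracting each entity's mean removes \(\alpha_i\), after which \(\beta\) is estimated by least squares on the demeaned data. The code below is a minimal illustration for a single regressor, not the estimator's actual implementation; the helper name within_slope and the data are hypothetical.

```python
from collections import defaultdict


def within_slope(entities, y, x):
    """Slope of y on a single regressor x after removing entity means."""
    # Collect the observations belonging to each entity
    groups = defaultdict(list)
    for ent, yi, xi in zip(entities, y, x):
        groups[ent].append((yi, xi))
    sxy = sxx = 0.0
    for obs in groups.values():
        y_bar = sum(o[0] for o in obs) / len(obs)
        x_bar = sum(o[1] for o in obs) / len(obs)
        for yi, xi in obs:
            # Demeaning removes the entity effect alpha_i from the model
            sxy += (xi - x_bar) * (yi - y_bar)
            sxx += (xi - x_bar) ** 2
    return sxy / sxx


# Two entities with different intercepts (1 and 10) and a common slope of 2
entities = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
y = [3.0, 5.0, 7.0, 14.0, 18.0, 22.0]
beta_within = within_slope(entities, y, x)
```

Because the data are generated without noise, the within slope recovers the common coefficient of 2 even though the two entities have very different intercepts.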

property entity_effects

Flag indicating whether entity effects are included

fit(*, use_lsdv=False, use_lsmr=False, low_memory=None, cov_type='unadjusted', debiased=True, auto_df=True, count_effects=True, **cov_config)

Estimate model parameters

Parameters
  • use_lsdv (bool, optional) – Flag indicating to use the Least Squares Dummy Variable estimator to eliminate effects. The default value uses only means and does not require constructing dummy variables for each effect.

  • use_lsmr (bool, optional) – Flag indicating to use LSDV with the Sparse Equations and Least Squares estimator to eliminate the fixed effects.

  • low_memory ({bool, None}) – Flag indicating whether to use a low-memory algorithm when a model contains two-way fixed effects. If None, the choice is made automatically, and the low-memory algorithm is used if the required dummy variable array is both larger than the array of regressors in the model and requires more than 1 GiB.

  • cov_type (str, optional) – Name of covariance estimator. See Notes.

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • auto_df (bool, optional) – Flag indicating that the treatment of estimated effects in degree of freedom adjustment is automatically handled. This is useful since clustered standard errors that are clustered using the same variable as an effect do not require degree of freedom correction while other estimators such as the unadjusted covariance do.

  • count_effects (bool, optional) – Flag indicating that the covariance estimator should be adjusted to account for the estimation of effects in the model. Only used if auto_df=False.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelEffectsResults

Examples

>>> from linearmodels import PanelOLS
>>> mod = PanelOLS(y, x, entity_effects=True)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)

Notes

Four covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic

  • ‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator

  • ‘clustered’ - One- or two-way clustering. Configuration options are:

    • clusters - Input containing 1 or 2 variables. Clusters should be integer valued, although other types will be coerced to integer values by treating them as categorical variables

    • cluster_entity - Boolean flag indicating to use entity clusters

    • cluster_time - Boolean indicating to use time clusters

  • ‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:

    • kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.

    • bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
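
The idea behind one-way clustering can be sketched for a single regressor without a constant. Cluster labels are first mapped to integer codes, and the score contributions are then summed within each cluster before being squared. This is a simplified illustration that omits any small-sample adjustment and is not the library's code; the helper names cluster_codes and clustered_variance are hypothetical.

```python
def cluster_codes(labels):
    """Map arbitrary hashable labels to integer codes, like a categorical."""
    lookup = {}
    # Codes are assigned in order of first appearance
    return [lookup.setdefault(lab, len(lookup)) for lab in labels]


def clustered_variance(x, y, clusters):
    """OLS slope (no constant) and its one-way cluster-robust variance."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Sum the scores x_i * e_i within each cluster
    scores = {}
    for xi, yi, g in zip(x, y, cluster_codes(clusters)):
        scores[g] = scores.get(g, 0.0) + xi * (yi - beta * xi)
    # Sandwich form: bread is 1 / sxx, meat is the sum of squared cluster scores
    meat = sum(s * s for s in scores.values())
    return beta, meat / sxx ** 2


x = [1.0, 2.0, 1.0, 2.0]
y = [1.0, 3.0, 2.0, 3.0]
clusters = ["a", "a", "b", "b"]
codes = cluster_codes(clusters)
beta_cl, var_cl = clustered_variance(x, y, clusters)
```

Summing within clusters before squaring is what allows arbitrary correlation among the residuals that share a cluster.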

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None, other_effects=None, singletons=True, drop_absorbed=False)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules with two special variable names, EntityEffects and TimeEffects which can be used to specify that the model should contain an entity effect or a time effect, respectively. See Examples.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

  • other_effects (array-like, optional) – Category codes to use for any effects that are not entity or time effects. Each variable is treated as an effect.

  • singletons (bool, optional) – Flag indicating whether to drop singleton observations

  • drop_absorbed (bool, optional) – Flag indicating whether to drop absorbed variables

Returns

model – Model specified using the formula

Return type

PanelOLS

Examples

>>> from linearmodels import PanelOLS
>>> mod = PanelOLS.from_formula('y ~ 1 + x1 + EntityEffects', panel_data)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

property other_effects

Flag indicating whether other (generic) effects are included

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.
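
When exog is supplied directly, a prediction for a linear model is the inner product of each row of regressors with the parameter vector. The sketch below shows only that arithmetic with plain lists; the helper name linear_predictions is hypothetical, and the sketch does not reproduce the formula handling or the DataFrame output of the method.

```python
def linear_predictions(params, exog):
    """Fitted values for each row of exog (nobs by nvar) given params (nvar)."""
    return [sum(b * v for b, v in zip(params, row)) for row in exog]


params = [1.0, 2.0]                 # constant and slope
exog = [[1.0, 0.5], [1.0, 3.0]]     # first column is the constant
fitted = linear_predictions(params, exog)
```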

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation

property time_effects

Flag indicating whether time effects are included

Random Effects

class RandomEffects(dependent, exog, *, weights=None)

One-way Random Effects model for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity)

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Notes

The model is given by

\[y_{it} = \beta^{\prime}x_{it} + u_i + \epsilon_{it}\]

where \(u_i\) is a shock that is independent of \(x_{it}\) but common to all observations of entity i.
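
A common way to estimate this model is quasi-demeaning: a fraction \(\theta\) of each entity's mean is subtracted, where \(\theta\) depends on the variances of \(\epsilon_{it}\) and \(u_i\) and on the number of observations for the entity. The sketch below uses the textbook expression and treats the variance components as known, which is an assumption; it is not the library's implementation, and the helper names theta and quasi_demean are hypothetical.

```python
from math import sqrt


def theta(sigma2_e, sigma2_u, nobs_entity):
    """Quasi-demeaning weight for an entity with nobs_entity observations."""
    return 1.0 - sqrt(sigma2_e / (nobs_entity * sigma2_u + sigma2_e))


def quasi_demean(values, th):
    """Subtract the fraction th of the entity mean from each observation."""
    avg = sum(values) / len(values)
    return [v - th * avg for v in values]


# No entity shock: theta is 0 and the data are left unchanged (pooled OLS)
theta_pooled = theta(1.0, 0.0, 3)
# Equal variances and three observations per entity
theta_re = theta(1.0, 1.0, 3)
transformed = quasi_demean([1.0, 2.0, 3.0], theta_re)
```

As the variance of \(u_i\) grows relative to the variance of \(\epsilon_{it}\), \(\theta\) approaches 1 and the transformation approaches full demeaning, as in the fixed effects estimator.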

fit(*, small_sample=False, cov_type='unadjusted', debiased=True, **cov_config)

Estimate model parameters

Parameters
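  • small_sample (bool, optional) – Flag indicating whether to apply a small-sample correction to the estimate of the variance of the random effect.

  • cov_type (str, optional) – Name of covariance estimator. See Notes.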

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelResults

Examples

>>> from linearmodels import RandomEffects
>>> mod = RandomEffects(y, x)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)

Notes

Four covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic

  • ‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator

  • ‘clustered’ - One- or two-way clustering. Configuration options are:

    • clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables

    • cluster_entity - Boolean flag indicating to use entity clusters

    • cluster_time - Boolean indicating to use time clusters

  • ‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:

    • kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.

    • bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Returns

model – Model specified using the formula

Return type

RandomEffects

Notes

Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)

Examples

>>> from linearmodels import RandomEffects
>>> mod = RandomEffects.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation

Between OLS

class BetweenOLS(dependent, exog, *, weights=None)

Between estimator for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity)

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Notes

The model is given by

\[\bar{y}_{i}= \beta^{\prime}\bar{x}_{i}+\bar{\epsilon}_{i}\]

where \(\bar{z}_i\) denotes the time-average of \(z_{it}\) for entity i.
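
The between model can be sketched in two steps: average the data over time for each entity, then run a cross-sectional regression on the averages. The code below does this for one regressor plus a constant. It is a minimal illustration that ignores weights and reweighting for unbalanced panels; the helper name between_fit and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def between_fit(entities, y, x):
    """Intercept and slope from regressing entity means of y on entity means of x."""
    groups = defaultdict(list)
    for ent, yi, xi in zip(entities, y, x):
        groups[ent].append((yi, xi))
    # One time-averaged observation per entity
    y_bar = [mean(o[0] for o in obs) for obs in groups.values()]
    x_bar = [mean(o[1] for o in obs) for obs in groups.values()]
    my, mx = mean(y_bar), mean(x_bar)
    sxy = sum((xb - mx) * (yb - my) for xb, yb in zip(x_bar, y_bar))
    sxx = sum((xb - mx) ** 2 for xb in x_bar)
    slope = sxy / sxx
    return my - slope * mx, slope


# Entity means lie exactly on the line y = 1 + 3 x
entities = ["a", "a", "b", "b", "c", "c"]
x = [1.0, 3.0, 4.0, 6.0, 0.0, 2.0]
y = [6.0, 8.0, 15.0, 17.0, 5.0, 3.0]
intercept_be, slope_be = between_fit(entities, y, x)
```

Only variation across entities is used: the within-entity movements in x and y (which even have opposite signs for entity c) do not affect the estimate.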

fit(*, reweight=False, cov_type='unadjusted', debiased=True, **cov_config)

Estimate model parameters

Parameters
  • reweight (bool) – Flag indicating to reweight observations if the input data is unbalanced using a WLS estimator. If weights are provided, these are accounted for when reweighting. Has no effect on balanced data.

  • cov_type (str, optional) – Name of covariance estimator. See Notes.

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelResults

Examples

>>> from linearmodels import BetweenOLS
>>> mod = BetweenOLS(y, x)
>>> res = mod.fit(cov_type='robust')

Notes

Three covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic

  • ‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator

  • ‘clustered’ - One- or two-way clustering. Configuration options are:

    • clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables

When using a clustered covariance estimator, all cluster ids must be identical within an entity.

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Returns

model – Model specified using the formula

Return type

BetweenOLS

Notes

Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)

Examples

>>> from linearmodels import BetweenOLS
>>> mod = BetweenOLS.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation

First Difference Estimation

class FirstDifferenceOLS(dependent, exog, *, weights=None)

First difference model for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity)

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Notes

The model is given by

\[\Delta y_{it}=\beta^{\prime}\Delta x_{it}+\Delta\epsilon_{it}\]
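
The first difference model can be sketched by differencing consecutive observations within each entity and regressing the differences on each other without a constant. The code below is a minimal single-regressor illustration, assuming the input is already sorted by time within entity; it is not the library's implementation, and the helper name first_difference_slope is hypothetical.

```python
from collections import defaultdict


def first_difference_slope(entities, y, x):
    """Slope of delta-y on delta-x for one regressor, without a constant."""
    # Group observations by entity; input is assumed sorted by time
    groups = defaultdict(list)
    for ent, yi, xi in zip(entities, y, x):
        groups[ent].append((yi, xi))
    sxy = sxx = 0.0
    for obs in groups.values():
        # Difference consecutive observations within the same entity only
        for (y0, x0), (y1, x1) in zip(obs, obs[1:]):
            dy, dx = y1 - y0, x1 - x0
            sxy += dx * dy
            sxx += dx * dx
    return sxy / sxx


entities = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
# Entity-specific intercepts (1 and 10) vanish after differencing
y = [3.0, 5.0, 7.0, 14.0, 18.0, 22.0]
beta_fd = first_difference_slope(entities, y, x)
```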

fit(*, cov_type='unadjusted', debiased=True, **cov_config)

Estimate model parameters

Parameters
  • cov_type (str, optional) – Name of covariance estimator. See Notes.

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelResults

Examples

>>> from linearmodels import FirstDifferenceOLS
>>> mod = FirstDifferenceOLS(y, x)
>>> res = mod.fit(cov_type='robust')
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)

Notes

Four covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic

  • ‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator

  • ‘clustered’ - One- or two-way clustering. Configuration options are:

    • clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables

    • cluster_entity - Boolean flag indicating to use entity clusters

  • ‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:

    • kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.

    • bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.

When using a clustered covariance estimator, all cluster ids must be identical within a first difference. In most scenarios, this requires ids to be identical within an entity.

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Returns

model – Model specified using the formula

Return type

FirstDifferenceOLS

Notes

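Unlike standard patsy, a constant is not added automatically. A first difference model cannot identify a constant, since differencing eliminates it, so the constant indicator (1) is omitted from the formula.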

Examples

>>> from linearmodels import FirstDifferenceOLS
>>> mod = FirstDifferenceOLS.from_formula('y ~ x1', panel_data)
>>> res = mod.fit()

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation

Pooled OLS

class PooledOLS(dependent, exog, *, weights=None)

Pooled coefficient estimator for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity)

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Notes

The model is given by

\[y_{it}=\beta^{\prime}x_{it}+\epsilon_{it}\]

fit(*, cov_type='unadjusted', debiased=True, **cov_config)

Estimate model parameters

Parameters
  • cov_type (str, optional) – Name of covariance estimator. See Notes.

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelResults

Examples

>>> from linearmodels import PooledOLS
>>> mod = PooledOLS(y, x)
>>> res = mod.fit(cov_type='clustered', cluster_entity=True)

Notes

Four covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’ - Assume residuals are homoskedastic

  • ‘robust’, ‘heteroskedastic’ - Control for heteroskedasticity using White’s estimator

  • ‘clustered’ - One- or two-way clustering. Configuration options are:

    • clusters - Input containing 1 or 2 variables. Clusters should be integer values, although other types will be coerced to integer values by treating them as categorical variables

    • cluster_entity - Boolean flag indicating to use entity clusters

    • cluster_time - Boolean indicating to use time clusters

  • ‘kernel’ - Driscoll-Kraay HAC estimator. Configuration options are:

    • kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which produces a covariance estimator similar to the Newey-West covariance estimator.

    • bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Returns

model – Model specified using the formula

Return type

PooledOLS

Notes

Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)

Examples

>>> from linearmodels import PooledOLS
>>> mod = PooledOLS.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation

Fama-MacBeth

class FamaMacBeth(dependent, exog, *, weights=None)

Fama-MacBeth estimator for panel data

Parameters
  • dependent (array-like) – Dependent (left-hand-side) variable (time by entity)

  • exog (array-like) – Exogenous or right-hand-side variables (variable by time by entity).

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Notes

The model is given by

\[y_{it}=\beta^{\prime}x_{it}+\epsilon_{it}\]

The Fama-MacBeth estimator is computed by performing T regressions, one for each time period using all available entity observations. Denote the estimate of the model parameters as \(\hat{\beta}_t\). The reported estimator is then

\[\hat{\beta} = T^{-1}\sum_{t=1}^T \hat{\beta}_t\]

While the model does not explicitly include time-effects, the implementation based on regressing all observations in a single time period is “as-if” time effects are included.

Parameter inference is made using the set of T parameter estimates with either the standard covariance estimator or a kernel-based covariance, depending on cov_type.
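
The two-step procedure described above can be sketched for a single regressor without a constant: one cross-sectional regression per time period, followed by averaging the T estimates and using their sample variance for inference. This is a minimal illustration, not the library's implementation; the helper name fama_macbeth and the data are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def fama_macbeth(times, y, x):
    """Average of per-period slopes and its standard error (one regressor)."""
    periods = defaultdict(list)
    for t, yi, xi in zip(times, y, x):
        periods[t].append((yi, xi))
    # One cross-sectional regression (without a constant) per time period
    betas = [
        sum(xi * yi for yi, xi in obs) / sum(xi * xi for _, xi in obs)
        for obs in periods.values()
    ]
    beta = mean(betas)
    # Inference treats the T estimates as a sample of size T
    std_err = sqrt(variance(betas) / len(betas))
    return beta, std_err


times = [1, 1, 2, 2, 3, 3]
x = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
y = [1.0, 2.0, 2.0, 4.0, 3.0, 6.0]
beta_fm, se_fm = fama_macbeth(times, y, x)
```

In this example the per-period slopes are 1, 2 and 3, so the reported estimate is 2 and the standard error is the square root of 1/3.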

fit(cov_type='unadjusted', debiased=True, **cov_config)

Estimate model parameters

Parameters
  • cov_type (str, optional) – Name of covariance estimator. See Notes.

  • debiased (bool, optional) – Flag indicating whether to debias the covariance estimator using a degree of freedom adjustment.

  • **cov_config – Additional covariance-specific options. See Notes.

Returns

results – Estimation results

Return type

PanelResults

Examples

>>> from linearmodels import FamaMacBeth
>>> mod = FamaMacBeth(y, x)
>>> res = mod.fit(cov_type='kernel', kernel='Parzen')

Notes

Two covariance estimators are supported:

  • ‘unadjusted’, ‘homoskedastic’, ‘robust’, ‘heteroskedastic’ - Use the standard covariance estimator of the T parameter estimates.

  • ‘kernel’ - HAC estimator. Configuration options are:

    • kernel - One of the supported kernels (bartlett, parzen, qs). Default is Bartlett’s kernel, which implements the Newey-West covariance estimator.

    • bandwidth - Bandwidth to use when computing the kernel. If not provided, a naive default is used.
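
The kernel option can be illustrated with the Bartlett kernel applied to a scalar series of per-period estimates. The weights decline linearly with the lag, and the long-run variance is the weighted sum of autocovariances. This is a minimal sketch of the Newey-West form under the usual textbook definition, not the library's implementation; the helper names bartlett_weights and long_run_variance and the data are hypothetical.

```python
def bartlett_weights(bandwidth):
    """Bartlett kernel weights w_j = 1 - j / (bandwidth + 1), j = 0..bandwidth."""
    return [1.0 - j / (bandwidth + 1) for j in range(bandwidth + 1)]


def long_run_variance(series, bandwidth):
    """Kernel-weighted sum of autocovariances of a scalar series."""
    nobs = len(series)
    avg = sum(series) / nobs
    dev = [v - avg for v in series]
    weights = bartlett_weights(bandwidth)
    # Lag-0 autocovariance enters once, higher lags enter twice
    total = sum(d * d for d in dev) / nobs
    for j in range(1, bandwidth + 1):
        gamma_j = sum(dev[t] * dev[t - j] for t in range(j, nobs)) / nobs
        total += 2.0 * weights[j] * gamma_j
    return total


beta_t = [1.0, 2.0, 3.0, 4.0]  # hypothetical per-period estimates
weights_bw2 = bartlett_weights(2)
lrv = long_run_variance(beta_t, bandwidth=1)
```

Dividing the long-run variance by the number of periods gives a variance for the average of the estimates that accounts for serial correlation up to the chosen bandwidth.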

property formula

Formula used to construct the model

classmethod from_formula(formula, data, *, weights=None)

Create a model from a formula

Parameters
  • formula (str) – Formula to transform into model. Conforms to patsy formula rules.

  • data (array-like) – Data structure that can be coerced into a PanelData. In most cases, this should be a multi-index DataFrame where the level 0 index contains the entities and the level 1 contains the time.

  • weights (array-like, optional) – Weights to use in estimation. Assumes residual variance is proportional to the inverse of the weight, so that the residual multiplied by the square root of the weight is homoskedastic.

Returns

model – Model specified using the formula

Return type

FamaMacBeth

Notes

Unlike standard patsy, it is necessary to explicitly include a constant using the constant indicator (1)

Examples

>>> from linearmodels import FamaMacBeth
>>> mod = FamaMacBeth.from_formula('y ~ 1 + x1', panel_data)
>>> res = mod.fit()

property has_constant

Flag indicating whether the model contains a constant or an implicit constant

property not_null

Locations of non-missing observations

predict(params, *, exog=None, data=None, eval_env=4)

Predict values for additional data

Parameters
  • params (array-like) – Model parameters (nvar by 1)

  • exog (array-like) – Exogenous regressors (nobs by nvar)

  • data (DataFrame) – Values to use when making predictions from a model constructed from a formula

  • eval_env (int) – Depth to use when evaluating formulas using Patsy.

Returns

predictions – Fitted values from supplied data and parameters

Return type

DataFrame

Notes

If data is not None, then exog must be None. Predictions from models constructed using formulas can be computed using either exog, which is treated as arrays of values corresponding to the formula-processed data, or data, which is processed using the formula to construct the values corresponding to the original model specification.

reformat_clusters(clusters)

Reformat cluster variables

Parameters

clusters (array-like) – Values to use for variance clustering

Returns

reformatted – Original data with matching axis and observations dropped where missing in the model data.

Return type

PanelData

Notes

This is exposed for testing and is not normally needed for estimation